WO2022147908A1 - Method and apparatus for recovering lost data based on table association, device, and medium - Google Patents

Method and apparatus for recovering lost data based on table association, device, and medium

Info

Publication number
WO2022147908A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
extracted
slave
incremental
association
Prior art date
Application number
PCT/CN2021/083104
Other languages
English (en)
Chinese (zh)
Inventor
陈伟
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2022147908A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Definitions

  • the present application relates to the technical field of big data, and in particular, to a method, apparatus, device and medium for recovering lost data based on table association.
  • ETL: Extract-Transform-Load
  • the usual data processing strategy is to incrementally extract data from the source system to the data warehouse system, and then transform and load the data in the data warehouse system.
  • incremental synchronization is usually preferred, that is, the source system synchronizes data to the data warehouse according to the incremental timestamp.
  • The data warehouse extracts each table separately. Suppose two tables have a master-slave relationship in the source system (such as a customer table and an account table). If the two tables are not extracted at exactly the same time, if the source system's transaction management strategy causes their commit times to differ during extraction, or if for any other reason the incremental timestamp cannot guarantee business-logic consistency, the incremental data of the master and slave tables will not match. When the tables are then loaded on the data warehouse side, the dependent key of the slave table cannot be found in the master table, and data is lost in subsequent transformation processing on the data warehouse side.
  • a first aspect of the present application provides a method for recovering lost data based on table association, and the method for recovering lost data based on table association includes:
  • the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table, and the main table is constructed according to the extracted data in the incremental table;
  • the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table according to the second data extraction instruction, and the extracted data is updated to the main table;
  • a second aspect of the present application provides an electronic device comprising a processor and a memory, the processor being configured to execute at least one computer-readable instruction stored in the memory to implement the following steps:
  • the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table, and the main table is constructed according to the extracted data in the incremental table;
  • the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table according to the second data extraction instruction, and the extracted data is updated to the main table;
  • a third aspect of the present application provides a computer-readable storage medium on which at least one computer-readable instruction is stored, and the at least one computer-readable instruction is executed by a processor to implement the following steps:
  • the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table, and the main table is constructed according to the extracted data in the incremental table;
  • the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table according to the second data extraction instruction, and the extracted data is updated to the main table;
  • a fourth aspect of the present application provides an apparatus for recovering lost data based on table association, wherein the apparatus for recovering lost data based on table association includes:
  • an acquisition unit configured to acquire the data table to be extracted from the source system according to the first data extraction instruction in response to the first data extraction instruction;
  • a determining unit for determining an associated data table associated with the to-be-extracted data table
  • the construction unit is further configured to obtain the extracted data of the associated data table from the incremental table, and construct a slave table in the incremental table according to the extracted data;
  • an association unit used for associating the data in the slave table with the data in the master table, and obtaining the data for which the association fails;
  • a writing unit used to write the data of the association failure into the recycling table
  • an update unit configured to, in response to a second data extraction instruction for the to-be-extracted data table, incrementally synchronize the data extracted from the to-be-extracted data table to the incremental table according to the second data extraction instruction, and update the extracted data to the main table;
  • the update unit is also used to obtain the current slave table from the incremental table, and calculate the union of the current slave table and the recovery table as the updated slave table;
  • the associating unit is further configured to associate the updated slave table with the updated master table, and remove the successfully associated data from the recovery table.
  • The present application can, in response to the first data extraction instruction, obtain the to-be-extracted data table from the source system according to that instruction, incrementally synchronize the data extracted from the to-be-extracted data table to the incremental table, and construct the master table according to the extracted data in the incremental table. It then determines the associated data table associated with the to-be-extracted data table, obtains the extracted data of the associated data table from the incremental table, and constructs a slave table in the incremental table according to that data. The data in the slave table is associated with the data in the master table, and the data for which the association fails is written into the recovery table, which ensures that all data lost during association is recovered.
  • In response to the second data extraction instruction for the to-be-extracted data table, the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table and updated to the master table. The current slave table is obtained from the incremental table, and the union of the current slave table and the recovery table is used as the updated slave table. The updated slave table is associated with the updated master table, and the successfully associated data is removed from the recovery table. This solves the problem of associated data being lost when the data of associated tables falls out of sync for any of various reasons, reduces the cost of manual data problem analysis and of data supplementation and correction, and enhances the data integrity of the data warehouse.
  • FIG. 1 is a flowchart of a preferred embodiment of the method for recovering lost data based on table association in the present application.
  • FIG. 2 is a functional block diagram of a preferred embodiment of the apparatus for recovering lost data based on table association in the present application.
  • FIG. 3 is a schematic structural diagram of an electronic device implementing a preferred embodiment of the method for recovering lost data based on table association in the present application.
  • Referring to FIG. 1, a flowchart of a preferred embodiment of the method for recovering lost data based on table association in the present application is shown. Depending on requirements, the order of the steps in this flowchart can be changed, and some steps can be omitted.
  • The method for recovering lost data based on table association is applied to one or more electronic devices. An electronic device is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), and embedded devices.
  • The electronic device can be any electronic product that can interact with a user, such as a personal computer, a tablet computer, a smartphone, a personal digital assistant (PDA), a game console, an Internet Protocol television (IPTV), or a smart wearable device.
  • the electronic equipment may also include network equipment and/or user equipment.
  • the network device includes, but is not limited to, a single network server, a server group formed by multiple network servers, or a cloud formed by a large number of hosts or network servers based on cloud computing (Cloud Computing).
  • the network where the electronic device is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (Virtual Private Network, VPN), and the like.
  • ETL (Extract-Transform-Load) describes the data warehouse process of extracting data from a source, transforming it, and loading it into a destination.
  • the first data extraction instruction may be configured to be triggered periodically, for example, periodically triggered every day.
  • the source system refers to a source-end system that stores data, and the data in the source system is extracted to a data warehouse for subsequent use.
  • a data warehouse pulls incremental data from source systems on a daily basis.
  • Obtaining the to-be-extracted data table from the source system according to the first data extraction instruction includes:
  • the data table with the target table name is acquired from the source system as the to-be-extracted data table.
  • The first data extraction instruction is essentially a piece of code; following code-writing conventions, the content between {} in the instruction is called the method body.
  • the information carried by the first data extraction instruction may be a specific address or various specific data to be processed, and the content of the information mainly depends on the code composition of the first data extraction instruction.
  • the preset label can be custom configured.
  • the preset label and the table name have a one-to-one correspondence, for example, the preset label may be configured as NAME.
  • Obtaining data directly from the instruction improves processing efficiency, and because the data is located by a uniquely configured label, the accuracy of data acquisition also improves.
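  • The label-based lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not give the instruction syntax, so the `LABEL(value)` tag form, the sample instruction string, and the function name `extract_table_name` are all hypothetical. Only the preset label `NAME` and the idea of a method body between `{}` come from the text.

```python
import re
from typing import Optional


def extract_table_name(instruction: str, label: str = "NAME") -> Optional[str]:
    """Return the value tagged with `label` inside the instruction's method body."""
    # Treat the text between the first '{' and the last '}' as the method body.
    start, end = instruction.find("{"), instruction.rfind("}")
    if start == -1 or end <= start:
        return None
    body = instruction[start + 1:end]
    # Hypothetical tag syntax (an assumption, not from the source): LABEL(value)
    match = re.search(rf"\b{re.escape(label)}\((\w+)\)", body)
    return match.group(1) if match else None


# Hypothetical instruction text carrying a table name under the NAME label.
instruction = "extract{NAME(customer) RANGE(2021-03-25,2021-03-26)}"
table_name = extract_table_name(instruction)
missing = extract_table_name("extract{}")
```

  • Because each label maps to exactly one table name, a single regular-expression search over the method body is enough in this sketch; an instruction without the label yields `None` rather than a guess.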
  • Data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table, and a master table is constructed in the incremental table according to the extracted data.
  • Incrementally synchronizing the data extracted from the to-be-extracted data table to the incremental table includes:
  • Parsing the first data extraction instruction to obtain the first timestamp range for data extraction includes:
  • the data with the configuration is searched in the information carried by the first data extraction instruction, and the found data is determined as the first timestamp range.
  • The first timestamp range covers the data records that changed between the last synchronization and the current synchronization; records that fall outside this time interval are judged not to meet the extraction condition.
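  • A minimal sketch of timestamp-range extraction follows. The field name `updated_at`, the sample rows, and the half-open window `[start, end)` are assumptions made for illustration; the source only states that records outside the interval do not meet the extraction condition.

```python
from datetime import datetime


def incremental_extract(rows, start, end):
    """Keep rows whose update time falls in the half-open window [start, end)."""
    # The half-open boundary is an assumption; the source does not specify it.
    return [r for r in rows if start <= r["updated_at"] < end]


# Hypothetical source rows with an update timestamp.
source_rows = [
    {"id": "C001", "updated_at": datetime(2021, 3, 24, 23, 50)},
    {"id": "C002", "updated_at": datetime(2021, 3, 25, 9, 0)},
    {"id": "C003", "updated_at": datetime(2021, 3, 26, 0, 5)},
]
window_start = datetime(2021, 3, 25)  # last synchronization
window_end = datetime(2021, 3, 26)    # current synchronization
increment = incremental_extract(source_rows, window_start, window_end)
extracted_ids = [r["id"] for r in increment]
```

  • A record committed just after the window closes (C003 here) is left for the next cycle, which is how a master row can be missing when its slave row has already been extracted.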
  • Determining the associated data table associated with the to-be-extracted data table includes:
  • the detected data table is determined as the associated data table.
  • the associated data table detected in the above manner has a table association relationship with the to-be-extracted data table, that is, the two data tables have a master-slave relationship in the source system, such as the customer table and the account table.
  • The two tables are often not extracted at the same time, the source system's transaction management strategy may cause their commit times to differ during extraction, or, for any other reason, the incremental timestamp may fail to guarantee business-logic consistency. In each case the incremental data of the master and slave tables will not match. When the tables are then loaded on the data warehouse side, the dependent key of the slave table cannot be found in the master table, and data is lost in subsequent data transformation on the data warehouse side.
  • The master and slave tables of the source system are not updated in a single transaction, so their update times differ; the data warehouse then extracts the data of the master table but cannot extract the corresponding data of the slave table.
  • this embodiment detects a data table that has a table association relationship with the to-be-extracted data table, so as to perform targeted processing and avoid data loss.
  • using the join operation to perform table association includes:
  • ticket.id is equal to job.t_id
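  • The join condition `ticket.id = job.t_id` can be illustrated with Python's standard `sqlite3` module. Only the table names `ticket` and `job` and the columns `id` and `t_id` come from the text; the other columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ticket (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE job (job_id INTEGER PRIMARY KEY, t_id INTEGER);
    INSERT INTO ticket VALUES (1, 'open account'), (2, 'close account');
    INSERT INTO job VALUES (10, 1), (11, 1), (12, 3);
""")

# Inner join: only job rows whose t_id matches a ticket.id survive.
joined = conn.execute(
    "SELECT job.job_id, ticket.title FROM job "
    "JOIN ticket ON ticket.id = job.t_id ORDER BY job.job_id"
).fetchall()

# Left join plus IS NULL isolates job rows with no matching ticket.
unmatched = conn.execute(
    "SELECT job.job_id FROM job "
    "LEFT JOIN ticket ON ticket.id = job.t_id WHERE ticket.id IS NULL"
).fetchall()
```

  • The inner join silently drops job 12, whose ticket is absent; this is the kind of row the recovery mechanism is meant to keep.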
  • S13: Acquire the extracted data of the associated data table from the incremental table, and construct a slave table in the incremental table according to the extracted data.
  • The data in the associated data table is not necessarily extracted at the same time as the data in the to-be-extracted data table. Because each table in the data warehouse is extracted separately, the extraction times are often inconsistent, which easily causes data loss.
  • Associating the data in the slave table with the data in the master table includes:
  • a mapping table stores the correspondence between the data identifier of each item of slave data and the data identifier of each item of master data;
  • when the mapping table shows that the data identifier of first data in the slave table corresponds to the data identifier of second data in the master table, it is determined that the first data is associated with the second data and that the association of the first data succeeds; or
  • If the mapping table stores a correspondence between a customer ID and an account ID, the data corresponding to the customer ID is associated with the data corresponding to the account ID, and the association of the data corresponding to the customer ID is determined to succeed. If the account ID corresponding to the customer ID cannot be found in the mapping table, the master table contains no data associated with the data corresponding to the customer ID, and the association of the data corresponding to the customer ID is determined to fail.
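  • The mapping-table lookup can be sketched as below. The dictionary layout (slave identifier mapped to master identifier), the function name `associate`, and the sample IDs are assumptions for illustration, not the application's actual data structures.

```python
def associate(slave_rows, master_ids, mapping):
    """Split slave rows into (associated, failed) via a slave-id -> master-id mapping."""
    associated, failed = [], []
    for row in slave_rows:
        master_id = mapping.get(row["id"])
        # Association succeeds only if the mapping points at data that is
        # actually present in the master table.
        if master_id is not None and master_id in master_ids:
            associated.append(row)
        else:
            failed.append(row)
    return associated, failed


mapping = {"A001": "C001", "A004": "C004"}  # hypothetical account ID -> customer ID
master_ids = {"C001"}                       # customers present in the master table
slave_rows = [{"id": "A001"}, {"id": "A004"}]
ok, failed = associate(slave_rows, master_ids, mapping)
ok_ids = [r["id"] for r in ok]
failed_ids = [r["id"] for r in failed]
```

  • The `failed` list is what would be written to the recovery table instead of being discarded.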
  • Before writing the data for which the association fails into the recovery table, the method further includes:
  • the created isomorphic table is determined as the recovery table.
  • An isomorphic table of the incremental table is created as the recovery table. Because the two tables have exactly the same structure, the data that fails to be associated can be written into the recovery table in full, which avoids further data loss, gives subsequent data recovery a more complete data basis, and reduces the error rate.
  • The recovery table is a dynamically updated, cyclically reused data table. It ensures that all data lost during association is collected and that association is attempted again in the next cycle to repair the data.
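  • One way to create a table with the same column layout as the incremental table is shown below with `sqlite3`. The table names `account_incr` and `account_recovery` and their columns are hypothetical, and the choice of SQLite is an assumption; the source does not name a database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account_incr (account_id TEXT, customer_id TEXT, updated_at TEXT)"
)
# WHERE 0 copies the column layout but no rows. In SQLite this form does not
# copy constraints or indexes, so it is only a structural sketch.
conn.execute("CREATE TABLE account_recovery AS SELECT * FROM account_incr WHERE 0")


def columns(table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


same_structure = columns("account_incr") == columns("account_recovery")

# A row that failed association fits the recovery table without any reshaping.
conn.execute("INSERT INTO account_recovery VALUES ('A004', 'C004', '2021-03-25')")
recovered = conn.execute("SELECT account_id FROM account_recovery").fetchall()
```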
  • the second data extraction instruction may also be configured to be triggered periodically, for example, the second data extraction instruction may be triggered the day after the first data extraction instruction is triggered.
  • the incremental synchronization of extracting data from the to-be-extracted data table to the incremental table according to the second data extraction instruction includes:
  • S17 Acquire the current slave table from the incremental table, and calculate the union of the current slave table and the recovery table as the updated slave table.
  • the current slave table is also an updated slave table.
  • the current slave table is also incrementally synchronized according to the timestamp range, which is not described here.
  • the union of the current slave table and the recovery table is used as the updated slave table, so as to perform the association again in the current cycle, which effectively avoids data loss.
  • For example, C004 of the customer table does not meet the extraction condition and is not extracted to the data warehouse. A004 of the account table must be associated with customer C004 of the master table for related calculations, so the A004 record would normally be discarded, because records that cannot be associated with the master table are discarded. In this embodiment, the unassociated data is instead written into the recovery table for subsequent use. Data is extracted again in the early morning of the next day, and C004 of the customer table is extracted into the data warehouse.
  • The account table then combines the data set extracted on the next day with the data in the previous day's recovery table to form a new incremental table.
  • The recovery table thus fills in the missing data, and the data can be associated.
  • Data that fails to be associated keeps being written to the recovery table, and in the next incremental cycle the recovery table is merged into the incremental table and association is attempted again. Data that still cannot be associated enters the recovery table again, and the cycle continues until the association succeeds and the data flows into the next stage.
  • the above cycle method effectively reduces the probability of data loss.
  • The successfully associated data is removed from the recovery table to avoid data redundancy in the recovery table.
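  • The two-day C004/A004 example can be traced with a small in-memory sketch. The dictionary-based tables, the function name `run_cycle`, the extra account A005, and the rule that a newly extracted row overrides a recovered row with the same key are assumptions for illustration only.

```python
def run_cycle(master, slave_increment, recovery):
    """One cycle: union with the recovery table, re-associate, refresh recovery.

    master: customer_id -> row; slave_increment and recovery: account_id -> row.
    Returns (associated, new_recovery).
    """
    # Union of the current slave table and the recovery table; on a key clash
    # the freshly extracted row wins (an assumption, not stated in the source).
    updated_slave = {**recovery, **slave_increment}
    associated, new_recovery = {}, {}
    for account_id, row in updated_slave.items():
        target = associated if row["customer_id"] in master else new_recovery
        target[account_id] = row
    return associated, new_recovery


# Day 1: customer C004 missed the extraction window, so A004 cannot be associated.
master = {"C001": {"name": "customer one"}}
day1 = {"A001": {"customer_id": "C001"}, "A004": {"customer_id": "C004"}}
loaded_day1, recovery_day1 = run_cycle(master, day1, {})

# Day 2: C004 arrives in the master table; A004 is retried from the recovery table.
master["C004"] = {"name": "customer four"}
day2 = {"A005": {"customer_id": "C001"}}
loaded_day2, recovery_day2 = run_cycle(master, day2, recovery_day1)
```

  • Rebuilding the recovery table from the rows that still fail has the same effect as removing the successfully associated rows from it, so nothing lingers there after it has been loaded.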
  • This embodiment can solve the problem of loss of associated data caused by asynchronous data in associated tables caused by various reasons, reduce the cost of manual data problem analysis and data supplementation and correction, and enhance the data integrity of the data warehouse.
  • the method further includes:
  • the detected data is determined as the data to be verified
  • the master table, slave table and recovery table can also be deployed on the blockchain to prevent malicious tampering of data.
  • The present application can, in response to the first data extraction instruction, obtain the to-be-extracted data table from the source system according to that instruction, incrementally synchronize the data extracted from the to-be-extracted data table to the incremental table, and construct the master table according to the extracted data in the incremental table. It then determines the associated data table associated with the to-be-extracted data table, obtains the extracted data of the associated data table from the incremental table, and constructs a slave table in the incremental table according to that data. The data in the slave table is associated with the data in the master table, and the data for which the association fails is written into the recovery table, which ensures that all data lost during association is recovered.
  • In response to the second data extraction instruction for the to-be-extracted data table, the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table and updated to the master table. The current slave table is obtained from the incremental table, and the union of the current slave table and the recovery table is used as the updated slave table. The updated slave table is associated with the updated master table, and the successfully associated data is removed from the recovery table. This solves the problem of associated data being lost when the data of associated tables falls out of sync for any of various reasons, reduces the cost of manual data problem analysis and of data supplementation and correction, and enhances the data integrity of the data warehouse.
  • Referring to FIG. 2, a functional block diagram of a preferred embodiment of the apparatus for recovering lost data based on table association in the present application is shown.
  • The apparatus 11 for recovering lost data based on table association includes an acquisition unit 110, a construction unit 111, a determination unit 112, an association unit 113, a writing unit 114, and an updating unit 115.
  • the modules/units referred to in this application refer to a series of computer program segments that can be executed by the processor 13 and can perform fixed functions, and are stored in the memory 12 . In this embodiment, the functions of each module/unit will be described in detail in subsequent embodiments.
  • the acquiring unit 110 acquires the data table to be extracted from the source system according to the first data extraction instruction.
  • ETL (Extract-Transform-Load) describes the data warehouse process of extracting data from a source, transforming it, and loading it into a destination.
  • the first data extraction instruction may be configured to be triggered periodically, for example, periodically triggered every day.
  • the source system refers to a source-end system that stores data, and the data in the source system is extracted to a data warehouse for subsequent use.
  • a data warehouse pulls incremental data from source systems on a daily basis.
  • The obtaining, by the obtaining unit 110, of the to-be-extracted data table from the source system according to the first data extraction instruction includes:
  • the data table with the target table name is acquired from the source system as the to-be-extracted data table.
  • The first data extraction instruction is essentially a piece of code; following code-writing conventions, the content between {} in the instruction is called the method body.
  • the information carried by the first data extraction instruction may be a specific address or various specific data to be processed, and the content of the information mainly depends on the code composition of the first data extraction instruction.
  • the preset label can be custom configured.
  • the preset label and the table name have a one-to-one correspondence, for example, the preset label may be configured as NAME.
  • Obtaining data directly from the instruction improves processing efficiency, and because the data is located by a uniquely configured label, the accuracy of data acquisition also improves.
  • The construction unit 111 incrementally synchronizes the data extracted from the to-be-extracted data table to the incremental table, and constructs a master table in the incremental table according to the extracted data.
  • The incremental synchronization, by the construction unit 111, of the data extracted from the to-be-extracted data table to the incremental table includes:
  • The parsing, by the construction unit 111, of the first data extraction instruction to obtain the first timestamp range for data extraction includes:
  • the data with the configuration is searched in the information carried by the first data extraction instruction, and the found data is determined as the first timestamp range.
  • The first timestamp range covers the data records that changed between the last synchronization and the current synchronization; records that fall outside this time interval are judged not to meet the extraction condition.
  • the determining unit 112 determines the associated data table associated with the to-be-extracted data table.
  • The determining, by the determining unit 112, of the associated data table associated with the to-be-extracted data table includes:
  • the detected data table is determined as the associated data table.
  • the associated data table detected in the above manner has a table association relationship with the to-be-extracted data table, that is, the two data tables have a master-slave relationship in the source system, such as the customer table and the account table.
  • The two tables are often not extracted at the same time, the source system's transaction management strategy may cause their commit times to differ during extraction, or, for any other reason, the incremental timestamp may fail to guarantee business-logic consistency. In each case the incremental data of the master and slave tables will not match. When the tables are then loaded on the data warehouse side, the dependent key of the slave table cannot be found in the master table, and data is lost in subsequent data transformation on the data warehouse side.
  • The master and slave tables of the source system are not updated in a single transaction, so their update times differ; the data warehouse then extracts the data of the master table but cannot extract the corresponding data of the slave table.
  • this embodiment detects a data table that has a table association relationship with the to-be-extracted data table, so as to perform targeted processing and avoid data loss.
  • using the join operation to perform table association includes:
  • ticket.id is equal to job.t_id
  • The construction unit 111 acquires the extracted data of the associated data table from the incremental table, and constructs a slave table in the incremental table according to the extracted data.
  • The data in the associated data table is not necessarily extracted at the same time as the data in the to-be-extracted data table. Because each table in the data warehouse is extracted separately, the extraction times are often inconsistent, which easily causes data loss.
  • the associating unit 113 associates the data in the slave table with the data in the master table, and acquires the data for which the association fails.
  • The associating, by the associating unit 113, of the data in the slave table with the data in the master table includes:
  • a mapping table stores the correspondence between the data identifier of each item of slave data and the data identifier of each item of master data;
  • when the mapping table shows that the data identifier of first data in the slave table corresponds to the data identifier of second data in the master table, it is determined that the first data is associated with the second data and that the association of the first data succeeds; or
  • If the mapping table stores a correspondence between a customer ID and an account ID, the data corresponding to the customer ID is associated with the data corresponding to the account ID, and the association of the data corresponding to the customer ID is determined to succeed. If the account ID corresponding to the customer ID cannot be found in the mapping table, the master table contains no data associated with the data corresponding to the customer ID, and the association of the data corresponding to the customer ID is determined to fail.
  • The writing unit 114 writes the data for which the association fails into the recovery table.
  • the created isomorphic table is determined as the recovery table.
  • An isomorphic table of the incremental table is created as the recovery table. Because the two tables have exactly the same structure, the data that fails to be associated can be written into the recovery table in full, which avoids further data loss, gives subsequent data recovery a more complete data basis, and reduces the error rate.
  • The recovery table is a dynamically updated, cyclically reused data table. It ensures that all data lost during association is collected and that association is attempted again in the next cycle to repair the data.
  • In response to a second data extraction instruction for the to-be-extracted data table, the update unit 115 incrementally synchronizes the data extracted from the to-be-extracted data table to the incremental table according to the second data extraction instruction, and updates the extracted data to the master table.
  • the second data extraction instruction may also be configured to be triggered periodically, for example, the second data extraction instruction may be triggered the day after the first data extraction instruction is triggered.
  • The incremental synchronization, by the updating unit 115, of the data extracted from the to-be-extracted data table to the incremental table according to the second data extraction instruction includes:
  • The updating unit 115 acquires the current slave table from the incremental table, and calculates the union of the current slave table and the recovery table as the updated slave table.
  • the current slave table is also an updated slave table.
  • the current slave table is also incrementally synchronized according to the timestamp range, which is not described here.
  • the union of the current slave table and the recovery table is used as the updated slave table, so as to perform the association again in the current cycle, which effectively avoids data loss.
  • The associating unit 113 associates the updated slave table with the updated master table, and removes the successfully associated data from the recovery table.
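  • The work of the updating unit and the associating unit in the second cycle can also be expressed in SQL. The sketch below uses `sqlite3`; all table names, columns, and sample rows are hypothetical, and rebuilding the recovery table from the still-unmatched rows is one possible way (an assumption) to remove the successfully associated data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_master (customer_id TEXT PRIMARY KEY);
    CREATE TABLE account_incr (account_id TEXT, customer_id TEXT);
    CREATE TABLE account_recovery (account_id TEXT, customer_id TEXT);
    INSERT INTO customer_master VALUES ('C001'), ('C004');
    INSERT INTO account_incr VALUES ('A005', 'C001'), ('A006', 'C009');
    INSERT INTO account_recovery VALUES ('A004', 'C004');
""")

# Updated slave table = current increment UNION recovery table.
conn.execute("""
    CREATE TEMP TABLE account_updated AS
    SELECT account_id, customer_id FROM account_incr
    UNION
    SELECT account_id, customer_id FROM account_recovery
""")

# Rows of the updated slave table that associate with the updated master table.
associated = conn.execute("""
    SELECT s.account_id FROM account_updated s
    JOIN customer_master m ON m.customer_id = s.customer_id
    ORDER BY s.account_id
""").fetchall()

# Rebuild the recovery table from the rows that still fail to associate.
conn.execute("DELETE FROM account_recovery")
conn.execute("""
    INSERT INTO account_recovery
    SELECT s.account_id, s.customer_id FROM account_updated s
    LEFT JOIN customer_master m ON m.customer_id = s.customer_id
    WHERE m.customer_id IS NULL
""")
pending = conn.execute("SELECT account_id FROM account_recovery").fetchall()
```

  • Here A004 leaves the recovery table because C004 is now present, while A006 enters it and waits for a later cycle.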
  • For example, C004 of the customer table does not meet the extraction conditions and is not extracted to the data warehouse. A004 of the account table needs to be associated with customer C004 of the main table for the related calculations. Ordinarily, records that cannot be associated with the main table are discarded, so the A004 data of the account table would be lost. In this embodiment, the unassociated data is instead written into the recycling table for subsequent use. The data is extracted again in the early morning of the next day, and C004 of the customer table is extracted into the data warehouse.
  • for the account table, the data set extracted on the next day is merged with the data in the previous day's recovery table to form a new incremental table.
  • the recovery table thus supplies the missing A004 record, which can now be associated with C004.
  • data that fails to be associated is written to the recycle table in every cycle; in the next incremental cycle the recycle table is merged into the incremental table and the association is attempted again. If the association still fails, the data enters the recycle table again, and the cycle repeats until the association succeeds and the data flows on to the next processing stage.
  • the above cycle method effectively reduces the probability of data loss.
  • the successfully associated data is removed from the reclaim table to avoid data redundancy in the reclaim table.
  • This embodiment can solve the loss of associated data that occurs when the data in associated tables falls out of sync for various reasons, reduce the cost of manual problem analysis and of data supplementation and correction, and enhance the data integrity of the data warehouse.
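  • The recycling cycle described above can be sketched in a few lines of Python. This is a minimal in-memory illustration rather than the application's implementation: the function run_cycle, the dictionary layout, the field names, and the sample rows other than the identifiers C004 and A004 are assumptions made for the example, and the rule that a freshly extracted row overrides a recycled row with the same key is also an assumption.

```python
def run_cycle(master, extracted_master, extracted_slave, recovery):
    """Run one incremental cycle and return the successfully associated rows.

    master and recovery are dicts that persist between cycles; the two
    extracted_* dicts hold the rows pulled in the current cycle.
    """
    # Update the master table with this cycle's extracted master rows.
    master.update(extracted_master)

    # Updated slave table = union of the current slave rows and the
    # recovery table (assumption: current rows win on a key clash).
    updated_slave = {**recovery, **extracted_slave}

    associated = {}
    for account_id, row in updated_slave.items():
        if row["customer_id"] in master:
            # Association succeeded: emit the joined row and make sure it
            # no longer sits in the recovery table.
            associated[account_id] = {**row, "customer": master[row["customer_id"]]}
            recovery.pop(account_id, None)
        else:
            # Association failed: keep the row for the next cycle.
            recovery[account_id] = row
    return associated


master, recovery = {}, {}

# Day 1: customer C004 is not extracted, so account A004 cannot be associated.
day1 = run_cycle(
    master,
    {"C001": {"name": "customer 1"}},
    {"A001": {"customer_id": "C001"}, "A004": {"customer_id": "C004"}},
    recovery,
)
lost_after_day1 = sorted(recovery)

# Day 2: C004 arrives, and A004 is recovered from the recovery table.
day2 = run_cycle(
    master,
    {"C004": {"name": "customer 4"}},
    {"A005": {"customer_id": "C001"}},
    recovery,
)
lost_after_day2 = sorted(recovery)
```

  • In this sketch A004 stays in the recovery table after the first cycle and is associated in the second cycle once C004 has been extracted, after which the recovery table is empty again.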
  • the detected data is determined as the data to be verified
  • the master table, slave table and recovery table can also be deployed on the blockchain to prevent malicious tampering of data.
  • the present application can respond to the first data extraction instruction, obtain the to-be-extracted data table from the source system according to the first data extraction instruction, extract data from the to-be-extracted data table and incrementally synchronize it to the incremental table, and construct the master table from the extracted data in the incremental table.
  • it determines the associated data table that is associated with the to-be-extracted data table, obtains the extracted data of the associated data table from the incremental table, and constructs a slave table in the incremental table from that data.
  • the data in the slave table is associated with the data in the master table, and the data that fails to be associated is obtained and written into the recovery table, which ensures that all data lost in association is recovered.
  • according to the second data extraction instruction, the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table and the extracted data is updated to the master table; the current slave table is obtained from the incremental table, and the union of the current slave table and the recovery table is computed as the updated slave table.
  • the updated slave table is associated with the updated master table, and the successfully associated data is removed from the recovery table. This solves the loss of associated data that occurs when associated tables fall out of sync for various reasons, reduces the cost of manual problem analysis and of data supplementation and correction, and enhances the data integrity of the data warehouse.
  • FIG. 3 is a schematic structural diagram of an electronic device implementing a preferred embodiment of the table association-based lost data recovery method of the present application.
  • the electronic device 1 may include a memory 12, a processor 13 and a bus, and may also include a computer program stored in the memory 12 and executable on the processor 13, such as a table association-based lost data recovery program.
  • the electronic device 1 can be either a bus-type structure or a star-shaped structure.
  • the electronic device 1 may also include more or fewer hardware or software components than shown, or a different arrangement of components; for example, the electronic device 1 may also include input and output devices, network access devices, and the like.
  • the electronic device 1 is only an example. Other existing or future electronic products that can be adapted to this application should also be included in the protection scope of this application, and are incorporated herein by reference.
  • the memory 12 includes at least one type of computer-readable storage medium, and the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable storage medium may include flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory), magnetic memory, magnetic disk, optical disk, and the like.
  • the memory 12 may be an internal storage unit of the electronic device 1 in some embodiments, such as a mobile hard disk of the electronic device 1.
  • the memory 12 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 1.
  • the memory 12 may also include both an internal storage unit of the electronic device 1 and an external storage device.
  • the memory 12 can not only be used to store application software installed in the electronic device 1 and various types of data, such as the codes of the lost data recovery program based on table association, etc., but also can be used to temporarily store data that has been output or will be output.
  • the processor 13 may be composed of integrated circuits in some embodiments; for example, it may be composed of a single packaged integrated circuit, or of a plurality of packaged integrated circuits with the same function or different functions, including one or more central processing units (Central Processing Unit, CPU).
  • the processor 13 is the control core (Control Unit) of the electronic device 1. It uses various interfaces and lines to connect the components of the entire electronic device 1, and performs the various functions of the electronic device 1 and processes data by running or executing the programs or modules stored in the memory 12 (such as the table association-based lost data recovery program) and calling the data stored in the memory 12.
  • the processor 13 executes the operating system of the electronic device 1 and various installed application programs.
  • the processor 13 executes the application program to implement the steps in each of the foregoing embodiments of the method for recovering lost data based on table association, for example, the steps shown in FIG. 1 .
  • the computer program may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 12 and executed by the processor 13 to complete the present invention.
  • the one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program in the electronic device 1 .
  • the computer program may be divided into an acquisition unit 110 , a construction unit 111 , a determination unit 112 , an association unit 113 , a writing unit 114 , and an updating unit 115 .
  • the above-mentioned integrated units implemented in the form of software functional modules may be stored in a computer-readable storage medium.
  • the above-mentioned software function modules are stored in a storage medium and include several instructions that enable a computer device (which may be a personal computer, a computer device, or a network device, etc.) or a processor to execute part of the table association-based lost data recovery method described in the embodiments of the present application.
  • if the modules/units integrated in the electronic device 1 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments can also be completed by a computer program instructing the relevant hardware devices, and the computer program can be stored in a computer-readable storage medium. When the computer program is executed by the processor, the steps of the above method embodiments can be implemented.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, an executable file, or some intermediate form.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U disk, removable hard disk, magnetic disk, optical disk, computer memory, read-only memory (ROM, Read-Only Memory), random access memory, etc.
  • the computer-readable storage medium may mainly include a stored program area and a stored data area, wherein the stored program area may store an operating system, an application program required for at least one function, and the like; the stored data area may store data created during use, and the like.
  • the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • A blockchain, essentially a decentralized database, is a series of data blocks linked to one another by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the bus can be divided into address bus, data bus, control bus and so on. For ease of representation, only one arrow is shown in FIG. 3, but it does not mean that there is only one bus or one type of bus.
  • the bus is arranged to enable connection communication between the memory 12 and at least one processor 13 and the like.
  • the electronic device 1 may also include a power source (such as a battery) for supplying power to the various components. Preferably, the power source may be logically connected to the at least one processor 13 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device.
  • the power source may also include one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and any other components.
  • the electronic device 1 may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the electronic device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), which is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
  • the electronic device 1 may further include a user interface. The user interface may be a display or an input unit (such as a keyboard); optionally, the user interface may also be a standard wired interface or a wireless interface.
  • the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light-emitting diode) touch device, and the like.
  • the display may also be appropriately called a display screen or a display unit, which is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.
  • FIG. 3 only shows the electronic device 1 with components 12-13. Those skilled in the art can understand that the structure shown in FIG. 3 does not constitute a limitation on the electronic device 1, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.
  • the memory 12 in the electronic device 1 stores multiple instructions to implement a method for recovering lost data based on table association, and the processor 13 can execute the multiple instructions to implement:
  • the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table, and the main table is constructed according to the extracted data in the incremental table;
  • the data extracted from the to-be-extracted data table is incrementally synchronized to the incremental table according to the second data extraction instruction, and the extracted data is updated to the main table;
  • modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional module in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of hardware plus software function modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method and apparatus for recovering lost data based on table association, a device, and a medium, and pertains to the field of big data. The method comprises: obtaining a data table to be extracted; extracting data and incrementally synchronizing it to an incremental table, and constructing a master table from the extracted data; determining an associated data table; obtaining extracted data, and constructing a slave table from the extracted data; associating data in the slave table with data in the master table, and obtaining the data that fails to be associated (S14); writing the data that fails to be associated into a recovery table (S15); extracting data from the data table to be extracted and incrementally synchronizing it to the incremental table, and updating the extracted data to the master table; obtaining the current slave table from the incremental table, and computing the union of the current slave table and the recovery table as the updated slave table (S17); and associating the updated slave table with the updated master table, and removing successfully associated data from the recovery table (S18). The data integrity of a data warehouse is improved. The table association-based lost data recovery method further involves blockchain technology: the master table, the slave table, and the recovery table may be stored in a blockchain.
PCT/CN2021/083104 2021-01-05 2021-03-25 Method and apparatus for recovering lost data based on table association, device, and medium WO2022147908A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110005207.8 2021-01-05
CN202110005207.8A CN112328677B (zh) 2021-01-05 2021-01-05 基于表关联的丢失数据回收方法、装置、设备及介质

Publications (1)

Publication Number Publication Date
WO2022147908A1 true WO2022147908A1 (fr) 2022-07-14

Family

ID=74302154

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/083104 WO2022147908A1 (fr) 2021-01-05 2021-03-25 Method and apparatus for recovering lost data based on table association, device, and medium

Country Status (2)

Country Link
CN (1) CN112328677B (fr)
WO (1) WO2022147908A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112328677B (zh) * 2021-01-05 2021-04-02 平安科技(深圳)有限公司 基于表关联的丢失数据回收方法、装置、设备及介质
CN113420057A (zh) * 2021-06-29 2021-09-21 未鲲(上海)科技服务有限公司 对账数据处理方法及相关装置

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102799634A (zh) * 2012-06-26 2012-11-28 中国农业银行股份有限公司 数据存储方法及装置
CN105320680A (zh) * 2014-07-15 2016-02-10 ***通信集团公司 一种数据同步方法及装置
CN106407360A (zh) * 2016-09-07 2017-02-15 广州视源电子科技股份有限公司 一种数据的处理方法及装置
CN106933823A (zh) * 2015-12-29 2017-07-07 北京国双科技有限公司 数据同步方法及装置
CN109408565A (zh) * 2018-10-19 2019-03-01 浪潮软件集团有限公司 一种数据同步交互方法、***和数据交互平台
CN110347672A (zh) * 2019-05-27 2019-10-18 深圳壹账通智能科技有限公司 数据表关联更新的验证方法及装置、电子设备及存储介质
US20190347345A1 (en) * 2018-05-14 2019-11-14 Sap Se Database independent detection of data changes
CN112328677A (zh) * 2021-01-05 2021-02-05 平安科技(深圳)有限公司 基于表关联的丢失数据回收方法、装置、设备及介质

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5634528B2 (ja) * 2010-12-13 2014-12-03 株式会社日立製作所 ストレージ装置及びストレージ装置の電源障害検出方法
US8874505B2 (en) * 2011-01-11 2014-10-28 Hitachi, Ltd. Data replication and failure recovery method for distributed key-value store
JP6499958B2 (ja) * 2015-12-22 2019-04-10 日立オートモティブシステムズ株式会社 車両故障診断装置
CN107169003B (zh) * 2017-03-31 2020-05-22 北京奇艺世纪科技有限公司 一种数据关联方法及装置
CN110908995B (zh) * 2018-09-17 2023-04-11 阿里巴巴集团控股有限公司 数据处理方法、装置以及设备
CN112015790A (zh) * 2019-05-30 2020-12-01 北京沃东天骏信息技术有限公司 一种数据处理的方法和装置
CN112035463B (zh) * 2020-07-22 2023-07-21 武汉达梦数据库股份有限公司 基于日志解析的异构数据库的双向同步方法和同步装置

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117251448A (zh) * 2023-09-18 2023-12-19 北京数方科技有限公司 一种宽表拉链表数据处理方法及装置
CN117251448B (zh) * 2023-09-18 2024-04-30 北京数方科技有限公司 一种宽表拉链表数据处理方法及装置
CN117349377A (zh) * 2023-10-08 2024-01-05 中电云计算技术有限公司 一种主外键表数据同步方法和***
CN117349377B (zh) * 2023-10-08 2024-05-10 中电云计算技术有限公司 一种主外键表数据同步方法和***

Also Published As

Publication number Publication date
CN112328677A (zh) 2021-02-05
CN112328677B (zh) 2021-04-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21916959

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21916959

Country of ref document: EP

Kind code of ref document: A1