CN115858625A - Big data multi-table data joint verification method, system, equipment and medium - Google Patents

Big data multi-table data joint verification method, system, equipment and medium

Info

Publication number
CN115858625A
Authority
CN
China
Prior art keywords
data
characteristic information
acquiring
joint
source tables
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202211630654.3A
Other languages
Chinese (zh)
Inventor
黎学军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qizhidao Network Technology Co Ltd
Original Assignee
Qizhidao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qizhidao Network Technology Co Ltd
Priority to CN202211630654.3A
Publication of CN115858625A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data processing, and in particular to a big data multi-table data joint verification method, system, equipment and medium. The method comprises the following steps: acquiring the respective data of a plurality of data source tables in a preset database, wherein the plurality of data source tables comprise an initial data table, an intermediate data table and an application data table; acquiring first data characteristic information based on the data in the plurality of data source tables; performing joint processing on the data source tables based on the first data characteristic information to obtain a joint data table, and acquiring second data characteristic information according to the data in the joint data table; comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result; and judging, according to the data comparison result, whether the data of the joint data table is consistent with the data of the plurality of data source tables. The application makes the data verification process more efficient and allows a plurality of data source tables to be verified together.

Description

Big data multi-table data joint verification method, system, equipment and medium
Technical Field
The invention relates to the technical field of data processing, and in particular to a big data multi-table data joint verification method, system, equipment and medium.
Background
With the spread of internet-based information technology and the rise of the big data concept, more and more enterprises are paying attention to the development of big data services. To ensure the accuracy of big data, an enterprise's big data needs to be tested. At present, enterprise big data testing methods include sampling tests, but sampling covers only a small part of the data, so errors in the untested data are easily missed. On the other hand, enterprise big data is large in volume, with many fields, long contents and many types of data tables, so testing and verifying all of it takes too long and data verification efficiency is low. There is therefore room for improvement.
Disclosure of Invention
In order to improve the efficiency of the data verification process and to verify a plurality of data source tables, the application provides a big data multi-table data joint verification method, system, equipment and medium.
The above object of the present invention is achieved by the following technical solutions:
a big data multi-table data joint verification method comprises the following steps:
acquiring respective data of a plurality of data source tables in a preset database, wherein the plurality of data source tables comprise an initial data table, an intermediate data table and an application data table;
acquiring first data characteristic information based on data in a plurality of data source tables;
performing joint processing on the data source tables based on the first data characteristic information to obtain a joint data table, and acquiring second data characteristic information according to data in the joint data table;
and comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result, and judging whether the data of the joint data table is consistent with the data of the plurality of data source tables according to the data comparison result.
By adopting this technical scheme, when the big data of an enterprise is verified, a plurality of data source tables, comprising an initial data table, an intermediate data table and an application data table, are obtained from a pre-constructed database. The initial data table, the intermediate data table and the application data table are each analyzed to obtain the first data characteristic information, which is then used to perform joint processing on the three tables: duplicate data found across the data source tables is removed to form a joint data table, so that duplicate data is not verified repeatedly. The data in the joint data table is analyzed to obtain the second data characteristic information. The first data characteristic information is compared with the second data characteristic information, and whether the data content of the joint data table is consistent with the data content of the data source tables is judged according to the comparison result, which completes the data verification process. Because verification is performed after the data source tables have been jointly processed, the time spent verifying the big data is reduced and data verification efficiency is improved.
The present application may be further configured in a preferred example to: the obtaining of the respective data of the plurality of data source tables in the preset database specifically includes:
acquiring initial source data in a preset database, and sorting and collecting the initial source data to form an initial data table;
performing aggregation processing on the initial source data to obtain aggregated data, and forming an intermediate data table based on the aggregated data set;
and acquiring data type information based on the aggregated data, and classifying the data according to the data type information to form an application data table.
By adopting this technical scheme, the data in the database is sorted and collected to obtain the initial data table. The initial data in the initial data table is aggregated to form aggregated data, with repeated data consolidated into intermediate data; this reduces repeated processing of the same data and effectively improves data verification efficiency. The aggregated data is collected to form the intermediate data table. The aggregated data in the intermediate data table is then classified, and aggregated data of the same type is sorted and collected to form the application data table. The data in the database thus forms a layered structure of initial data table, intermediate data table and application data table, which makes it convenient to store massive amounts of data in the database and to verify the plurality of data source tables.
The application may be further configured in a preferred example to: the performing joint processing on the plurality of data source tables based on the first data characteristic information to obtain a joint data table, and acquiring second data characteristic information according to data in the joint data table specifically includes:
acquiring first data content information according to the first data characteristic information, and acquiring a repeated data set based on the first data content information;
acquiring, according to the repeated data set, the data whose content information is repeated, merging that data, forming joint data from the merged data together with the other data, and obtaining a joint data table based on the joint data set;
and performing data feature extraction on the joint data to obtain second data characteristic information.
By adopting this technical scheme, the data content information of the data source tables is obtained by analyzing the first data characteristic information, and data with repeated content information is extracted according to that data content information. The data with repeated content information is merged so that only one copy of each repeated item is left. The merged data and the other data are then sorted and collected to form joint data, and the joint data is collected to form the joint data table, which realizes the joining of the data source tables. The joint data in the joint data table is analyzed to extract the second data characteristic information, which is then used for the joint verification of the data source tables.
The application may be further configured in a preferred example to: the comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result specifically includes:
acquiring a first data total amount based on the first data characteristic information, and acquiring a second data total amount based on the second data characteristic information;
and comparing whether the first data total amount is the same as the second data total amount, and forming a data comparison result according to the comparison result.
By adopting this technical scheme, the total data amount of each data source table is obtained by analyzing the first data characteristic information, and the total data amount of the joint data table is obtained by analyzing the second data characteristic information. The two are compared to judge whether the total data amount of the joint data table, formed by joining the data source tables, is consistent with the total data amount of the data in the data source tables, and the comparison result of the total data amounts realizes the data verification function.
The application may be further configured in a preferred example to: the acquiring a first total amount of data based on the first data characteristic information and acquiring a second total amount of data based on the second data characteristic information specifically includes:
acquiring a plurality of first data characteristic segments according to the first data characteristic information, counting the number of the first data characteristic segments, acquiring first data content based on the first data characteristic segments, and taking the first data content and the number of the first data characteristic segments as a first data total amount;
and acquiring a plurality of second data characteristic segments according to the second data characteristic information, counting the number of the second data characteristic segments, acquiring second data content based on the second data characteristic segments, and taking the second data content and the number of the second data characteristic segments as a second data total amount.
By adopting this technical scheme, the first data characteristic information is analyzed to obtain a plurality of first data characteristic segments, and the number of the first data characteristic segments is counted. The first data content is identified through the first data characteristic segments, and the number of the first data characteristic segments together with the first data content forms the first data total amount, which realizes the data calculation and statistics functions for the plurality of data source tables. Likewise, the second data characteristic information is analyzed to obtain a plurality of second data characteristic segments, and the number of the second data characteristic segments is counted. The second data content is identified through the second data characteristic segments, and the number of the second data characteristic segments together with the second data content forms the second data total amount, which realizes the data calculation and statistics functions for the joint data table.
The present application may be further configured in a preferred example to: the big data multi-table data joint verification method further comprises the following steps:
acquiring a second data source table of the same data source as the plurality of data source tables, and acquiring third data characteristic information based on the second data source table;
and comparing the first data characteristic information with third data characteristic information, and judging whether the data of the second data source table is consistent with the data of the plurality of data source tables according to the comparison result.
By adopting this technical scheme, when big data is migrated or transferred to a second terminal device, a second data source table from the same data source as the plurality of data source tables is obtained. The data in the second data source table is sorted and analyzed to obtain third data characteristic information, which is compared with the first data characteristic information to judge whether the data of the second data source table is consistent with the data of the plurality of data source tables, so that data is verified during the migration process.
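The migration check described above can be illustrated with a minimal Python sketch. The source does not specify what the characteristic information contains or how it is compared, so this sketch assumes that it is a multiset of row contents; the names `row_key` and `migration_diff` are hypothetical and introduced only for illustration.

```python
from collections import Counter

def row_key(row):
    """Canonical, hashable form of a row (assumes hashable field values)."""
    return tuple(sorted(row.items()))

def migration_diff(source_rows, migrated_rows):
    """Compare row contents before and after migration as multisets."""
    before = Counter(row_key(r) for r in source_rows)   # assumed first characteristic info
    after = Counter(row_key(r) for r in migrated_rows)  # assumed third characteristic info
    missing = before - after  # rows lost during migration
    extra = after - before    # rows that appeared only after migration
    return {
        "consistent": not missing and not extra,
        "missing": sum(missing.values()),
        "extra": sum(extra.values()),
    }

source = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
migrated_ok = [{"name": "b", "id": 2}, {"id": 1, "name": "a"}]
migrated_bad = [{"id": 1, "name": "a"}]
print(migration_diff(source, migrated_ok))
print(migration_diff(source, migrated_bad))
```

Comparing multisets rather than plain counts means that a lost row and an unrelated extra row cannot cancel each other out.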
The second purpose of the present application is achieved by the following technical solution:
a big data multi-table data joint verification device comprises:
the data source table module is used for acquiring respective data of a plurality of data source tables from a preset database, wherein the plurality of data source tables comprise an initial data table, an intermediate data table and an application data table;
the first data characteristic information acquisition module is used for acquiring first data characteristic information based on data in the data source tables;
the multi-table combination module is used for performing joint processing on the data source tables based on the first data characteristic information to obtain a joint data table, and acquiring second data characteristic information according to data in the joint data table;
and the data verification module is used for comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result, and judging whether the data of the joint data table is consistent with the data of the plurality of data source tables or not according to the data comparison result.
By adopting this technical scheme, when the big data of an enterprise is verified, a plurality of data source tables, comprising an initial data table, an intermediate data table and an application data table, are obtained from a pre-constructed database. The three tables are each analyzed to obtain the first data characteristic information, which is then used to perform joint processing on them: duplicate data found across the data source tables is removed to form a joint data table, so that duplicate data is not verified repeatedly. The data in the joint data table is analyzed to obtain the second data characteristic information. The first data characteristic information is compared with the second data characteristic information, and whether the data content of the joint data table is consistent with the data content of the data source tables is judged according to the comparison result, which completes the data verification process. Because verification is performed after the data source tables have been jointly processed, the time spent verifying the big data is reduced and data verification efficiency is improved.
The third purpose of the present application is achieved by the following technical solutions:
a computer device comprises a memory, a processor and a computer program which is stored in the memory and can run on the processor, wherein the processor realizes the steps of the big data multi-table data joint checking method when executing the computer program.
The fourth purpose of the present application is achieved by the following technical solutions:
a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the steps of the big data multi-table data joint verification method are implemented.
In summary, the present application includes at least one of the following beneficial technical effects:
1. The initial data table, the intermediate data table and the application data table are jointly processed, and duplicate data found across the data source tables is removed to form a joint data table, so that duplicate data is not verified repeatedly. Because data verification is performed after the data source tables have been jointly processed, the time spent verifying the big data is reduced and data verification efficiency is improved;
2. The data in the database is collected to obtain the initial data table. The initial data in the initial data table is aggregated to form aggregated data, with repeated data consolidated into intermediate data, which reduces repeated processing of the same data and effectively improves data verification efficiency. The aggregated data is collected into the intermediate data table, the aggregated data in the intermediate data table is classified, and aggregated data of the same type is collected into the application data table. The data in the database thus forms a layered structure of initial data table, intermediate data table and application data table, which makes it convenient to store massive amounts of data and to verify the plurality of data source tables;
3. When big data is migrated or transferred to a second terminal device, a second data source table from the same data source as the plurality of data source tables is obtained. The data in the second data source table is sorted and analyzed to obtain third data characteristic information, which is compared with the first data characteristic information to judge whether the data of the second data source table is consistent with the data of the plurality of data source tables, so that data is verified during the migration process.
Drawings
FIG. 1 is a flowchart illustrating a big data multi-table data joint verification method according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating an implementation of step S10 in a big data multi-table data joint verification method according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating an implementation of step S30 in a big data multi-table data joint verification method according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating an implementation of step S40 in a big data multi-table data joint verification method according to an embodiment of the present application;
FIG. 5 is a flowchart illustrating an implementation of step S41 in a big data multi-table data joint verification method according to an embodiment of the present application;
FIG. 6 is a flowchart illustrating another implementation of a big data multi-table data joint verification method according to an embodiment of the present application;
FIG. 7 is a schematic block diagram of a big data multi-table data joint verification system according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a computer device in an embodiment of the present application.
Detailed Description
The present application is described in further detail below with reference to the attached drawings.
In an embodiment, as shown in fig. 1, the present application discloses a big data multi-table data joint verification method, which specifically includes the following steps:
S10: Acquiring respective data of a plurality of data source tables in a preset database, wherein the plurality of data source tables comprise an initial data table, an intermediate data table and an application data table.
In the present embodiment, the plurality of data source tables refer to an initial data table (ODS layer table), an intermediate data table (DW layer table), and an application data table (ADS layer table).
Specifically, in a big data database, the initial source data is stored in the initial data table; the initial source data in the initial data table is processed and the processed data is stored in the intermediate data table; and the data in the intermediate data table is sorted and classified, with data of the same type stored in the application data table.
S20: Acquiring first data characteristic information based on the data in the plurality of data source tables.
In this embodiment, the first data characteristic information refers to data content and data amount information in each data source table.
Specifically, the data in the initial data table, the intermediate data table and the application data table are analyzed and sorted respectively to obtain the data content and the data amount information in each data table.
S30: Performing joint processing on the data source tables based on the first data characteristic information to obtain a joint data table, and acquiring second data characteristic information according to data in the joint data table.
In this embodiment, the joint processing refers to merging and removing duplicate data, the joint data table refers to a data set table that retains a single copy of each duplicated item together with the other, non-duplicated data, and the second data characteristic information refers to the data content and data amount information of the data in the joint data table.
Specifically, the data content information of the plurality of data source tables is obtained by analyzing the first data characteristic information, and data with repeated content information is extracted according to that data content information. The data with repeated content information is merged so that only one copy of each repeated item is left, and this copy together with the other data forms the joint data table. The data in the joint data table is then analyzed and sorted to obtain the data content and data amount information of the data in the joint data table.
S40: Comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result, and judging whether the data of the joint data table is consistent with the data of the plurality of data source tables according to the data comparison result.
Specifically, the data content and the data volume information of the data in the initial data table, the intermediate data table and the application data table are respectively compared with the data content and the data volume information of the data in the joint data table, and whether the data content of the joint data table is consistent with the data content in the data source tables or not is judged according to the comparison result, so that the data verification process is completed.
In this embodiment, when the big data of an enterprise is verified, a plurality of data source tables, comprising an initial data table, an intermediate data table and an application data table, are obtained from a pre-constructed database. The three tables are each analyzed to obtain the first data characteristic information, which is then used to perform joint processing on them: duplicate data found across the data source tables is removed to form a joint data table, so that duplicate data is not verified repeatedly. The data in the joint data table is analyzed to obtain the second data characteristic information. The first data characteristic information is compared with the second data characteristic information, and whether the data content of the joint data table is consistent with the data content of the data source tables is judged according to the comparison result, which completes the data verification process. Because data verification is performed after the data source tables have been jointly processed, the time spent verifying the big data is reduced and data verification efficiency is improved.
In an embodiment, as shown in fig. 2, in step S10, that is, acquiring data of each of the plurality of data source tables in the preset database specifically includes:
S11: Acquiring initial source data in a preset database, and collecting the initial source data to form an initial data table.
In the present embodiment, the initial source data refers to the basic data in the big data.
Specifically, the basic data in the database is sorted and aggregated to form an initial data table.
S12: Performing aggregation processing on the initial source data to obtain aggregated data, and forming an intermediate data table based on the aggregated data set.
In this embodiment, the aggregation processing refers to consolidating repeated data into intermediate data, and the aggregated data refers to the data obtained by aggregating the basic data.
Specifically, the basic data in the initial data table is analyzed: repeated data in the basic data is consolidated into intermediate data, and null data, dirty data and outliers are removed from the basic data to form aggregated data. This reduces repeated processing of the same data and effectively improves data verification efficiency. The aggregated data is then collected to form the intermediate data table.
S13: Acquiring data type information based on the aggregated data, and classifying the data according to the data type information to form an application data table.
In the present embodiment, the data type information refers to data application direction information.
Specifically, the aggregated basic data is sorted according to the data application direction information and collected into the application data table by application direction, so that technical personnel can call the application data directly.
Furthermore, the data in the database forms a layered structure of initial data table, intermediate data table and application data table, which makes it convenient to store massive amounts of data in the database and to verify the plurality of data source tables.
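Steps S11 to S13 can be illustrated with a minimal Python sketch, assuming that each record is a dictionary and that the application direction is stored in a field. The field name `app_type`, the validity rule and all function names are hypothetical, since the source does not define how dirty data or outliers are detected. The sketch also produces one application table per direction, which is one possible reading of the application data table.

```python
from collections import defaultdict

def build_initial_table(source_records):
    """S11: collect the raw source records into the initial (ODS layer) table."""
    return [dict(record) for record in source_records]

def build_intermediate_table(initial_table, is_valid):
    """S12: drop null or invalid rows and keep one copy of each repeated row."""
    seen = set()
    aggregated = []
    for row in initial_table:
        if any(value is None for value in row.values()) or not is_valid(row):
            continue  # null data, or data rejected by the caller's rule
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            aggregated.append(row)
    return aggregated

def build_application_tables(intermediate_table, type_field="app_type"):
    """S13: group aggregated rows by application direction (ADS layer)."""
    tables = defaultdict(list)
    for row in intermediate_table:
        tables[row[type_field]].append(row)
    return dict(tables)

records = [
    {"id": 1, "amount": 10, "app_type": "sales"},
    {"id": 1, "amount": 10, "app_type": "sales"},    # repeated row
    {"id": 2, "amount": None, "app_type": "sales"},  # null value
    {"id": 3, "amount": -5, "app_type": "risk"},     # treated as dirty in this example
    {"id": 4, "amount": 7, "app_type": "risk"},
]
ods = build_initial_table(records)
dw = build_intermediate_table(ods, is_valid=lambda row: row["amount"] >= 0)
ads = build_application_tables(dw)
```

Passing the validity rule in as a function keeps the cleaning criteria, which the source leaves open, separate from the layering logic.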
In an embodiment, as shown in fig. 3, in step S30, that is, performing joint processing on a plurality of data source tables based on the first data characteristic information to obtain a joint data table, and acquiring second data characteristic information according to data in the joint data table specifically includes:
S31: Acquiring first data content information according to the first data characteristic information, and acquiring a repeated data set based on the first data content information.
In this embodiment, the first data content information refers to the data contents in the plurality of data source tables, and the repeated data set refers to the data with the same data content.
Specifically, the data in each data source table is analyzed, the data contents in the plurality of data source tables are extracted, and the data with the same data content are sorted out according to the data contents in the plurality of data source tables to form a repeated data set.
S32: Acquiring, according to the repeated data set, the data whose content information is repeated, merging that data, forming joint data from the merged data together with the other data, and acquiring a joint data table based on the joint data set.
Specifically, the data with repeated content information is extracted according to the data content information and merged so that only one copy of each repeated item is left. The merged data and the other data are sorted and collected to form joint data, and the joint data is collected to form the joint data table.
S33: Performing data feature extraction on the joint data to obtain second data characteristic information.
Specifically, data analysis is performed on the joint data in the joint data table, and the data content and data volume information of the data in the joint data table are extracted.
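Steps S31 to S33 can be illustrated with a minimal Python sketch, assuming that two rows are duplicates when all of their fields are equal. The source does not define the matching rule, and the helper names below are hypothetical.

```python
from collections import Counter

def row_key(row):
    """Canonical, hashable form of a row (assumes hashable field values)."""
    return tuple(sorted(row.items()))

def find_repeated(source_tables):
    """S31: content that occurs more than once across all source tables."""
    counts = Counter(row_key(r) for rows in source_tables.values() for r in rows)
    return {key for key, n in counts.items() if n > 1}

def build_joint_table(source_tables):
    """S32: keep one copy of each repeated row plus all non-repeated rows."""
    repeated = find_repeated(source_tables)
    emitted = set()
    joint = []
    for rows in source_tables.values():
        for row in rows:
            key = row_key(row)
            if key in repeated:
                if key in emitted:
                    continue  # a copy of this repeated row is already kept
                emitted.add(key)
            joint.append(row)
    return joint

def second_features(joint):
    """S33: data amount and data content of the joint table."""
    return {"amount": len(joint), "contents": {row_key(r) for r in joint}}

a, b, c = {"id": 1, "v": "x"}, {"id": 2, "v": "y"}, {"id": 3, "v": "z"}
sources = {"ods": [a, b], "dw": [b, c], "ads": [c]}
joint = build_joint_table(sources)
features = second_features(joint)
```

In this example `b` and `c` each appear in two source tables, so the joint table keeps one copy of each alongside `a`.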
In an embodiment, as shown in fig. 4, in step S40, comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result, specifically including:
S41: Acquiring a first data total amount based on the first data characteristic information, and acquiring a second data total amount based on the second data characteristic information.
In this embodiment, the first total amount of data refers to the amount and type of data in the multiple data source tables, and the second total amount of data refers to the amount and type of data in the joint data table.
Specifically, the data volume and the data type of the data in the plurality of data source tables are obtained by analyzing the first data characteristic information, and the data volume and the data type of the data in the joint data table are obtained by analyzing the second data characteristic information.
S42: Comparing whether the first data total amount is the same as the second data total amount, and forming a data comparison result according to the comparison result.
Specifically, the data verification function is realized by comparing whether the data volume and data type of the data in the plurality of data source tables are completely consistent with the data volume and data type of the data in the joint data table.
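The comparison in S41 and S42 can be illustrated with a minimal Python sketch. The source does not say how duplicates across source tables are counted, so this sketch assumes that the first data total amount is taken over distinct source rows while the second is taken over the joint table as-is. The field name `type` and the function names are hypothetical.

```python
from collections import Counter

def row_key(row):
    """Canonical, hashable form of a row (assumes hashable field values)."""
    return tuple(sorted(row.items()))

def data_total(rows, type_field="type", distinct=False):
    """Data amount plus a per-type row count; optionally over distinct rows."""
    if distinct:
        rows = list({row_key(r): r for r in rows}.values())
    return {"amount": len(rows), "types": Counter(r[type_field] for r in rows)}

def compare_totals(source_tables, joint_rows, type_field="type"):
    """S41/S42: compare source totals (distinct rows) with joint-table totals."""
    all_source_rows = [r for rows in source_tables for r in rows]
    first = data_total(all_source_rows, type_field, distinct=True)
    second = data_total(joint_rows, type_field)
    amount_match = first["amount"] == second["amount"]
    type_match = first["types"] == second["types"]
    return {
        "amount_match": amount_match,
        "type_match": type_match,
        "consistent": amount_match and type_match,
    }

a = {"id": 1, "type": "order"}
b = {"id": 2, "type": "order"}
c = {"id": 3, "type": "user"}
print(compare_totals([[a, b], [b, c]], [a, b, c]))
print(compare_totals([[a, b], [b, c]], [a, b]))
```

Because the joint table is counted without deduplication, a joint table that still contains a duplicate row fails the amount check under this assumption.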
In an embodiment, as shown in fig. 5, step S41, that is, acquiring a first data total amount based on the first data characteristic information and acquiring a second data total amount based on the second data characteristic information, specifically includes:
S411: Acquiring a plurality of first data characteristic segments according to the first data characteristic information, counting the number of the first data characteristic segments, acquiring first data content based on the first data characteristic segments, and taking the first data content and the number of the first data characteristic segments as the first data total amount.
Specifically, the first data characteristic information is analyzed to obtain a plurality of first data characteristic segments, the number of the first data characteristic segments is counted, meanwhile, first data content is identified through the first data characteristic segments, and a first data total amount is formed by the number of the first data characteristic segments and the first data content.
S412: Acquiring a plurality of second data characteristic segments according to the second data characteristic information, counting the number of the second data characteristic segments, acquiring second data content based on the second data characteristic segments, and taking the second data content and the number of the second data characteristic segments as the second data total amount.
Specifically, the second data characteristic information is analyzed to obtain a plurality of second data characteristic segments, the number of the second data characteristic segments is counted, the second data content is identified through the second data characteristic segments, and a second data total amount is formed by the number of the second data characteristic segments and the second data content, so that a data calculation and counting function of the combined data table is realized.
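The application does not specify what a data characteristic segment looks like. One possible reading of steps S411 and S412, offered only as an assumption, is that each segment is a fingerprint of one distinct record. The sketch below uses a SHA-256 digest of the record's `repr` as the segment; the names `feature_segments` and `data_total` are hypothetical.

```python
import hashlib
from typing import FrozenSet, Hashable, List, Tuple

Row = Tuple[Hashable, ...]  # hypothetical row representation


def feature_segments(rows: List[Row]) -> List[str]:
    """One segment per distinct record: the SHA-256 digest of its repr (an assumption)."""
    digests = {hashlib.sha256(repr(row).encode("utf-8")).hexdigest() for row in rows}
    return sorted(digests)


def data_total(segments: List[str]) -> Tuple[int, FrozenSet[str]]:
    """Combine the number of segments with the content they identify."""
    return len(segments), frozenset(segments)


source_rows = [("a", 1), ("b", 2), ("b", 2)]  # data from the source tables
joint_rows = [("b", 2), ("a", 1)]             # data from the joint data table

first_total = data_total(feature_segments(source_rows))
second_total = data_total(feature_segments(joint_rows))
totals_match = first_total == second_total
```

Comparing digests rather than full records keeps the comparison cheap for wide rows, at the cost of depending on a stable textual representation of each record.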
In an embodiment, as shown in fig. 6, the big data multi-table data joint verification method further includes the following steps:
S50: Acquiring a second data source table from the same data source as the plurality of data source tables, and acquiring third data characteristic information based on the second data source table.
In this embodiment, the second data source table refers to a copy of the data source table generated in the data migration process, and the third data characteristic information refers to data content and data amount information of data in the copy of the data source table.
Specifically, in the process of data migration or transfer of the big data to the second terminal device, the multiple data source tables are copied and backed up to form a copy of the data source tables, and data analysis is performed on the copy of the data source tables to obtain data content and data volume information of data in the copy of the data source tables, so that data verification is performed on the copy of the data source tables and the multiple data source tables.
S60: Comparing the first data characteristic information with the third data characteristic information, and judging whether the data of the second data source table is consistent with the data of the plurality of data source tables according to the comparison result.
Specifically, the data content and the data volume information of the data in the copy of the data source table are compared with the data content and the data volume information in each data source table, whether the data of the copy of the data source table is consistent with the data of the multiple data source tables is judged, and the function of verifying the data in the data migration process is achieved.
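A minimal sketch of steps S50 and S60 is given below, assuming that the source tables and their migrated copies are both available as named collections of rows. The function names `table_features` and `verify_copy`, the table names, and the per-table report format are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple

Row = Tuple[Hashable, ...]  # hypothetical row representation


def table_features(rows: List[Row]) -> Tuple[int, Counter]:
    """Data volume (row count) and data content (multiset of rows, order-insensitive)."""
    return len(rows), Counter(rows)


def verify_copy(
    sources: Dict[str, List[Row]], copies: Dict[str, List[Row]]
) -> Dict[str, bool]:
    """For each source table, report whether its migrated copy is consistent with it."""
    report: Dict[str, bool] = {}
    for name, rows in sources.items():
        copied = copies.get(name)
        report[name] = copied is not None and table_features(rows) == table_features(copied)
    return report


sources = {
    "initial": [("a", 1), ("b", 2)],
    "intermediate": [("b", 2)],
}
copies = {
    "initial": [("b", 2), ("a", 1)],  # same rows, different order
    "intermediate": [],               # a row was lost during migration
}

report = verify_copy(sources, copies)
```

Reporting per table rather than returning a single flag makes it easier to locate which copy diverged during migration.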
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
In an embodiment, a big data multi-table data joint verification apparatus is provided, which corresponds one-to-one to the big data multi-table data joint verification method in the above embodiments. As shown in fig. 7, the big data multi-table data joint verification apparatus includes a data source table module, a first data characteristic information obtaining module, a multi-table joint module, and a data verification module. The functional modules are explained in detail as follows:
the data source table module is used for acquiring respective data of a plurality of data source tables from a preset database, wherein the plurality of data source tables comprise an initial data table, an intermediate data table and an application data table;
the first data characteristic information acquisition module is used for acquiring first data characteristic information based on data in the data source tables;
the multi-table combination module is used for performing combination processing on the data source tables based on the first data characteristic information to obtain a combination data table and acquiring second data characteristic information according to data in the combination data table;
and the data verification module is used for comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result, and judging whether the data of the combined data table is consistent with the data of the plurality of data source tables or not according to the data comparison result.
For specific limitations of the big data multi-table data joint verification apparatus, reference may be made to the above limitations of the big data multi-table data joint verification method, and details are not described here again. All or part of each module in the big data multi-table data joint verification apparatus can be realized by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or can be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing a plurality of data source tables, a joint data table, first data characteristic information and second data characteristic information. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to realize a big data multi-table data joint checking method.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps of the big data multi-table data joint verification method when executing the computer program:
acquiring respective data of a plurality of data source tables in a preset database, wherein the plurality of data source tables comprise an initial data table, an intermediate data table and an application data table;
acquiring first data characteristic information based on data in a plurality of data source tables;
performing joint processing on the data source tables based on the first data characteristic information to obtain a joint data table, and acquiring second data characteristic information according to data in the joint data table;
and comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result, and judging whether the data of the combined data table is consistent with the data of the plurality of data source tables according to the data comparison result.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon which, when executed by a processor, performs the following steps of the big data multi-table data joint verification method:
acquiring respective data of a plurality of data source tables in a preset database, wherein the plurality of data source tables comprise an initial data table, an intermediate data table and an application data table;
acquiring first data characteristic information based on data in a plurality of data source tables;
performing joint processing on the data source tables based on the first data characteristic information to obtain a joint data table, and acquiring second data characteristic information according to data in the joint data table;
and comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result, and judging whether the data of the joint data table is consistent with the data of the plurality of data source tables or not according to the data comparison result.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A big data multi-table data joint verification method, characterized by comprising the following steps:
acquiring respective data of a plurality of data source tables in a preset database, wherein the plurality of data source tables comprise an initial data table, an intermediate data table and an application data table;
acquiring first data characteristic information based on data in a plurality of data source tables;
performing joint processing on the data source tables based on the first data characteristic information to obtain a joint data table, and acquiring second data characteristic information according to data in the joint data table;
and comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result, and judging whether the data of the combined data table is consistent with the data of the plurality of data source tables according to the data comparison result.
2. The big data multi-table data joint verification method according to claim 1, wherein the obtaining of the respective data of the plurality of data source tables in the preset database specifically includes:
acquiring initial source data in a preset database, and sorting and collecting the initial source data to form an initial data table;
performing aggregation processing on the initial source data to obtain aggregated data, and forming an intermediate data table based on the aggregated data set;
and acquiring data type information based on the aggregated data, and classifying the data according to the data type information to form an application data table.
3. The big data multi-table data joint verification method according to claim 1, wherein the joint processing is performed on the plurality of data source tables based on the first data characteristic information to obtain a joint data table, and second data characteristic information is obtained according to data in the joint data table, specifically including:
acquiring first data content information according to the first data characteristic information, and acquiring a repeated data set based on the first data content information;
acquiring content information repeated data according to the repeated data set, combining the content information repeated data, forming combined data by the data obtained by combining the content information repeated data and other data, and obtaining a combined data table based on the combined data set;
and performing data characteristic extraction on the combined data to obtain second data characteristic information.
4. The big data multi-table data joint verification method according to claim 1, wherein the comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result specifically comprises:
acquiring a first data total amount based on the first data characteristic information, and acquiring a second data total amount based on the second data characteristic information;
and comparing whether the first data total amount is the same as the second data total amount, and forming a data comparison result according to the comparison result.
5. The big data multi-table data joint verification method according to claim 4, wherein the obtaining of the first total data amount based on the first data characteristic information and the obtaining of the second total data amount based on the second data characteristic information specifically include:
acquiring a plurality of first data characteristic segments according to the first data characteristic information, counting the number of the first data characteristic segments, acquiring first data content based on the first data characteristic segments, and taking the first data content and the number of the first data characteristic segments as a first data total amount;
and acquiring a plurality of second data characteristic segments according to the second data characteristic information, counting the number of the second data characteristic segments, acquiring second data content based on the second data characteristic segments, and combining the second data content and the number of the second data characteristic segments as a total second data amount.
6. The big data multi-table data joint verification method according to claim 1, wherein the big data multi-table data joint verification method further comprises the steps of:
acquiring a second data source table of the same data source as the plurality of data source tables, and acquiring third data characteristic information based on the second data source table;
and comparing the first data characteristic information with third data characteristic information, and judging whether the data of the second data source table is consistent with the data of the plurality of data source tables according to the comparison result.
7. A big data multi-table data joint verification device is characterized by comprising:
the data source table module is used for acquiring respective data of a plurality of data source tables from a preset database, wherein the plurality of data source tables comprise an initial data table, an intermediate data table and an application data table;
the first data characteristic information acquisition module is used for acquiring first data characteristic information based on data in the data source tables;
the multi-table combination module is used for performing combination processing on the data source tables based on the first data characteristic information to obtain a combined data table, and acquiring second data characteristic information according to data in the combined data table;
and the data verification module is used for comparing the first data characteristic information with the second data characteristic information to obtain a data comparison result, and judging whether the data of the combined data table is consistent with the data of the plurality of data source tables or not according to the data comparison result.
8. The big data multi-table data joint verification device according to claim 7, further comprising:
and the second data source table module is used for acquiring a second data source table from the same data source as the plurality of data source tables, acquiring third data characteristic information based on the second data source table, comparing the first data characteristic information with the third data characteristic information, and judging whether the data of the second data source table is consistent with the data of the plurality of data source tables according to the comparison result.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the big data multi-table data joint verification method according to any one of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, which stores a computer program, wherein the computer program, when executed by a processor, implements the steps of the big data multi-table data joint verification method according to any one of claims 1 to 6.
CN202211630654.3A 2022-12-19 2022-12-19 Big data multi-table data joint verification method, system, equipment and medium Withdrawn CN115858625A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211630654.3A CN115858625A (en) 2022-12-19 2022-12-19 Big data multi-table data joint verification method, system, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211630654.3A CN115858625A (en) 2022-12-19 2022-12-19 Big data multi-table data joint verification method, system, equipment and medium

Publications (1)

Publication Number Publication Date
CN115858625A true CN115858625A (en) 2023-03-28

Family

ID=85674029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211630654.3A Withdrawn CN115858625A (en) 2022-12-19 2022-12-19 Big data multi-table data joint verification method, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN115858625A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117453743A (en) * 2023-11-09 2024-01-26 继善(广东)科技有限公司 Multi-table data joint analysis method, system, equipment and medium based on big data

Similar Documents

Publication Publication Date Title
CN108197306B (en) SQL statement processing method and device, computer equipment and storage medium
CN112559365A (en) Test case screening method and device, computer equipment and storage medium
US10496459B2 (en) Automated software program repair candidate selection
CN110956269A (en) Data model generation method, device, equipment and computer storage medium
JP7404839B2 (en) Identification of software program defect location
CN111767350A (en) Data warehouse testing method and device, terminal equipment and storage medium
CN115858625A (en) Big data multi-table data joint verification method, system, equipment and medium
CN112559364A (en) Test case generation method and device, computer equipment and storage medium
Amusuo et al. Reflections on software failure analysis
CN111367782A (en) Method and device for automatically generating regression test data
CN117391306A (en) Homeland space planning result examination method, device, equipment and storage medium
CN112181845A (en) Interface testing method and device
CN109542947B (en) Data statistical method, device, computer equipment and storage medium
CN114090462B (en) Software repeated defect identification method and device, computer equipment and storage medium
CN113448867B (en) Software pressure testing method and device
CN114971926A (en) Premium calculation model test method, system, device and storage medium
CN115393128A (en) Panoramic analysis method and device for intellectual property, computer equipment and medium
CN114996151A (en) Interface testing method and device, electronic equipment and readable storage medium
CN113849484A (en) Big data component upgrading method and device, electronic equipment and storage medium
CN117453743A (en) Multi-table data joint analysis method, system, equipment and medium based on big data
CN113282496A (en) Automatic interface test method, device, equipment and storage medium
CN112181838B (en) Automatic testing method based on image comparison
US11347722B2 (en) Big data regression verification method and big data regression verification apparatus
CN112347095B (en) Data table processing method, device and server
CN110727582B (en) Program testing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: 518000 2201, block D, building 1, Chuangzhi Yuncheng bid section 1, Liuxian Avenue, Xili community, Xili street, Nanshan District, Shenzhen City, Guangdong Province
Applicant after: Qizhi Technology Co.,Ltd.
Address before: 518000 2201, block D, building 1, Chuangzhi Yuncheng bid section 1, Liuxian Avenue, Xili community, Xili street, Nanshan District, Shenzhen City, Guangdong Province
Applicant before: Qizhi Network Technology Co.,Ltd.
WW01 Invention patent application withdrawn after publication
Application publication date: 20230328