CN116701375A

CN116701375A - Data real-time checking method and device, electronic equipment and storage medium

Info

Publication number: CN116701375A
Application number: CN202310729761.XA
Authority: CN
Inventors: 李宏元; 郑浩; 侯鹏
Original assignee: Peoples Insurance Company of China
Current assignee: Peoples Insurance Company of China
Priority date: 2023-06-19
Filing date: 2023-06-19
Publication date: 2023-09-05

Abstract

In the method, part or all of the data stream of a source database is written, as a target data stream, into storage data units of different links, and a check database uses the digest values corresponding to the target data stream written into the storage data units of different links to check whether the target data stream was successfully written into every storage data unit, so that the target data stream written into the storage data units of multiple links is checked in real time. The method comprises the following steps: obtaining a data stream from a source database; writing the target data stream into storage data units of different links; determining the digest values corresponding to the target data stream written into the storage data units of different links; and sending all the digest values to the check database so that the check database executes a data checking method.

Description

Data real-time checking method and device, electronic equipment and storage medium

Technical Field

The present application relates to the field of data real-time verification, and in particular, to a data real-time verification method, apparatus, electronic device, and storage medium.

Background

Currently, a terminal device may provide a variety of services to a user; for example, it may allow the user to view and search data. To do so, the terminal device needs to continuously acquire data from a source database and store it in the storage data units that are called when the service is implemented. This process may involve a plurality of storage data units, such as message queues and databases. To ensure that service data is timely and accurate, the data written into the plurality of storage data units needs to be checked so that write problems are found promptly.

In the related art, data in two databases is checked through an ETL (extract, transform, load) process in which the data is compared manually. This approach requires manual intervention and is prone to human error and omissions.

Therefore, how to verify the data in multiple storage data units in real time is a technical problem to be solved by those skilled in the art.

Disclosure of Invention

The application provides a data real-time checking method and device, an electronic device, and a storage medium. Part or all of the data stream of a source database is written, as a target data stream, into storage data units of different links, and a check database uses the digest values corresponding to the target data stream written into the storage data units of different links to check whether the target data stream was successfully written into every storage data unit, so that the target data stream written into the storage data units of multiple links is checked in real time.

In a first aspect, an embodiment of the present application provides a method for checking data in real time, including:

obtaining a data stream from a source database; writing a target data stream into storage data units of different links, wherein the target data stream is part or all of the data stream; determining digest values corresponding to the target data stream written into the storage data units of different links;

transmitting all the digest values to a check database to cause the check database to execute a data checking method, wherein the data checking method comprises: storing the digest values corresponding to the target data stream; comparing whether the digest values are the same, and counting the number of digest values; and if the digest values are the same and the number of digest values is equal to the total number of links, determining that the target data stream was successfully written into all of the storage data units.

In some embodiments, the target data stream includes primary key information; the method further comprises the steps of:

sending the primary key information to the check database, so that the check database receives the primary key information before executing the data checking method and preliminarily detects, according to the primary key information, whether the target data stream was successfully written into the storage data units of different links.

In some embodiments, the step in which the check database preliminarily detects, according to the primary key information, whether the target data stream was successfully written into the storage data units of different links includes:

if the primary key information sent by the storage data units of different links is the same, preliminarily determining that the target data stream was successfully written into the storage data units of different links; and

if the primary key information sent by the storage data units of different links is different, or the number of pieces of primary key information received by the check database is smaller than the total number of links, repeating the step of obtaining the data stream from the source database.

In some embodiments, the target data stream further comprises field information; the step of determining the digest value corresponding to the target data stream written into the storage data unit of the different link includes:

detecting whether field information of target data streams written into storage data units of different links has intersection content or not;

if the intersection content exists, determining a digest value corresponding to the target data stream according to the primary key information and the intersection content;

and if the intersection content does not exist, determining the digest value corresponding to the target data stream according to the primary key information.

In some embodiments, the step of determining a digest value corresponding to the target data stream comprises: determining the digest value corresponding to the target data stream according to the primary key information.

In some embodiments, the method further comprises:

generating a unit identifier corresponding to the stored data unit;

sending the unit identifier to the check database, so that, if the number of digest values is smaller than the total number of links, the check database uses the unit identifiers to determine which storage data units the target data stream was not successfully written into.

In some embodiments, after the step of determining, through the unit identifiers, the storage data units into which the target data stream was not successfully written, the method further comprises: deleting the digest values and the unit identifiers stored in the check database, and repeating the step of obtaining the data stream from the source database.
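The use of unit identifiers can be illustrated with a minimal sketch. It assumes that the check database keeps each digest value keyed by the unit identifier that reported it; the function name `find_unwritten_units` and the identifier strings are hypothetical and do not come from the application.

```python
def find_unwritten_units(expected_units, received):
    """Return the unit identifiers that reported no digest value.

    expected_units: identifiers of every storage data unit (one per link).
    received: mapping of unit identifier -> digest value held by the check database.
    """
    if len(received) >= len(expected_units):
        return []  # every link reported a digest value
    return [unit for unit in expected_units if unit not in received]


# Hypothetical identifiers for four links; the fourth never reported.
units = ["kafka-1", "hbase", "kafka-2", "es"]
stored = {"kafka-1": "d1", "hbase": "d1", "kafka-2": "d1"}
missing = find_unwritten_units(units, stored)

# Delete the stored digest values and identifiers before the stream is re-acquired.
stored.clear()
```

In this sketch the caller would re-run the acquisition step after clearing the stored values; how the retry is triggered is left open.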

In some embodiments, the step of writing the target data stream into the stored data units of the different links includes:

writing the target data stream into a message queue;

and sending the target data stream in the message queue to different service databases.

In a second aspect, an embodiment of the present application further provides a data real-time checking device, including:

an acquisition unit for acquiring a data stream from a source database;

the writing unit is used for writing the target data stream into the storage data units of different links, wherein the target data stream is a part or all of the data stream;

a determining unit, configured to determine a digest value corresponding to the target data stream written into the storage data unit of the different links;

a transmitting unit, configured to transmit all the digest values to a check database so that the check database executes a data checking method, wherein the data checking method comprises: storing the digest values corresponding to the target data stream; comparing whether the digest values are the same, and counting the number of digest values; and if the digest values are the same and the number of digest values is equal to the total number of links, determining that the target data stream was successfully written into all of the storage data units.

In a third aspect, an embodiment of the present application further provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the data real-time checking method when executing the computer program.

In a fourth aspect, embodiments of the present application also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data real-time checking method.

The data real-time checking method can check whether the target data stream was successfully written into the storage data units of a plurality of links, shields the differences between databases, and finds data stream problems promptly. Because the data stream is written into the storage data units in real time, the terminal device responds to service requests faster, which helps enterprises respond to service demands more quickly. Real-time data processing and checking automatically detect anomalies and errors in the data stream, reducing data quality problems. Real-time data acquisition, processing, and checking help avoid data problems such as data delay and data errors, and improve data quality and accuracy. The check database can support a large number of data streams, and its processing capacity can be easily extended to accommodate growth in the amount of data delivered from the source database.

Drawings

FIG. 1 illustrates a flow chart of a data real-time checking method provided in accordance with some embodiments;

FIG. 2 illustrates a data flow diagram provided in accordance with some embodiments;

FIG. 3 illustrates yet another data flow diagram provided in accordance with some embodiments;

FIG. 4 illustrates a schematic diagram of yet another data real-time checking method provided in accordance with some embodiments;

FIG. 5 illustrates a schematic diagram of yet another data real-time checking method provided in accordance with some embodiments;

FIG. 6 illustrates a schematic diagram of yet another data real-time checking method provided in accordance with some embodiments;

FIG. 7 illustrates a schematic structural diagram of a data real-time checking device provided in accordance with some embodiments.

Detailed Description

To make the objects and embodiments of the present application clearer, exemplary embodiments of the present application are described in detail below with reference to the accompanying drawings. The described exemplary embodiments are only some, but not all, of the embodiments of the present application.

It should be noted that the brief description of the terminology in the present application is for the purpose of facilitating understanding of the embodiments described below only and is not intended to limit the embodiments of the present application. Unless otherwise indicated, these terms should be construed in their ordinary and customary meaning.

The terms first, second, third and the like in the description, in the claims, and in the above-described figures are used to distinguish between similar or like objects or entities and not necessarily to describe a particular sequential or chronological order, unless otherwise indicated. It is to be understood that the terms so used are interchangeable under appropriate circumstances.

The terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to all elements explicitly listed, but may include other elements not expressly listed or inherent to such product or apparatus.

In order to solve the technical problems described above, an embodiment of the present application provides a data real-time checking method, in which part or all of the data stream of a source database is written, as a target data stream, into storage data units of different links, and a check database uses the digest values corresponding to the target data stream written into the storage data units of different links to check whether the target data stream was successfully written into every storage data unit, so that the target data stream written into the storage data units of multiple links is checked in real time.

Fig. 1 illustrates a flow chart of a data real-time checking method provided in accordance with some embodiments. The method includes S100 to S400.

S100, acquiring a data stream from a source database.

In the embodiment of the application, new data is continuously added to the source database and is continuously transmitted to the storage data units in the form of a data stream, so that the terminal device can implement services through the storage data units. Using a data stream ensures that the storage data units obtain the data in real time.

S200, writing the target data stream into storage data units of different links, wherein the target data stream is part or all of the data stream.

In the embodiment of the application, according to the actual service requirement, the data stream acquired from the source database may be partially written into the storage data unit or may be completely written into the storage data unit.

In the embodiment of the present application, a storage data unit may be any unit with a storage function, such as a message queue or a service database. By way of example, the message queue may be a Kafka (distributed message queue) message queue, and the service database may be a NoSQL (non-relational) database, an HBase (distributed storage system) database, an ES (Elasticsearch, a highly scalable open-source full-text search and analytics engine) database, or the like.

It should be noted that, the specific type of the service database may be set according to the service implemented by the terminal device, and the specific type of the message queue and the service database is not limited herein.

In some embodiments, the target data stream may fail to be written into a storage data unit, and embodiments of the present application can detect this error condition.

In some embodiments, the step of writing the target data stream into the stored data units of the different links includes: writing the target data stream obtained from the source database into a message queue; and sending the target data streams in the message queue to different service databases respectively.

By way of example, fig. 2 illustrates a target data stream flow diagram provided in accordance with some embodiments. The storage data units include a Kafka message queue, a NoSQL database, an HBase database, and an ES database. The source database sends the target data stream to the Kafka message queue, which sends the target data stream to the NoSQL database, the HBase database, and the ES database, respectively.

In the above example, the target data stream may be sent to the Kafka message queue, the NoSQL database, the HBase database, and the ES database, i.e., there are four links, one for each of these storage data units.

Of course, in other examples, the target data stream may be sent to the Kafka message queue, a first NoSQL database, the HBase database, the ES database, and a second NoSQL database, i.e., there are five links.

It should be noted that, the specific setting of the links is determined according to the actual needs, and the embodiment of the application does not limit the total number of links.
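The fan-out arrangement of fig. 2 can be sketched with in-memory stand-ins. This is only an illustration under stated assumptions: `StorageUnit` and `fan_out` are hypothetical names, and a real deployment would use actual message queue and database clients rather than Python containers.

```python
from collections import deque


class StorageUnit:
    """In-memory stand-in for a message queue or service database."""

    def __init__(self, unit_id):
        self.unit_id = unit_id
        self.records = deque()

    def write(self, record):
        self.records.append(record)


def fan_out(target_stream, queue, databases):
    """Write the stream to the queue, then forward each record to every database."""
    for record in target_stream:
        queue.write(record)
    for record in queue.records:
        for db in databases:
            db.write(record)
    return [queue, *databases]  # one link per storage data unit


links = fan_out(
    [{"pk": "A"}],
    StorageUnit("kafka"),
    [StorageUnit("nosql"), StorageUnit("hbase"), StorageUnit("es")],
)
```

With one queue and three databases the returned list has four entries, matching the four links counted in the example.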

In other embodiments, because different services of the terminal device require data to be processed in different ways, the target data stream may not be written directly from the message queue connected to the source database into the service database corresponding to the service.

In one example, the storage data units include a first message queue, a first service database, a second message queue, and a second service database. The step of writing the target data stream into the storage data units of different links comprises: transmitting the target data stream to the first message queue; writing the target data stream in the first message queue into the first service database; transmitting the target data stream in the first service database to the second message queue; and writing the target data stream in the second message queue into the second service database.

For example, fig. 3 schematically illustrates yet another target data stream flow diagram provided in accordance with some embodiments. The source database sends the target data stream to a first Kafka message queue, which sends the target data stream to the HBase database; the HBase database then sends the target data stream to a second Kafka message queue, which sends the target data stream to the ES database. In this example, the target data stream may be stored in the first Kafka message queue, the HBase database, the second Kafka message queue, and the ES database, i.e., there are four links, one for each of these storage data units.

In another example, the storage data units include a first message queue, a first service database, a second message queue, a second service database, a third message queue, and a third service database. The step of writing the target data stream into the storage data units of different links comprises: transmitting the target data stream to the first message queue; writing the target data stream in the first message queue into the first service database; transmitting the target data stream in the first service database to the second message queue; writing the target data stream in the second message queue into the second service database; transmitting the target data stream in the second service database to the third message queue; and writing the target data stream in the third message queue into the third service database.

In the embodiment of the application, the arrangement modes of the message queues and the service databases are related to service requirements, the arrangement modes and the quantity are not limited, and the arrangement modes and the quantity are determined according to actual requirements.
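The chained arrangement, in which each storage data unit receives the target data stream from the unit before it, can be sketched as follows. The function `write_chain` and the unit identifiers are hypothetical, and plain lists stand in for real queues and databases.

```python
def write_chain(target_stream, unit_ids):
    """Pass the stream through the units in order; each unit reads from the previous one."""
    stored = {}
    upstream = list(target_stream)
    for unit_id in unit_ids:
        stored[unit_id] = list(upstream)  # write into this unit
        upstream = stored[unit_id]        # the next unit reads from this one
    return stored


# Hypothetical identifiers for a queue -> database -> queue -> database chain.
chain = ["kafka-1", "hbase", "kafka-2", "es"]
stored = write_chain([{"pk": "A"}], chain)
total_links = len(stored)
```

The total number of links equals the number of units in the chain, so extending the chain to three queues and three databases would give six links.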

The embodiment of the application is not limited to the above-described ways of transferring the target data stream between different storage data units; other arrangements that do not depart from the subject matter of the application also fall within its protection scope.

S300, determining the digest values corresponding to the target data stream written into the storage data units of different links.

In the embodiment of the application, if the content of the target data stream differs, the calculated digest value also differs; if the content of the target data stream is the same, the calculated digest value is the same.

When the target data stream has been written into the storage data units of different links, a digest value can be calculated from the target data stream in each unit, so there may be multiple digest values. Of course, if the target data stream fails to be written into the storage data unit of a certain link, no digest value can be calculated for that storage data unit.
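A minimal sketch of per-unit digest calculation is given below. It assumes that a failed write is represented by `None` and uses MD5 because the description later mentions an md5 value; the application does not limit the digest algorithm, and the name `digest_per_unit` is hypothetical.

```python
import hashlib


def digest_per_unit(written):
    """Map unit identifier -> MD5 digest of what was written; skip failed units.

    written: mapping of unit identifier -> payload string, or None if the write failed.
    """
    return {
        unit: hashlib.md5(payload.encode("utf-8")).hexdigest()
        for unit, payload in written.items()
        if payload is not None
    }


# The ES write failed here, so only two digest values are produced.
digests = digest_per_unit({"kafka": "A|B|C", "hbase": "A|B|C", "es": None})
```

Identical payloads yield identical digest values, and the unit whose write failed contributes no digest value at all, which is what allows the later count comparison to detect it.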

In some embodiments, the step of calculating the digest value from the target data stream is performed by a predetermined program in the electronic device.

S400, transmitting all the digest values to a check database so that the check database executes a data checking method, wherein the data checking method comprises: storing the digest values corresponding to the target data stream; comparing whether the digest values are the same, and counting the number of digest values; and if the digest values are the same and the number of digest values is equal to the total number of links, determining that the target data stream was successfully written into all of the storage data units.

In the embodiment of the application, all the digest values corresponding to the storage data units are sent to the check database. If each of a plurality of storage data units has a corresponding digest value, all of those digest values are sent to the check database.

In the embodiment of the application, the check database is provided with the data checking method.

In the embodiment of the application, the digest values are compared to determine whether they are the same. If the digest values are the same, the target data streams written into the storage data units of different links are the same. If the digest values are different, the target data streams written into the storage data units of different links are different. In that case, the step of obtaining the data stream from the source database can be re-executed from the beginning and the target data stream written into the storage data units of different links again; after the rewrite, the check of whether the target data stream was successfully written into all the storage data units can be performed again.

In the embodiment of the application, the number of digest values is counted and compared with the total number of links.

When the number of digest values is equal to the total number of links, i.e., the number of storage data units, it is determined that the target data stream was successfully written into all of the storage data units. If the number of digest values is less than the total number of links, it is determined that there are storage data units into which the target data stream was not written; in that case, the step of obtaining the data stream from the source database may be re-executed and the target data stream written into the storage data units of different links again.

The embodiment of the application does not limit the order in which the step of comparing the number of digest values with the total number of links and the step of comparing whether the digest values are the same are executed.

In some embodiments, the number of digest values may be counted first and compared with the total number of links, and the step of comparing whether the digest values are the same is performed when the counted number is equal to the total number of links. Alternatively, the step of comparing whether the digest values are the same may be performed first, and when all the digest values are the same, the number of digest values is counted and compared with the total number of links.
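The two comparisons of the data checking method can be sketched as a single function. This is an illustrative sketch, not the check database's actual implementation: the name `check_digests` and the returned reason strings are assumptions, and the count is checked first only because the order is not limited.

```python
def check_digests(digests, total_links):
    """Return (ok, reason) for the digest values received for one target data stream."""
    if len(digests) < total_links:
        return False, "missing"   # some storage data unit reported no digest value
    if len(set(digests)) != 1:
        return False, "mismatch"  # the units hold different content
    return True, "ok"


all_written = check_digests(["d1", "d1", "d1", "d1"], 4)
one_missing = check_digests(["d1", "d1", "d1"], 4)
one_differs = check_digests(["d1", "d1", "d1", "d2"], 4)
```

Either failure reason would lead back to re-acquiring the data stream from the source database and rewriting it.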

In the embodiment of the application, to determine more accurately whether the target data stream was successfully written into the storage data units of different links, the check is performed in two steps, using the primary key information and the digest values, which greatly improves the accuracy of the data check.

In some embodiments, FIG. 4 illustrates a flow chart of yet another data real-time checking method provided in accordance with some embodiments. The step in which the check database preliminarily detects, according to the primary key information, whether the target data stream was successfully written into the storage data units of different links comprises:

S500, judging whether the primary key information sent by the storage data units of different links is the same, counting the number of pieces of primary key information, and comparing that number with the total number of links.

S600, if the primary key information sent by the storage data units of different links is the same and the number of pieces of primary key information is equal to the total number of links, preliminarily determining that the target data stream was successfully written into the storage data units of different links, after which the check database continues to execute the data checking method.

S700, if the primary key information sent by the storage data units of different links is different, or the number of pieces of received primary key information is smaller than the total number of links, re-executing the step of obtaining the data stream from the source database.

In one example, as shown in FIG. 3, the primary key information sent by the first Kafka message queue of the first link is Pri[A], the primary key information sent by the HBase database of the second link is Pri[A], the primary key information sent by the second Kafka message queue of the third link is Pri[A], and the primary key information sent by the ES database of the fourth link is Pri[A]. Since there are four links in total and the primary key information received from the first to fourth links is Pri[A] in every case, it is preliminarily determined that the target data stream was successfully written into the storage data units of different links.

In another example, the primary key information sent by the first Kafka message queue of the first link is Pri[A], the primary key information sent by the HBase database of the second link is Pri[A], the primary key information sent by the second Kafka message queue of the third link is Pri[A], and the primary key information sent by the ES database of the fourth link is Pri[B]. Since the primary key information of the first three links is Pri[A] and the primary key information of the fourth link is Pri[B], it is determined that the primary key information sent by the storage data units of different links is different.

If the primary key information received by the check database from the storage data units is different, this indicates that the target data streams written into the storage data units may not be the same and that an error may have occurred.

In yet another example, the check database receives Pri[A] as the primary key information sent by the first Kafka message queue of the first link, Pri[A] as the primary key information sent by the HBase database of the second link, and Pri[A] as the primary key information sent by the second Kafka message queue of the third link, but does not receive the primary key information sent by the ES database of the fourth link. Since the number of pieces of primary key information is three and the total number of links is four, it is determined that the check database has not received the primary key information from all of the storage data units.

If the check database does not receive the primary key information from every storage data unit, this indicates that the target data stream may not have been successfully written into all of the storage data units.

In the embodiment of the application, if the check database receives the primary key information sent by all the storage data units, the target data stream has been written into every storage data unit; if, in addition, the received primary key information is the same, it is preliminarily judged that the target data streams written into the storage data units are likely the same. When the primary key information sent by the storage data units of different links is different, or the number of pieces of received primary key information is smaller than the total number of links, there are storage data units into which the target data stream was not successfully written; in that case, the step of obtaining the data stream from the source database is re-executed and the target data stream is written into the storage data units again.
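The preliminary detection of S500 to S700 can be sketched as follows, using the three examples above as inputs. The function name `preliminary_check` is hypothetical, and primary key information is represented as plain strings for illustration.

```python
def preliminary_check(primary_keys, total_links):
    """Return True only if every link reported a key and all keys are identical."""
    all_received = len(primary_keys) == total_links  # S500: count against total links
    all_same = len(set(primary_keys)) == 1           # S500: compare the keys
    return all_received and all_same                 # S600 if True, S700 otherwise


same_keys = preliminary_check(["A", "A", "A", "A"], 4)
different_key = preliminary_check(["A", "A", "A", "B"], 4)
missing_key = preliminary_check(["A", "A", "A"], 4)
```

A `True` result corresponds to continuing with the data checking method, and a `False` result corresponds to re-executing the step of obtaining the data stream from the source database.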

In other embodiments, the step of preliminarily detecting, according to the primary key information, whether the target data stream was successfully written into the storage data units of different links is performed before the step of determining the digest values corresponding to the target data stream written into the storage data units of different links.

If it is preliminarily detected, according to the primary key information, that the target data stream was not successfully written into the storage data units of different links, the step of determining the digest values corresponding to the target data stream written into the storage data units of different links is not executed. If it is preliminarily detected, according to the primary key information, that the target data stream was successfully written into all the storage data units of different links, the method proceeds to determine the digest values corresponding to the target data stream written into the storage data units of different links.

In some embodiments, fig. 5 illustrates a flow chart of yet another data real-time checking method provided in accordance with some embodiments. The target data stream further includes field information; the step of determining a digest value corresponding to the target data stream includes:

S301, detecting whether the field information of the target data stream written into the storage data units of different links has intersection content. The intersection content refers to the content that is the same across the field information.

In one example, as shown in fig. 3, the field information sent by the first Kafka message queue of the first link is Attr[B\C\D\E\F], the field information sent by the HBase database of the second link is Attr[B\C\D\E], the field information sent by the second Kafka message queue of the third link is Attr[B\C\D\E], and the field information sent by the ES database of the fourth link is Attr[B\C\E]. The field information of the target data stream written into the storage data units of different links therefore has intersection content, namely Attr[B\C\E].

In another example, as shown in fig. 3, the field information sent by the first Kafka message queue of the first link is Attr[B\C\D\E\F], the field information sent by the HBase database of the second link is Attr[B\C\D\E], the field information sent by the second Kafka message queue of the third link is Attr[B\C\D\E], and the field information sent by the ES database of the fourth link is Attr[G]. It is therefore determined that the field information of the target data stream written into the storage data units of different links has no intersection content.
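The intersection detection of S301 amounts to a set intersection over the field information of every link. The sketch below reproduces the two examples above; the function name `intersection_content` is hypothetical, and sorting the result is an added assumption so that every link derives the fields in the same order.

```python
def intersection_content(field_sets):
    """Return the fields common to every link, sorted for a stable order."""
    common = set(field_sets[0])
    for fields in field_sets[1:]:
        common &= set(fields)
    return sorted(common)


first = intersection_content(
    [["B", "C", "D", "E", "F"], ["B", "C", "D", "E"], ["B", "C", "D", "E"], ["B", "C", "E"]]
)
second = intersection_content(
    [["B", "C", "D", "E", "F"], ["B", "C", "D", "E"], ["B", "C", "D", "E"], ["G"]]
)
```

The first call yields the fields B, C, and E, and the second yields an empty list, which corresponds to the case in which no intersection content exists.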

S302, if the intersection content exists, determining a digest value corresponding to the target data stream according to the primary key information and the intersection content.

For example, the primary key information of the target data stream written into the storage data units of the four links is Pri [ A ] and the intersection content is Attr [ B\C\E ]. The primary key information and the intersection content are combined to obtain the data to be processed, A\B\C\E, and a digest value (an MD5 value) is calculated from the data to be processed.

The method for calculating the digest value is not limited in the embodiment of the application; digest values calculated from different contents are different.

It can be understood that, in the embodiment of the present application, the more data the data to be processed contains, the more accurate the verification result is, that is, the more accurately it can be judged whether the target data stream was successfully written into the storage data units. Therefore, the embodiment of the application obtains the intersection content of the field information of the target data stream written into the storage data units of different links, and if intersection content exists, calculates the digest value from the intersection content together with the primary key information, which improves the accuracy of the checking result.

S303, if the intersection content does not exist, determining a digest value corresponding to the target data stream according to the primary key information.

For example, the primary key information of the target data stream written into the storage data units of the four links is Pri [ A ], and no intersection content exists. The data to be processed, A, is determined from the primary key information Pri [ A ] alone, and a digest value is calculated from the data to be processed.
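Steps S301 to S303 can be illustrated with a minimal Python sketch. It is not the implementation of the application: the function names, the backslash separator used to combine the primary key with the intersection content, the UTF-8 encoding, and the choice to keep fields in the order of the first link are all assumptions made for illustration. MD5 is used only because the example above mentions an MD5 value.

```python
import hashlib


def intersection_content(fields_per_link):
    """S301: return the fields present on every link, in the first link's order."""
    if not fields_per_link:
        return []
    common = set(fields_per_link[0]).intersection(*fields_per_link[1:])
    return [field for field in fields_per_link[0] if field in common]


def data_to_process(primary_key, fields_per_link):
    """S302/S303: primary key plus intersection content, or the primary key alone."""
    # Joining with a backslash mirrors the A\B\C\E notation; the separator is an assumption.
    return "\\".join([primary_key] + intersection_content(fields_per_link))


def digest_value(primary_key, fields_per_link):
    """MD5 hex digest of the data to be processed (UTF-8 encoding is an assumption)."""
    payload = data_to_process(primary_key, fields_per_link)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


# Field information of the four links from the examples above.
links = [list("BCDEF"), list("BCDE"), list("BCDE"), list("BCE")]
print(data_to_process("A", links))                # A\B\C\E
print(data_to_process("A", links[:3] + [["G"]]))  # A
```

When the fourth link reports only Attr [ G ], the intersection is empty and the sketch falls back to the primary key alone, which corresponds to step S303.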

In other embodiments, the step of determining a digest value corresponding to the target data stream includes: directly determining the digest value corresponding to the target data stream according to the primary key information. In this embodiment of the application, the step of detecting whether the field information sent by the storage data units of different links has intersection content is not executed, and the primary key information alone is used to determine the digest value corresponding to the target data stream.

In some embodiments, fig. 6 illustrates a flow chart of yet another data real-time reconciliation method provided in accordance with some embodiments. The method further comprises the steps of:

S800, generating a unit identifier corresponding to each storage data unit. In the embodiment of the application, the unit identifier uniquely identifies a storage data unit; unit identifiers and storage data units correspond one to one.

In some embodiments, the unit identifier may include the storage data unit name and the link order. Illustratively, in Attr [ es,4 ], es is the storage data unit name and 4 indicates the fourth link, that is, the link order is 4.

S900, sending the unit identifiers to the collation database, so that if the number of digest values is smaller than the total number of links, the collation database determines, through the unit identifiers, the storage data units into which the target data stream was not successfully written.

In the embodiment of the application, the unit identifiers are sent to the collation database to make it easier to find the storage data units into which the target data stream was not successfully written. The storage data unit corresponding to a missing digest value is found according to the unit identifiers. Illustratively, the unit identifiers received by the collation database include Attr [ kafka,1 ], Attr [ Hbase,2 ] and Attr [ kafka2,3 ], but the total number of links is actually 4. The unit identifier of the storage data unit with link order 4 is therefore absent, and it is determined that the target data stream was not successfully written into the storage data unit with link order 4.
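The lookup described above can be sketched as follows. This is a minimal illustration, assuming that each unit identifier is represented as a (name, link order) pair and that link orders run from 1 to the total number of links; the function name `missing_link_orders` is hypothetical.

```python
def missing_link_orders(received_identifiers, total_links):
    """Return the link orders for which no unit identifier was received."""
    seen = {link_order for _name, link_order in received_identifiers}
    return [order for order in range(1, total_links + 1) if order not in seen]


# Unit identifiers from the example above; the total number of links is 4.
received = [("kafka", 1), ("Hbase", 2), ("kafka2", 3)]
print(missing_link_orders(received, 4))  # [4]
```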

In some embodiments, after performing the step of determining, through the unit identifier, the storage data unit into which the target data stream was not successfully written, the method further comprises:

deleting the digest values and the unit identifiers stored in the collation database, and repeating the step of acquiring the data stream from the source database.

In the embodiment of the application, because there is a storage data unit into which the target data stream was not successfully written, the content used for verification in the collation database is deleted, including the stored digest values and unit identifiers. The step of acquiring the data stream from the source database is then executed again, after which the target data stream is written into the storage data units of different links.

In some embodiments, if it is still determined after the repetition that there are storage data units into which the target data stream was not successfully written, the process may be repeated again until the number of repetitions reaches a preset number. For example, the preset number may be 3; once the number of repetitions reaches 3, the process is not repeated again.
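The bounded repetition can be sketched as a simple loop. The two callables are hypothetical stand-ins: `write_and_collate` is assumed to run one pass (acquire, write, collate) and return the storage data units that were not written successfully, and `clear_collation_records` is assumed to delete the stored digest values and unit identifiers.

```python
def write_with_repeats(write_and_collate, clear_collation_records, preset_number=3):
    """Run one pass, then repeat on failure until the preset number is reached.

    Returns the units that still failed (empty on success) and the number
    of repetitions that were performed.
    """
    failed_units = write_and_collate()
    repetitions = 0
    while failed_units and repetitions < preset_number:
        clear_collation_records()           # delete stored digests and identifiers
        failed_units = write_and_collate()  # acquire, write and collate again
        repetitions += 1
    return failed_units, repetitions
```

With a preset number of 3, a write that never succeeds is attempted four times in total: the initial pass plus three repetitions.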

In some embodiments, when it is determined that there is a storage data unit into which the target data stream was not successfully written, a prompt report may be sent to the terminal of the relevant person, so that the relevant person, after reading the prompt report on the terminal, can check the whole process of writing the target data stream into the storage data units, locate the specific problem, and correct it.

The data real-time checking method of the embodiment of the application can be implemented based on Hadoop (a framework for running big data software systems) components.

In the embodiment of the application, it can be checked whether the target data stream was successfully written into the storage data units of a plurality of links, the differences between databases are shielded, and data stream problems are found in time. Because the data stream is written into the storage data units in real time, the service response time of the terminal equipment is shorter, which helps the enterprise respond to service demands more quickly. Real-time data processing and checking can automatically detect abnormalities and errors in the data stream, thereby reducing data quality problems. Real-time data acquisition, processing and checking help avoid data problems such as data delay and data errors, and effectively improve data quality and accuracy. The collation database can support a large number of data streams, and its processing capacity can be easily extended to accommodate growth in the amount of data delivered from the source database.

The above embodiment provides a method for checking data in real time, in which part or all of the data stream of the source database is written, as the target data stream, into the storage data units of different links, and the collation database uses the digest values corresponding to the target data stream written into the storage data units of different links to check whether the target data stream was successfully written into all of those storage data units, thereby checking in real time the target data streams written into the storage data units of multiple links. The method comprises the following steps: obtaining a data stream from a source database; writing a target data stream into storage data units of different links, wherein the target data stream is part or all of the data stream; determining digest values corresponding to the target data stream written into the storage data units of different links; and transmitting all the digest values to a collation database to cause the collation database to execute a collation data method, wherein the collation data method comprises: storing the digest values corresponding to the target data stream; comparing whether the digest values are the same, counting the number of digest values, and comparing the number of digest values with the total number of links; and if all the digest values are the same and the number of digest values is equal to the total number of links, determining that the target data stream was successfully written into all the storage data units.
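The collation data method summarized above reduces to two comparisons, which can be sketched as follows. The function name `collate` and the representation of digest values as a list of hex strings are assumptions for illustration, not details given by the application.

```python
def collate(digest_values, total_links):
    """Return True only if every link reported a digest and all digests agree."""
    all_same = len(set(digest_values)) == 1
    count_matches = len(digest_values) == total_links
    return all_same and count_matches


print(collate(["d1", "d1", "d1", "d1"], 4))  # True: same digest from all four links
print(collate(["d1", "d1", "d1"], 4))        # False: one link sent no digest
print(collate(["d1", "d1", "d2", "d1"], 4))  # False: one link wrote different content
```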

Further, as a specific implementation of the methods of fig. 1 and fig. 4-6, an embodiment of the present application provides a structure schematic diagram of a data real-time checking device, as shown in fig. 7, where the device includes:

an obtaining unit 701, configured to obtain a data stream from a source database;

a writing unit 702, configured to write a target data stream into storage data units of different links, where the target data stream is a part or all of the data streams;

a determining unit 703, configured to determine digest values corresponding to the target data streams written into the storage data units of different links;

a transmitting unit 704, configured to transmit all the digest values to a collation database, so that the collation database performs a collation data method, wherein the collation data method comprises: storing the digest values corresponding to the target data stream; comparing whether the digest values are the same, counting the number of digest values, and comparing the number of digest values with the total number of links; and if all the digest values are the same and the number of digest values is equal to the total number of links, determining that the target data stream was successfully written into all the storage data units.

In a specific application scenario, the apparatus further includes:

a detection unit, configured to detect whether the field information of the target data stream written into the storage data units of different links has intersection content;

a first determining unit configured to determine a digest value corresponding to the target data stream based on the primary key information and the intersection content if the intersection content exists;

and a second determining unit configured to determine a digest value corresponding to the target data stream according to the primary key information if the intersection content does not exist.

It should be noted that, for other descriptions of the functional units of the data real-time checking device provided in this embodiment, reference may be made to the corresponding descriptions of fig. 1 and fig. 4-6, which are not repeated here.

In a specific application scenario, the apparatus further includes:

a generating unit, configured to generate a unit identifier corresponding to the stored data unit;

and a comparison unit, configured to send the unit identifiers to the collation database, so that if the number of digest values is smaller than the total number of links, the collation database determines, through the unit identifiers, the storage data units into which the target data stream was not successfully written.

Based on the above methods shown in fig. 1 and fig. 4-6, correspondingly, the embodiment of the present application further provides a storage medium, on which a computer program is stored, where the program is executed by a processor to implement the data real-time collation method shown in fig. 1 and fig. 4-6.

Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes several instructions for causing an electronic device (such as a personal computer, a server, or a network device) to execute the method described in the respective implementation scenarios of the present application.

Based on the methods shown in fig. 1 and fig. 4-6 and the virtual device embodiment shown in fig. 7, in order to achieve the above objects, the embodiment of the present application further provides an entity device for checking data in real time, which may specifically be a computer, a smart phone, a tablet computer, a smart watch, a server, or a network device, where the entity device includes a storage medium and a processor; a storage medium storing a computer program; a processor for executing a computer program to implement the data real-time collation method as described above and shown in fig. 1, 4-6.

Optionally, the entity device may further include a user interface, a network interface, a camera, Radio Frequency (RF) circuitry, sensors, audio circuitry, a Wi-Fi module, and the like. The user interface may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and may optionally also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface, a wireless interface (e.g., a Wi-Fi interface), and the like.

It will be appreciated by those skilled in the art that the structure of the electronic device provided in this embodiment does not constitute a limitation on the entity device, which may include more or fewer components, combine certain components, or arrange the components differently.

The storage medium may further include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the electronic device and supports the execution of information processing programs and other software and/or programs. The network communication module is used to implement communication among the components within the storage medium, as well as communication with other hardware and software in the entity device.

From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by means of software plus necessary general hardware platforms, or may be implemented by hardware.

Those skilled in the art will appreciate that the drawing is merely a schematic illustration of one preferred implementation scenario, and that the units or processes in the drawing are not necessarily required to practice the application. Those skilled in the art will also appreciate that the units of an apparatus in an implementation scenario may be distributed in the apparatus as described for that implementation scenario, or may, with corresponding changes, be located in one or more apparatuses different from the present implementation scenario. The units of the implementation scenario may be combined into one unit, or may be further split into a plurality of sub-units.

The above sequence numbers of the application are for description only and do not represent the merits of the implementation scenarios. The foregoing disclosure merely illustrates some embodiments of the application, and the application is not limited thereto; modifications may be made by those skilled in the art without departing from the scope of the application.

Claims

1. A method for data real-time collation, comprising:

obtaining a data stream from a source database;

writing a target data stream into storage data units of different links, wherein the target data stream is part or all of the data stream;

determining digest values corresponding to the target data stream written into the storage data units of different links;

transmitting all the digest values to a collation database to cause the collation database to execute a collation data method, wherein the collation data method comprises: storing the digest values corresponding to the target data stream; comparing whether the digest values are the same, counting the number of digest values, and comparing the number of digest values with the total number of links; and if all the digest values are the same and the number of digest values is equal to the total number of links, determining that the target data stream was successfully written into all the storage data units.

2. The method of claim 1, wherein the target data stream includes primary key information; the method further comprises the steps of:

3. The method of claim 2, wherein the step of the collation database performing preliminary detection, based on the primary key information, of whether the target data stream was successfully written into all the storage data units of different links comprises: if the primary key information in the data streams sent by the storage data units of different links is the same, preliminarily determining that the target data stream was successfully written into all the storage data units of different links.

4. The method of claim 2, wherein the step of the collation database performing preliminary detection, based on the primary key information, of whether the target data stream was successfully written into all the storage data units of different links further comprises:

5. The method of claim 1, wherein the target data stream further comprises field information; the step of determining the digest value corresponding to the target data stream written into the storage data unit of the different link includes:

6. The method as recited in claim 1, further comprising:

generating a unit identifier corresponding to the stored data unit;

7. The method of claim 6, further comprising, after performing the step of determining, through the unit identifier, the storage data unit into which the target data stream was not successfully written:

8. A data real-time collation apparatus, comprising:

an acquisition unit for acquiring a data stream from a source database;

a writing unit for writing the target data stream into the storage data units of different links, wherein the target data stream is part or all of the data stream;

a transmitting unit configured to transmit all the digest values to a collation database so that the collation database performs a collation data method, wherein the collation data method comprises: storing the digest values corresponding to the target data stream; comparing whether the digest values are the same, counting the number of digest values, and comparing the number of digest values with the total number of links; and if all the digest values are the same and the number of digest values is equal to the total number of links, determining that the target data stream was successfully written into all the storage data units.

9. An electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the data stream real-time collation method according to any one of claims 1 to 7.

10. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the data stream real-time collation method according to any one of claims 1 to 7.