CN108763291B - Data management method and device and electronic equipment - Google Patents

Data management method and device and electronic equipment

Info

Publication number
CN108763291B
Authority
CN
China
Prior art keywords
data
processing node
information
original
marked
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810339347.7A
Other languages
Chinese (zh)
Other versions
CN108763291A (en)
Inventor
韩红根
张超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201810339347.7A
Publication of CN108763291A
Application granted
Publication of CN108763291B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Embodiments of the invention provide a data management method, a data management device and electronic equipment. The method comprises: generating marking data upon detecting that a data processing system acquires original data, adding the marking data to the data processing system, and recording a first data volume of the marking data; acquiring data information recorded by the data processing system; and analyzing the data information and the first data volume to obtain consistency information and time delay information of the marking data in the data processing system. Because the marking data and the original data pass through the processing nodes in the data processing system together, the consistency information and the time delay information of the marking data can be determined as the consistency information and the time delay information of the original data. This provides an analysis basis for analyzing the consistency problems and time delay problems of the processing nodes when the original data is processed, so that service personnel can manage those problems based on the analysis basis.

Description

Data management method and device and electronic equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data management method and apparatus, and an electronic device.
Background
In the information age, data has become an important resource as the basis of information. More and more practitioners recognize that data quality is an important factor in determining the quality of resources, and that data quality in turn depends on the quality of data management. Moreover, with the development of big data technology, increasingly rich data brings more challenges to data management. To manage such data, technicians need to manage many aspects, including data collection, data forwarding, data storage, and data analysis.
However, the inventor finds that the prior art has at least the following problems in the process of implementing the invention:
When data is processed, it passes through a plurality of processing nodes. At each processing node, processing may be delayed, or data may be processed repeatedly because the processing node restarts. The consistency and the time delay of data processing therefore need to be managed, so that problems in the processing nodes can be found in time.
Disclosure of Invention
The embodiment of the invention aims to provide a data management method, a data management device and electronic equipment, so as to provide an analysis basis for analyzing the consistency problem and the time delay problem of a processing node during data processing, and enable business personnel to manage the consistency problem and the time delay problem of the processing node based on the analysis basis. The specific technical scheme is as follows:
in one aspect of the embodiments of the present invention, an embodiment of the present invention provides a data management method, including:
generating marking data upon detecting that the data processing system acquires original data, wherein the data generation time of the marking data is recorded in the marking data;
adding the marking data into a data processing system, and recording a first data volume of the marking data;
acquiring data information of the marking data recorded by the data processing system, wherein the data information comprises a second data volume of all the marking data received, the data receiving time at which each piece of marking data is received, and the data generation time of the marking data;
and determining the consistency information and the time delay information of the original data according to the data generation time, the data receiving time, the first data volume and the second data volume.
In another aspect of the present invention, an embodiment of the present invention further provides a data processing method applied to a data processing system, where the method includes:
acquiring mark data and original data;
sending the marked data and the original data to a processing node;
and when the processing node processes the original data, recording the second data volume of all the received marked data, the data receiving time of each marked data and the data generating time of the marked data as data information.
In another aspect of the present invention, an embodiment of the present invention further provides a data management apparatus, including:
the marking data generation module is used for generating marking data upon detecting that the data processing system acquires original data, wherein the data generation time of the marking data is recorded in the marking data;
the recording module is used for adding the marking data into the data processing system and recording a first data volume of the marking data;
the data information acquisition module is used for acquiring the data information of the marking data recorded by the data processing system, wherein the data information comprises a second data volume of all the marking data received, the data receiving time at which each piece of marking data is received, and the data generation time of the marking data;
and the calculation module is used for determining the consistency information and the time delay information of the original data according to the data generation time, the data receiving time, the first data volume and the second data volume.
In another aspect of the present invention, an embodiment of the present invention further provides a data processing system, including:
the acquisition node is used for acquiring the marking data and the original data;
the sending node is used for sending the marked data and the original data to the processing node;
and the processing node is used for recording the second data volume of all the received marked data, the data receiving time of each marked data and the data generating time of the marked data as data information when the original data are processed.
In another aspect of the present invention, an embodiment of the present invention further provides an electronic device, which includes a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and a processor for implementing any one of the above data management methods when executing the program stored in the memory.
In another aspect of the present invention, an embodiment of the present invention further provides an electronic device, which includes a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and a processor for implementing any one of the above data processing methods when executing the program stored in the memory.
In yet another aspect of the present invention, the present invention also provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to execute any one of the above-mentioned data management methods.
In yet another aspect of the present invention, the present invention also provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to execute any one of the data processing methods described above.
In yet another aspect of the present invention, the present invention further provides a computer program product containing instructions, which when run on a computer, causes the computer to execute any one of the above-mentioned data management methods.
In yet another aspect of the present invention, the present invention further provides a computer program product containing instructions, which when run on a computer, causes the computer to execute any one of the data processing methods described above.
In another aspect of the present invention, an embodiment of the present invention further provides a data management system, where the data management system includes: a data management apparatus and a data processing system,
the data management device is used for generating marking data upon detecting that the data processing system acquires original data, adding the marking data into the data processing system, and recording a first data volume of the marking data;
the data processing system is used for acquiring the marked data and the original data and sending the marked data and the original data to the processing node;
the data processing system is also used for recording the second data volume of all the received marked data, the data receiving time of each marked data and the data generating time of the marked data as data information when the processing node processes the original data;
and the data management device is also used for acquiring the data information of the marked data recorded by the data processing system and determining the consistency information and the time delay information of the original data according to the data generation time, the data receiving time, the first data volume and the second data volume.
According to the data management method, the data management device and the electronic equipment provided by the embodiments of the present invention, marking data is generated upon detecting that the data processing system acquires original data, the generated marking data is added into the data processing system, and a first data volume of the marking data is recorded. The second data volume of all the marking data received, the data receiving time at which each piece of marking data is received, and the data generation time of the marking data, as recorded in the data processing system, are then acquired. Finally, the data generation time, the data receiving time, the first data volume and the second data volume are analyzed to obtain consistency information and time delay information of the marking data in the data processing system. Because the marking data and the original data pass through the processing nodes in the data processing system together, the consistency information and the time delay information of the marking data can be determined as those of the original data. This provides an analysis basis for analyzing the consistency problems and time delay problems that exist when the processing nodes process the original data, so that service personnel can manage those problems based on the analysis basis. Of course, it is not necessary for any product or method of practicing the invention to achieve all of the above-described advantages at the same time.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flowchart of a first implementation of a data management method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a first implementation of a data processing method according to an embodiment of the present invention;
FIG. 3 is a block diagram illustrating a first implementation of a data processing system according to an embodiment of the invention;
FIG. 4 is a block diagram illustrating a second implementation of a data processing system according to an embodiment of the invention;
FIG. 5 is a flowchart of a second implementation of a data management method according to an embodiment of the present invention;
FIG. 6 is a flowchart of a second implementation of a data processing method according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating the tag data and the original data in the data queue according to an embodiment of the present invention;
FIG. 8 is a flowchart illustrating a third implementation manner of a data management method according to an embodiment of the present invention;
FIG. 9 is a flowchart of a third implementation of a data processing method according to an embodiment of the present invention;
FIG. 10 is a flowchart of a fourth implementation of a data management method according to an embodiment of the present invention;
FIG. 11 is a flowchart of a fourth implementation of a data processing method according to an embodiment of the present invention;
FIG. 12 is a flowchart of a fifth implementation manner of a data management method according to an embodiment of the present invention;
FIG. 13 is a flowchart of a sixth implementation manner of a data management method according to an embodiment of the present invention;
FIG. 14 is a schematic structural diagram of a data management apparatus according to an embodiment of the present invention;
FIG. 15 is a block diagram illustrating a third exemplary implementation of a data processing system according to the invention;
FIG. 16 is a schematic structural diagram of an electronic device to which a data management method according to an embodiment of the present invention is applied;
FIG. 17 is a schematic structural diagram of an electronic device to which a data processing method according to an embodiment of the present invention is applied;
FIG. 18 is a schematic structural diagram of a data management system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
In order to solve the problems in the prior art, embodiments of the present invention provide a data management method, a data management device and an electronic device for managing the consistency problem and the time delay problem of each processing node in a data processing system, so that problems arising when the processing nodes process data can be found in time and the processing nodes can be optimized accordingly.
A data management method according to an embodiment of the present invention is described first. Fig. 1 is a flowchart of a first implementation manner of the data management method, and in fig. 1, the method may include:
S110, generating marking data upon detecting that the data processing system acquires the original data.
The data generation time of the marking data is recorded in the marking data. The original data is data that has not yet been processed by the data processing system, and the data generation time is the point in time at which the marking data is generated.
In some examples, a data management method according to an embodiment of the present invention may be applied to monitor a data processing system in order to manage consistency and latency of processing raw data by processing nodes in the data processing system.
In particular, the data management method may monitor the data processing system through a monitoring node, where the monitoring node may be a node to which a monitoring program in the prior art is applied.
The data management device may monitor the acquisition node in the data processing system, and may generate the marker data when monitoring the acquisition of the raw data in the data processing system.
Specifically, the data management device may monitor the data processing system through an active monitoring manner, for example, send inquiry information to the data processing system to monitor whether the data processing system receives the raw data.
The data management device may also monitor the data processing system through a passive monitoring method, for example, after the data processing system receives the original data, a prompt message may be sent to the data management device to prompt the data management device to generate the tag data.
In some examples, the marking data may be a packet that is used for marking, for example, the marking data may be watermark data, and each processing node in the data processing system described above does not perform data processing on the marking data after receiving the marking data.
In some examples, when the data processing system acquires raw data in a time window, the data management apparatus may generate the marker data in the time window. The time window is a time period when the data processing system acquires the original data or a time period when the data management device generates the mark data. For example, if the data processing system described above acquires raw data every 5s, the time window is 5 s.
So that the time delay information of the original data in the data processing system can be determined in the subsequent steps, the data management device may, when generating the marking data, add to the marking data the data generation time at which it is generated, so that each processing node in the data processing system can read the data generation time from the marking data.
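As an illustration only, the generation of marking data that carries its own data generation time might be sketched as follows. The names `Marker` and `generate_markers`, and the use of epoch seconds as the time unit, are assumptions made for this sketch and are not specified by the embodiment.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    """Hypothetical marking-data record; it carries no payload to process."""
    event_time: float  # data generation time, in seconds since the epoch


def generate_markers(n, now=None):
    """Create n markers that share one data generation time."""
    stamp = time.time() if now is None else now
    return [Marker(event_time=stamp) for _ in range(n)]


# A fixed timestamp keeps the example deterministic.
markers = generate_markers(3, now=100.0)
```

Making the record immutable reflects the statement that processing nodes do not perform data processing on the marking data.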
S120, adding the marking data into the data processing system, and recording a first data volume of the marking data.
In order to determine the consistency information and the time delay information of the original data in the data processing system by analyzing the consistency information and the time delay information of the tag data in the data processing system, the data management device may add the generated tag data to the data processing system after generating the tag data, so that the data processing system may transmit the tag data and the original data to the processing node together.
In some examples, when adding the marking data to the data processing system, the data management device may also record the amount of all the marking data added to the data processing system with the same data generation time, that is, the first data volume, so that the first data volume can be used for analysis in a subsequent step.
Specifically, when the above-mentioned data management apparatus generates tag data in accordance with a time window, it is possible to add all the tag data generated within the time window to the above-mentioned data processing system and to set the data amount of the tag data generated within the time window as the first data amount.
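The per-window bookkeeping described above could be sketched as below. This is a minimal sketch under assumptions: the 5-second window follows the earlier example, windows are aligned to multiples of the window length, and the names `window_start` and `record_first_amounts` are hypothetical.

```python
from collections import Counter

WINDOW_SECONDS = 5  # assumed window length, following the 5 s example


def window_start(timestamp, window=WINDOW_SECONDS):
    """Return the start of the window that contains the timestamp."""
    return int(timestamp // window) * window


def record_first_amounts(marker_times, window=WINDOW_SECONDS):
    """Count the markers generated in each window (the first data volume)."""
    return Counter(window_start(t, window) for t in marker_times)


# Three markers fall in the window starting at 0 s, two in the one at 5 s.
first_amounts = record_first_amounts([0.5, 1.2, 4.9, 5.0, 9.8])
```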
S130, acquiring the data information of the mark data recorded by the data processing system.
The data information comprises the second data volume of all the marking data received, the data receiving time at which each piece of marking data is received, and the data generation time of the marking data, wherein the data receiving time is the point in time at which the marking data is received.
In some examples, when the data processing system described above receives the marker data transmitted by the data management apparatus, the marker data and the original data may be transmitted to the processing node, the processing node processes the original data, and records a data reception time when the marker data is received and a second data amount when the marker data is received.
Therefore, the data management device can acquire the data information of the recorded mark data from the data processing system.
In some examples, the data management apparatus described above may actively send a data information acquisition request to a processing node in the data processing system to acquire data information from the processing node.
In some examples, a processing node in a data processing system may also actively send recorded data information to the data management device described above.
In some examples, the processing node may also process the raw data according to a time window and record the marked data. The time window may be the same as the time window in which the data processing system acquired the raw data and/or the time window in which the tagged data was generated.
In some examples, the data processing system may record the marking data in a data format of (event_time, process_time, count), where "event_time" may represent the data generation time of the marking data, "process_time" may represent the data receiving time at which each processing node receives the marking data, and "count" may represent the second data volume of the marking data received by each processing node.
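One way a processing node might keep records in this format is sketched below. The classes `MarkerRecord` and `MarkerLog` are hypothetical, and the choice to aggregate `count` per (event_time, process_time) pair is an assumption of this sketch; the embodiment does not fix the granularity of the count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerRecord:
    event_time: float    # data generation time of the marking data
    process_time: float  # data receiving time at this processing node
    count: int           # amount of marking data received with these two times


class MarkerLog:
    """Hypothetical per-node log of received marking data."""

    def __init__(self):
        self._rows = {}

    def observe(self, event_time, process_time):
        """Record one received marker, incrementing the matching count."""
        key = (event_time, process_time)
        previous = self._rows.get(key)
        count = 1 if previous is None else previous.count + 1
        self._rows[key] = MarkerRecord(event_time, process_time, count)

    def rows(self):
        return list(self._rows.values())

    def second_amount(self):
        """Total amount of marking data received (the second data volume)."""
        return sum(row.count for row in self._rows.values())


log = MarkerLog()
log.observe(100.0, 101.0)
log.observe(100.0, 101.0)
log.observe(100.0, 103.0)
```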
In a possible implementation manner, after the data management device obtains the data information, the data management device may store the data information in the data format, so as to facilitate analysis in a subsequent step.
In one possible implementation manner, when the data generation time is different, the data generation time of the tag data may be used as the identification information of the tag data, and the different tag data may be distinguished. When the data generation time is the same, the data reception time of the tag data may be used as the identification information of the tag data to distinguish different tag data.
In some examples, when storing according to the data format described above, data information recorded by the same processing node may be recorded in a data table. The data information recorded by different processing nodes is recorded in different data tables.
S140, determining the consistency information and the time delay information of the original data according to the data generation time, the data receiving time, the first data volume and the second data volume.
Specifically, after the data management apparatus described above acquires the second data amount, the consistency information of the marker data may be determined according to the second data amount and the first data amount, and since both the marker data and the original data are recorded and processed by the same processing node, the consistency information of the marker data may be used as the consistency information of the original data.
Specifically, each piece of marking data has a corresponding data receiving time and data generation time. After obtaining these, the data management device may determine the time delay information of each piece of marking data from its data receiving time and data generation time, and thereby the time delay information of all the marking data of the first data volume, which may then be used as the time delay information of the original data.
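The analysis in this step might be sketched as follows. The embodiment does not define the concrete form of the consistency information, so the three labels used here ("consistent", "duplicated", "lost") are assumptions of this sketch, as are the function names; the delay is taken as the data receiving time minus the data generation time.

```python
def consistency_info(first_amount, second_amount):
    """Compare the amount injected with the amount a node reports receiving."""
    if second_amount == first_amount:
        return "consistent"
    # More received than injected suggests repeated processing; fewer suggests loss.
    return "duplicated" if second_amount > first_amount else "lost"


def delay_info(records):
    """Return process_time - event_time for each (event_time, process_time, count)."""
    return [process_time - event_time for event_time, process_time, _count in records]


# Three markers were injected, but the node reports four in total.
records = [(100.0, 101.5, 2), (100.0, 104.0, 2)]
second_amount = sum(count for _event, _process, count in records)
status = consistency_info(first_amount=3, second_amount=second_amount)
delays = delay_info(records)
```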
In some examples, with this step, consistency information and latency information of raw data when processed by one processing node may be determined, and consistency information and latency information of raw data when processed by multiple processing nodes may also be determined.
In some examples, the data management apparatus described above may also determine consistency information and latency information of the raw data between any two processing nodes.
Specifically, when the original data is processed by any two processing nodes in the data processing system, both nodes may also receive the marking data, and each may record the data receiving time of each piece of marking data and the second data volume of all the marking data received. The consistency information and the time delay information of the marking data between the two processing nodes may then be calculated from the data receiving times and the second data volumes recorded by the two nodes, and used as the consistency information and the time delay information of the original data between the two processing nodes.
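A comparison between two processing nodes could be sketched as below. It is assumed, for illustration only, that each node's records are keyed by data generation time and hold one (process_time, count) pair, and that the function name `compare_nodes` is hypothetical.

```python
def compare_nodes(upstream, downstream):
    """Map event_time -> (count difference, delay) for markers seen by both nodes."""
    result = {}
    for event_time, (up_time, up_count) in upstream.items():
        if event_time not in downstream:
            continue  # not yet received downstream, nothing to compare
        down_time, down_count = downstream[event_time]
        result[event_time] = (down_count - up_count, down_time - up_time)
    return result


first_node = {100.0: (101.0, 3), 105.0: (106.0, 3)}
second_node = {100.0: (102.5, 4)}
report = compare_nodes(first_node, second_node)
```

A non-zero count difference indicates an inconsistency between the two nodes, and the second value is the time the marking data took to travel from one node to the other.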
In some examples, the data management apparatus described above, when calculating the consistency information and the latency information, may calculate the consistency information and the latency information in an arrangement manner of respective processing nodes in the data processing system.
For example, according to the arrangement of the processing nodes in the data processing system shown in fig. 3, the consistency information and the time delay information between the first processing node 310 and the second processing node 320 may be calculated, and the consistency information and the time delay information between the third processing node 330 and the acquisition node 300 may also be calculated.
For another example, according to the arrangement of the processing nodes in the data processing system shown in fig. 4, the consistency information and the time delay information may be calculated between the first processing node 310 and the acquisition node 300, between the second processing node 320 and the acquisition node 300, and between the third processing node 330 and the acquisition node 300.
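Which node pairs to compare can be derived from the arrangement. The sketch below assumes that fig. 3 is a serial arrangement and fig. 4 a parallel one, as the later description of those figures suggests; the function names and node labels are hypothetical.

```python
def pairs_for_serial(nodes):
    """Adjacent pairs along a serial arrangement of nodes."""
    return list(zip(nodes, nodes[1:]))


def pairs_for_parallel(source, workers):
    """Pair each parallel processing node with the acquisition node."""
    return [(source, worker) for worker in workers]


serial_pairs = pairs_for_serial(["acquire", "first", "second", "third"])
parallel_pairs = pairs_for_parallel("acquire", ["first", "second", "third"])
```

In the serial case an end-to-end pair such as ("acquire", "third") can be added to the list when the overall delay is of interest.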
According to the data management method provided by the embodiment of the present invention, marking data is generated upon detecting that the data processing system acquires original data, the generated marking data is added into the data processing system, and a first data volume of the marking data is recorded. The second data volume of all the marking data received, the data receiving time at which each piece of marking data is received, and the data generation time of the marking data, as recorded in the data processing system, are then acquired. Finally, the data generation time, the data receiving time, the first data volume and the second data volume are analyzed to obtain consistency information and time delay information of the marking data in the data processing system. Because the marking data and the original data pass through the processing nodes in the data processing system together, the consistency information and the time delay information of the marking data can be determined as those of the original data. This provides an analysis basis for analyzing the consistency problems and time delay problems that exist when the processing nodes process data, so that service personnel can optimize the processing nodes based on the analysis result.
Corresponding to the data management method shown in fig. 1, an embodiment of the present invention further provides a data processing method. Fig. 2 is a flowchart of a first implementation manner of the data processing method, and in fig. 2, the method may include:
S210, acquiring marking data and original data.
In some examples, a data processing method according to an embodiment of the present invention may be applied to the data processing system shown in fig. 3 or fig. 4, and may also be applied to the data processing system combining fig. 3 and fig. 4. As shown in fig. 3 or 4, the data processing system may include an acquisition node 300, a first processing node 310, a second processing node 320, and a third processing node 330.
The acquisition node 300 is used to acquire the marking data and the raw data.
The first processing node 310, the second processing node 320 and the third processing node 330 are configured to process the original data and record data information of the received tag data.
The first processing node 310, the second processing node 320, and the third processing node 330 are exemplary processing nodes, and it should be understood that the data processing system may include one processing node or more processing nodes. Fig. 3 and 4 are schematic diagrams of two arrangements of a first processing node 310, a second processing node 320, and a third processing node 330, respectively, in a data processing system.
In some examples, the obtaining node 300, the first processing node 310, the second processing node 320, and the third processing node 330 may be nodes respectively disposed on different hardware devices, or may be different nodes disposed on the same hardware device.
Specifically, after the data management apparatus generates the tag data and transmits the tag data to the data processing system, the data processing system may acquire the tag data.
In some examples, the data processing system described above may acquire original data in a time window; for example, 15 pieces of original data may be acquired within a 10-second time window. The original data may be data generated by a data generating device.
In some examples, the data processing system described above may further include an original data generation node that generates original data, and the generated original data is then acquired by the acquisition node 300.
S220, the marked data and the original data are sent to a processing node.
Specifically, after the data processing system receives the marked data and the original data, in order to enable the processing node to process the original data and to be able to know the consistency problem and the time delay problem of the original data when being processed by the processing node, the data processing system can send the marked data and the original data to the processing node together.
S230, when the processing node processes the original data, recording the second data volume of all the marking data received, the data receiving time of each piece of marking data and the data generation time of the marking data as data information.
Specifically, after receiving the original data, the processing node in the data processing system may process the original data and record data information of the received marked data.
For example, in fig. 3, after receiving the original data and the marked data, the first processing node 310 may process the original data to obtain first processed data, and may also record data information of the received marked data, and then send the first processed data and the marked data to the second processing node 320 for processing; after the second processing node 320 processes the first processing data to obtain second processing data, it may record data information of the received tag data, and then send the second processing data and the tag data to the third processing node 330 for processing; after the third processing node 330 processes the second processed data to obtain third processed data, it may record the data information of the received tag data, so that the data management apparatus can obtain the data information from the data processing system.
For another example, in fig. 4, the first processing node 310, the second processing node 320, and the third processing node 330 may each receive the original data and the marked data, and then each process the received original data to obtain first processed data, second processed data, and third processed data, respectively. Meanwhile, each node may record the data information of the marked data it received, so that the data management device can acquire the data information from the data processing system.
In some examples, as shown in fig. 3, the data processing system described above may further include a recording node 340, which may be separately provided on a hardware device different from the first processing node 310, the second processing node 320, and the third processing node 330, and in the data processing system shown in fig. 3, data information recorded by each node may be stored in the recording node 340.
In some examples, the recording node 340 may record data information for various nodes in an arrangement of various processing nodes in the data processing system.
As also shown in fig. 4, the data processing system may further include a logging module, which may be disposed on each processing node, for example, the first processing node 310 may include a first logging module, the second processing node 320 may include a second logging module, and the third processing node 330 may include a third logging module.
According to the data processing method provided by the embodiment of the invention, after the marked data and the original data are received, the marked data and the original data are jointly sent to the processing node, so that the processing node records the data information when the marked data is received when processing the original data, the data management device can acquire the data information, and then the consistency information and the time delay information are acquired through the data information.
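As a minimal illustrative sketch of the serial arrangement of fig. 3, and not the implementation of the embodiment, the following Python code shows processing nodes that process original data while recording data information for each marked data received. The `Marker` and `ProcessingNode` classes, the millisecond timestamps, and the 50 ms per-hop cost are assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Marker:
    """Marked data carrying its own data generation time (hypothetical layout)."""
    marker_id: int
    generated_at: int  # milliseconds


@dataclass
class ProcessingNode:
    name: str
    second_data_amount: int = 0  # number of marked data received so far
    receive_times: list = field(default_factory=list)
    generation_times: list = field(default_factory=list)

    def handle(self, item, now):
        """Record marked data and pass it on unchanged; process original data."""
        if isinstance(item, Marker):
            self.second_data_amount += 1
            self.receive_times.append(now)
            self.generation_times.append(item.generated_at)
            return item
        return f"{item}|{self.name}"  # placeholder for the real processing


# Serial arrangement: the output of each node feeds the next one.
nodes = [ProcessingNode("310"), ProcessingNode("320"), ProcessingNode("330")]
stream = ["raw-1", Marker(1, 1000), "raw-2", Marker(2, 1000)]
now = 1000
for node in nodes:
    now += 50  # assumed cost of one hop, in milliseconds
    stream = [node.handle(item, now) for item in stream]
```

In this sketch each node keeps its own record; in an arrangement with a separate recording node, the same fields could instead be written to that node.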
In an optional embodiment of the present invention, on the basis of the data management method shown in fig. 1, an embodiment of the present invention further provides a possible implementation manner, as shown in fig. 5, which is a flowchart of a second implementation manner of the data management method according to the embodiment of the present invention, and in fig. 5, the generating the tag data in S110 may include:
S111, generating a preset number of marked data.
Wherein the preset number may be the number of marked data to be generated, set in advance based on experience. For example, the preset number may be set to 10, 20, or 50, etc.
Specifically, when the data management apparatus monitors that the data processing system acquires the original data by time window, a preset number of marked data may be generated in each time window.
By generating the preset number of the tag data, the preset number of the tag data can be all sent to the data processing system, so that the purpose of the embodiment of the invention is realized.
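A minimal sketch of this step, under stated assumptions, might look as follows in Python. The `Marker` class and the `generate_markers` function are hypothetical names; the sketch also assumes that all marked data in one time window share the same generation time and that the first data volume is simply the count of marked data generated.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    marker_id: int
    generated_at: float  # data generation time recorded inside the marked data


def generate_markers(preset_count, now=None):
    """Generate `preset_count` marked data for one time window.

    Returns the marked data and the first data volume (their count).
    """
    generated_at = time.time() if now is None else now
    markers = [Marker(i, generated_at) for i in range(preset_count)]
    first_data_amount = len(markers)
    return markers, first_data_amount


# One call per time window; `now` is fixed here to keep the example deterministic.
markers, first_data_amount = generate_markers(10, now=1000.0)
```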
Corresponding to the data management method shown in fig. 5, on the basis of the data processing method shown in fig. 2, an embodiment of the present invention further provides a data processing method, as shown in fig. 6, which is a flowchart of a second implementation manner of the data processing method in the embodiment of the present invention, and in fig. 6, S220 sends the marked data and the original data to the processing node, which may include:
S221, randomly adding a preset number of marked data into a data queue in the data processing system, and sending the marked data to a processing node through the data queue, wherein the data queue also comprises original data.
In some examples, at least one data queue may be disposed in the acquisition node 300 in the data processing system, and the data queue may be configured to store the acquired raw data, so that after the marked data is received, the marked data and the raw data are jointly transmitted to the processing node.
Specifically, when the data management apparatus generates a preset number of marked data and sends the preset number of marked data to the data processing system, the data processing system may randomly insert the received preset number of marked data into a data queue in which original data are stored.
For example, as shown in fig. 7, the acquisition node 300 in the data processing system described above may be provided with a data queue 1 and a data queue 2. In the data queue 1, the arrangement of the tag data and the original data may be, in order: original data 11, marked data 11, original data 12, marked data 12, original data 13, original data 14 and marked data 13, wherein in the data queue 2, the marked data and the original data may be arranged in the following order: marking data 21, original data 22, marking data 22, original data 23, marking data 23, original data 24.
By adding the marked data into the data queue of the original data, the processing node can acquire the marked data when acquiring the original data in the data queue, record the data information of the acquired marked data and process the original data.
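The random insertion described above can be sketched as follows. This is an illustrative Python example only: the function name `insert_markers_randomly` and the string placeholders for queue items are assumptions, and the embodiment does not prescribe a particular random strategy.

```python
import random


def insert_markers_randomly(queue, markers, rng=None):
    """Return a new queue with each marked data inserted at a random position.

    The relative order of the original data is preserved.
    """
    if rng is None:
        rng = random.Random()
    mixed = list(queue)
    for marker in markers:
        position = rng.randint(0, len(mixed))  # both ends inclusive
        mixed.insert(position, marker)
    return mixed


raw = ["raw-11", "raw-12", "raw-13", "raw-14"]
marks = ["mark-11", "mark-12", "mark-13"]
mixed = insert_markers_randomly(raw, marks, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps a test run reproducible; a production queue would more likely insert as items arrive rather than rebuild a list.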
In an optional embodiment of the present invention, on the basis of the data management method shown in fig. 1, an embodiment of the present invention further provides a possible implementation manner, as shown in fig. 8, which is a flowchart of a third implementation manner of the data management method according to the embodiment of the present invention, and in fig. 8, in S110, generating the tag data may include:
S112, acquiring a third data amount of the original data, and generating marked data corresponding to the third data amount.
In some examples, the data management apparatus described above may generate the same number of marked data as there are original data.
Specifically, the above-described data management apparatus may acquire the third data amount of the original data from the data processing system, and then generate marked data equal in number to the third data amount.
By generating the marking data with the same data amount as the third data amount, the data processing system can be enabled to add the marking data with the same data amount as the original data to the original data respectively, so that the occupation of the marking data on the data queue is reduced.
Corresponding to the data management method shown in fig. 8, an embodiment of the present invention further provides a data processing method, as shown in fig. 9, which is a flowchart of a third implementation of the data processing method shown in the embodiment of the present invention, and in fig. 9, S220 sends the marked data and the original data to a processing node, which may include:
S222, adding the tag data corresponding to the third data amount of the original data to the original data, and sending the original data added with the tag data to the processing node.
Wherein one piece of tag data is added to each piece of original data.
In a possible implementation manner of the embodiment of the present invention, in order to reduce occupation of the tag data on the data queue, after receiving the tag data sent by the data management device, the data processing system may add one tag data to each piece of original data, so that when processing the original data, the processing node may obtain the tag data.
Specifically, the marker data may be added to the data head or the data tail of the original data.
After adding the marked data to each piece of original data, the data processing system may store the tagged original data in a data queue, and the processing node may obtain the tagged original data from the data queue. After receiving the tagged original data, the processing node may process only the original data portion and record the data information of the marked data attached to it.
The mark data is added to each original data, so that the occupation of the mark data on the data queue can be reduced, more original data can be stored in the data queue, furthermore, the consistency information and the time delay information of each original data in the processing node can be accurately calculated by the data management method of the embodiment of the invention, and the accuracy of the data management method of the embodiment of the invention is improved.
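A minimal sketch of this one-to-one tagging, assuming the marked data is placed at the data head, is given below. The names `attach_markers` and `handle_tagged`, the tuple representation of tagged data, and the upper-casing used as placeholder processing are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    marker_id: int
    generated_at: int  # milliseconds


def attach_markers(raw_items, markers):
    """Attach exactly one marked data to the head of each original data item."""
    if len(raw_items) != len(markers):
        raise ValueError("need exactly one marked data per original data item")
    return [(marker, raw) for marker, raw in zip(markers, raw_items)]


def handle_tagged(tagged, now, record):
    """Record the marked data, then process only the original data portion."""
    marker, raw = tagged
    record.append((marker.marker_id, marker.generated_at, now))
    return raw.upper()  # placeholder for the real processing


raw_items = ["a", "b", "c"]
markers = [Marker(i, 1000) for i in range(len(raw_items))]
queue = attach_markers(raw_items, markers)
record = []
processed = [handle_tagged(tagged, 1200, record) for tagged in queue]
```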
In an optional embodiment of the present invention, on the basis of the data management method shown in fig. 1, a possible implementation manner is further provided in the embodiment of the present invention, as shown in fig. 10, which is a flowchart of a fourth implementation manner of the data management method shown in fig. 1, in fig. 10, S140, determining consistency information and delay information of original data according to a data generation time, a data reception time, a first data amount, and a second data amount, may include:
S141, for the same processing node, selecting the maximum data receiving time recorded by the processing node, and taking a first difference value between the maximum data receiving time and the data generating time as the time delay information of the original data.
In some examples, the generation time of each tag data may be the same, in which case, in order to reduce the complexity of calculating the delay information by the data management apparatus, in this step, the data management apparatus may select, for the same processing node, the maximum data reception time recorded by the processing node, and then use the difference between the maximum data reception time and the data generation time as the delay information of the original data.
Through the step, the maximum time delay of a processing node in the process of processing the original data can be obtained.
S142, for the same processing node, comparing the second data volume of the processing node with the first data volume; when the second data volume is the same as the first data volume, determining that the consistency information of the original data is consistent, and when the second data volume is different from the first data volume, determining that the consistency information of the original data is inconsistent.
The data management device can calculate the time delay information and also calculate the consistency information of the original data at the processing node aiming at the same processing node.
Specifically, the data management apparatus may compare the second data size of the processing node with the first data size.
When the second data volume is the same as the first data volume, the consistency information of the original data is determined to be consistent; when the second data volume is different from the first data volume, the consistency information of the original data is determined to be inconsistent.
In some examples, when the second amount of data is different from the first amount of data, determining that the consistency information of the original data is inconsistent may include:
when the second data amount is larger than the first data amount, the consistency information of the original data can be determined as data repetition; when the second data amount is smaller than the first data amount, the consistency information of the original data may be determined to be data loss.
Thus, the above consistency information may include: data consistency, data duplication, and data loss.
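Steps S141 and S142 can be sketched for a single processing node as follows. This is an illustrative Python example: the function name `evaluate_node`, the status strings, and the integer millisecond timestamps are assumptions, and the sketch assumes all marked data share one generation time as in the example above.

```python
def evaluate_node(first_data_amount, second_data_amount, receive_times, generation_time):
    """Return (time delay, consistency status) for one processing node.

    The time delay is the maximum data receiving time minus the data
    generation time; it is None when no marked data was received.
    """
    delay = max(receive_times) - generation_time if receive_times else None
    if second_data_amount == first_data_amount:
        status = "consistent"
    elif second_data_amount > first_data_amount:
        status = "data repetition"
    else:
        status = "data loss"
    return delay, status


# Three marked data were sent at t=1000 ms and all three arrived.
ok = evaluate_node(3, 3, [1200, 1900, 1400], 1000)
# Three were sent, but only two arrived.
lost = evaluate_node(3, 2, [1200, 1400], 1000)
```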
In a possible implementation manner of the embodiment of the present invention, each processing node in the data processing system includes at least two threads, and when each processing node processes raw data through at least two threads, on the basis of a data processing method shown in fig. 2, an embodiment of the present invention further provides a data processing method, as shown in fig. 11, which is a flowchart of a fourth implementation manner of the data processing method according to the embodiment of the present invention, in fig. 11, S230 records, as data information, the second data amount of all received tag data, the data receiving time of each tag data, and the data generating time of the tag data, and may include:
S231, recording the second data amount of all the tag data received by each thread, the data receiving time of each tag data, and the data generating time of the tag data as data information corresponding to the thread number of the thread.
Specifically, for each processing node, when the processing node processes original data by using at least two threads, each thread receives the tag data, and therefore, the second data amount of all the tag data received by each thread, the data receiving time when each tag data is received, and the data generating time when the tag data is generated can be recorded as data information corresponding to the thread number of the thread.
In some examples, when each processing node includes at least two threads, the data information of each thread may be recorded by taking the thread number of each thread as identification information of the thread and establishing an association relationship between the thread number and the data information recorded by the corresponding thread, so that the data information recorded by each thread can be distinguished.
By recording data information per thread in this step, the data management method of the embodiment of the invention can obtain the data information recorded by each thread when it acquires the data information, so that the consistency information and the time delay information of each thread when processing the original data can be calculated in subsequent steps.
In an optional embodiment of the present invention, when the data information further includes a thread number, the data information may include data information corresponding to the thread number.
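A minimal sketch of per-thread recording is shown below, assuming Python's standard `threading` module. The `ThreadedRecorder` class, the dictionary layout, and the pre-computed `(generation time, receiving time)` pairs are hypothetical and serve only to show data information being keyed by thread number.

```python
import threading
from collections import defaultdict


class ThreadedRecorder:
    """Keeps the data information of marked data separately per thread number."""

    def __init__(self):
        self._lock = threading.Lock()
        self.by_thread = defaultdict(
            lambda: {"count": 0, "receive_times": [], "generation_times": []}
        )

    def record(self, thread_no, generated_at, received_at):
        with self._lock:  # several threads may record at the same time
            info = self.by_thread[thread_no]
            info["count"] += 1  # second data amount of this thread
            info["receive_times"].append(received_at)
            info["generation_times"].append(generated_at)


recorder = ThreadedRecorder()


def worker(thread_no, marked_items):
    for generated_at, received_at in marked_items:
        recorder.record(thread_no, generated_at, received_at)


# Assumed workload: thread 1 receives two marked data, thread 2 receives one.
work = {1: [(1000, 1200), (1000, 1500)], 2: [(1000, 1300)]}
threads = [threading.Thread(target=worker, args=(no, items)) for no, items in work.items()]
for t in threads:
    t.start()
for t in threads:
    t.join()
```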
For this reason, corresponding to a data processing method shown in fig. 11, on the basis of a data management method shown in fig. 10, an embodiment of the present invention further provides a possible implementation manner, as shown in fig. 12, which is a flowchart of a fifth implementation manner of the data management method according to the embodiment of the present invention, in fig. 12, S141, for a same processing node, selecting a maximum data receiving time recorded by the processing node, and taking a first difference between the maximum data receiving time and a data generating time as time delay information of original data, where the first difference may include:
S1411, for the same thread number in the same processing node, selecting the maximum data receiving time corresponding to the thread number, and using a second difference value between the maximum data receiving time corresponding to the thread number and the data generating time as the time delay information of the original data corresponding to the thread number.
In some examples, the generation time of each tag data may be the same, in this case, in order to reduce the complexity of calculating the latency information of each thread by the data management apparatus, in this step, the data management apparatus may select, for each thread in the same processing node, a maximum data reception time corresponding to the thread number, and use a second difference value between the maximum data reception time corresponding to the thread number and the data generation time as the latency information corresponding to the thread number.
In some examples, after obtaining the delay information corresponding to the thread numbers of all threads in one processing node, the data management apparatus may further compare the delay information corresponding to the thread numbers of all threads, and select the largest delay information as the delay information of the processing node.
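The per-thread delay and the node-level delay derived from it can be sketched as follows. The function name `delays_by_thread` and the input mapping from thread number to receiving times are assumptions; threads that received no marked data are skipped in this sketch.

```python
def delays_by_thread(receive_times_by_thread, generation_time):
    """Map each thread number to (max receiving time - generation time)."""
    return {
        thread_no: max(times) - generation_time
        for thread_no, times in receive_times_by_thread.items()
        if times  # a thread with no marked data has no delay to report
    }


per_thread = delays_by_thread({1: [1200, 1500], 2: [1300], 3: []}, 1000)
# The largest per-thread delay is taken as the delay of the processing node.
node_delay = max(per_thread.values())
```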
When the data information of each processing node acquired by the data management apparatus is data information corresponding to a thread number in the processing node, in order to compare the second data volume and the first data volume of the processing node, an embodiment of the present invention further provides a possible implementation manner, in which, in S142, comparing the second data volume and the first data volume of the processing node for the same processing node, the comparing may include:
S1421, for the same processing node, summing the second data volumes corresponding to all the thread numbers of the processing node to obtain the second data volume of the processing node, and comparing the second data volume of the processing node with the first data volume.
In this step, for each processing node, when the data information acquired by the data management apparatus is data information corresponding to the thread number in the processing node, the second data size of the processing node may be calculated by the second data size in the data information corresponding to the thread number, and the second data size and the first data size of the processing node may be compared to determine the consistency information of the processing node.
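As a short illustrative sketch of S1421, the per-thread second data volumes can be summed and then compared with the first data volume. The function name `node_second_data_amount` and the example counts are hypothetical.

```python
def node_second_data_amount(amount_by_thread):
    """Sum the second data volumes recorded under every thread number."""
    return sum(amount_by_thread.values())


first_data_amount = 10
second = node_second_data_amount({1: 4, 2: 3, 3: 3})
is_consistent = second == first_data_amount

# One thread received a marked data twice, so the node total exceeds what was sent.
second_dup = node_second_data_amount({1: 4, 2: 4, 3: 3})
is_repeated = second_dup > first_data_amount
```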
On the basis of the data management method shown in fig. 1, an embodiment of the present invention further provides a possible implementation manner, which may implement determining system consistency information and system latency information of a data processing system, as shown in fig. 13, which is a flowchart of a sixth implementation manner of the data management method according to the embodiment of the present invention, and in fig. 13, S130, acquiring data information of tag data recorded by the data processing system may include:
S131, acquiring data information of the marked data recorded by the last processing node in the data processing system.
Specifically, in order to determine the system consistency information and the system delay information of the data processing system, the data management apparatus described above may acquire the data information of the marker data recorded by the last processing node in the data processing system when acquiring the data information.
After the data management apparatus obtains the data information recorded by the last processing node in the data processing system, step S140, in which the consistency information and the delay information of the original data are determined according to the data generation time, the data reception time, the first data volume, and the second data volume, may be specifically implemented by the following steps:
S143, calculating a second difference value between the data receiving time and the data generating time of the last processing node, and taking the second difference value as system time delay information of the data processing system for processing the original data;
S144, comparing the second data volume of the last processing node with the first data volume, and taking the comparison result as system consistency information of the data processing system for processing the original data.
Specifically, because the last processing node records the data information of the received marked data when it processes the original data, the data management apparatus, after obtaining the second difference and the comparison result between the second data volume of the last processing node and the first data volume, may use the second difference as the system delay information and the comparison result as the system consistency information.
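Steps S131, S143 and S144 can be sketched as follows. This is an illustrative Python example under stated assumptions: the record layout, the function name `system_metrics`, and the use of the maximum receiving time at the last node are choices made for the sketch and are not specified by the embodiment.

```python
def system_metrics(node_records, first_data_amount):
    """Derive system-level delay and consistency from the last node's record.

    `node_records` is ordered from the first to the last processing node.
    """
    last = node_records[-1]
    # Assumption: use the latest receiving time at the last node.
    system_delay = max(last["receive_times"]) - last["generation_time"]
    system_consistent = last["second_data_amount"] == first_data_amount
    return system_delay, system_consistent


records = [
    {"node": "310", "receive_times": [1050, 1060], "generation_time": 1000, "second_data_amount": 2},
    {"node": "320", "receive_times": [1100, 1120], "generation_time": 1000, "second_data_amount": 2},
    {"node": "330", "receive_times": [1150, 1190], "generation_time": 1000, "second_data_amount": 2},
]
system_delay, system_consistent = system_metrics(records, first_data_amount=2)
```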
Corresponding to the embodiment of the data management method shown in fig. 1, an embodiment of the present invention further provides a data management apparatus, as shown in fig. 14, which is a schematic structural diagram of the data management apparatus according to the embodiment of the present invention, and in fig. 14, the apparatus may include:
the marker data generation module 1410 is configured to generate marker data when it is monitored that the data processing system acquires original data, where the data generation time of the marker data is recorded in the marker data;
a recording module 1420, configured to add the marked data to the data processing system, and record a first data amount of the marked data;
a data information obtaining module 1430, configured to obtain data information of the marked data recorded by the data processing system, where the data information includes a second data amount of all the marked data received, a data receiving time at which each marked data is received, and a data generating time of the marked data;
the calculating module 1440 is configured to determine consistency information and delay information of the original data according to the data generating time, the data receiving time, the first data amount, and the second data amount.
According to the data management device provided by the embodiment of the invention, when it is monitored that the data processing system acquires original data, marking data are generated, the generated marking data are added into the data processing system, and a first data volume of the marking data is recorded. The second data volume of all the marked data received, the data receiving time of each marked data, and the data generating time of the marked data, as recorded in the data processing system, are then acquired. Finally, the data generation time, the data receiving time, the first data volume and the second data volume are analyzed to obtain consistency information and time delay information of the marked data in the data processing system. Because the marked data and the original data pass through a processing node in the data processing system together, the consistency information and the time delay information of the marked data can be determined as the consistency information and the time delay information of the original data in the data processing system. This provides an analysis basis for analyzing consistency problems and time delay problems that arise when the processing node processes the original data, and service personnel can further optimize the processing node based on the analysis result.
Specifically, the marker data generation module 1410 is specifically configured to:
generating a preset number of marked data; or
acquiring a third data volume of the original data, and generating marked data corresponding to the third data volume.
Specifically, the calculation module 1440 includes:
the time delay calculation submodule is used for selecting the maximum data receiving time recorded by the processing node aiming at the same processing node, and taking a first difference value between the maximum data receiving time and the data generating time as the time delay information of the original data;
and the consistency information determining submodule is used for comparing a second data volume of the processing node with the first data volume of the same processing node, determining that the consistency information of the original data is consistent when the second data volume is the same as the first data volume, and determining that the consistency information of the original data is inconsistent when the second data volume is different from the first data volume.
Specifically, the data information further includes a thread number, and the data information includes data information corresponding to the thread number;
the time delay calculation submodule is specifically configured to:
aiming at the same thread number in the same processing node, selecting the maximum data receiving time corresponding to the thread number, and taking a second difference value between the maximum data receiving time corresponding to the thread number and the data generating time as the time delay information of the original data corresponding to the thread number;
the consistency information determining submodule is specifically configured to:
and for the same processing node, summing the second data volumes corresponding to all the thread numbers of the processing node to obtain the second data volume of the processing node, and comparing the second data volume of the processing node with the first data volume.
Specifically, the data information obtaining module 1430 is further configured to:
acquiring data information of marked data recorded by the last processing node in the data processing system;
accordingly, the calculating module 1440 further comprises:
the system time delay calculation submodule is used for calculating a second difference value between the data receiving time and the data generating time of the last processing node, and the second difference value is used as system time delay information of the data processing system for processing the original data;
and the system consistency determining submodule is used for comparing the second data volume with the first data volume of the last processing node and taking the comparison result as system consistency information of the data processing system for processing the original data.
Corresponding to the data processing method embodiment shown in fig. 2, an embodiment of the present invention further provides a data processing system, as shown in fig. 15, which is a schematic structural diagram of a third implementation manner of the data processing system according to the embodiment of the present invention, and in fig. 15, the system may include:
an acquisition node 1510 configured to acquire the tag data and the original data;
a sending node 1520 for sending the marked data and the raw data to the processing node;
and a processing node 1530 for recording the second data amount of all the received tag data, the data receiving time of each tag data, and the data generating time of the tag data as data information when processing the original data.
According to the data processing system provided by the embodiment of the invention, after the marked data and the original data are received, the marked data and the original data are jointly sent to the processing node, so that the processing node records the data information when the marked data is received when processing the original data, the data management device can acquire the data information, and then the consistency information and the time delay information are acquired through the data information.
Specifically, the sending node 1520 is specifically configured to:
randomly adding a preset amount of marked data into a data queue in a data processing system, and sending the marked data to a processing node through the data queue, wherein the data queue also comprises original data;
or
And adding mark data corresponding to a third data volume of the original data to the original data, and sending the original data added with the mark data to the processing nodes, wherein one piece of original data is added with one piece of mark data.
In particular, each processing node in the data processing system comprises at least two threads, processing node 1530 is specifically configured to:
and recording the second data volume of all the mark data received by each thread, the data receiving time of each mark data and the data generating time of the mark data as the data information corresponding to the thread number of the thread.
Fig. 16 is a schematic structural diagram of an electronic device to which a data management method according to an embodiment of the present invention is applied, and in fig. 16, the electronic device includes a processor 1610, a communication interface 1620, a memory 1630 and a communication bus 1640, where the processor 1610, the communication interface 1620 and the memory 1630 complete communication with each other through the communication bus 1640,
a memory 1630 for storing computer programs;
the processor 1610, when executing the program stored in the memory 1630, implements the following steps:
when it is monitored that the data processing system acquires original data, generating marking data, wherein the data generation time of the marking data is recorded in the marking data;
adding the marking data into a data processing system, and recording a first data volume of the marking data;
acquiring data information of the marked data recorded by the data processing system, wherein the data information comprises a second data volume of all the marked data received, a data receiving time at which each marked data is received, and a data generating time of the marked data;
and determining the consistency information and the time delay information of the original data according to the data generation time, the data receiving time, the first data volume and the second data volume.
According to the electronic equipment provided by the embodiment of the invention, when it is monitored that the data processing system acquires original data, marking data is generated, the generated marking data is added into the data processing system, and a first data volume of the marking data is recorded. The second data volume of all the marked data received, the data receiving time of each marked data, and the data generating time of the marked data, as recorded in the data processing system, are then acquired. Finally, the data generation time, the data receiving time, the first data volume and the second data volume are analyzed to obtain consistency information and time delay information of the marked data in the data processing system. Because the marked data and the original data pass through a processing node in the data processing system together, the consistency information and the time delay information of the marked data can be determined as the consistency information and the time delay information of the original data in the data processing system. This provides an analysis basis for analyzing consistency problems and time delay problems that arise when the processing node processes the original data, and service personnel can further optimize the processing node based on the analysis result.
Fig. 17 is a schematic structural diagram of an electronic device to which a data processing method according to an embodiment of the present invention is applied, where in fig. 17, the electronic device includes a processor 1710, a communication interface 1720, a memory 1730, and a communication bus 1740, where the processor 1710, the communication interface 1720, and the memory 1730 complete communication with each other through the communication bus 1740,
a memory 1730 for storing computer programs;
the processor 1710, when executing the program stored in the memory 1730, implements the following steps:
acquiring mark data and original data;
sending the marked data and the original data to a processing node;
and when the processing node processes the original data, recording the second data volume of all the received marked data, the data receiving time of each marked data and the data generating time of the marked data as data information.
According to the electronic device provided by the embodiment of the present invention, after the marked data and the original data are received, they are sent together to the processing node, so that the processing node, while processing the original data, records data information upon receiving the marked data. The data management device can then acquire this data information and derive the consistency information and the time delay information from it.
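The recording behaviour of a processing node might look like the following Python sketch. The item layout (a dict with an `is_marker` flag and a `generated_at` field), the class name `ProcessingNode`, and the injectable `clock` are all assumptions made for illustration. The sketch also assumes that marked data is forwarded unchanged so that downstream nodes can record it as well.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Item = Dict[str, Any]


class ProcessingNode:
    """Processes original data and only records marked data (hypothetical sketch)."""

    def __init__(
        self,
        name: str,
        process: Callable[[Item], Item],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.name = name
        self._process = process
        self._clock = clock
        self.second_volume = 0  # count of marked data items received
        self.records: List[Tuple[float, float]] = []  # (generation time, receiving time)

    def handle(self, item: Item) -> Item:
        if item.get("is_marker"):
            # Marked data is not processed: record it and pass it on unchanged.
            self.second_volume += 1
            self.records.append((item["generated_at"], self._clock()))
            return item
        # Original data goes through the node's normal processing.
        return self._process(item)

    def data_information(self) -> Dict[str, Any]:
        """Data information that a data management device could fetch."""
        return {
            "node": self.name,
            "second_volume": self.second_volume,
            "records": list(self.records),
        }
```

Injecting the clock keeps the sketch testable; a deployment would simply use the node's own system time.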
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In another embodiment, the present invention further provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to perform any one of the above-mentioned data management methods.
In another embodiment, the present invention further provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to execute any one of the above-mentioned data processing methods.
In another embodiment, the present invention further provides a computer program product containing instructions, which when run on a computer, causes the computer to execute any one of the above-mentioned data management methods.
In another embodiment, the present invention further provides a computer program product including instructions, which when run on a computer, causes the computer to execute any one of the data processing methods described above.
In another embodiment, the present invention further provides a data management system. Fig. 18 is a schematic structural diagram of the data management system according to the embodiment of the present invention. In Fig. 18, the data management system includes: a data management device 1810 and a data processing system 1820;
the data management device 1810 is configured to generate marked data upon detecting that the data processing system acquires original data, add the marked data to the data processing system, and record a first data volume of the marked data;
the data processing system 1820 is configured to obtain the marked data and the raw data, and send the marked data and the raw data to the processing node;
the data processing system 1820 is further configured to record, as data information, the second data size of all the received tag data, the data receiving time of each tag data, and the data generating time of the tag data when the processing node processes the original data;
the data management device 1810 is further configured to obtain data information of the marked data recorded by the data processing system, and determine consistency information and time delay information of the original data according to the data generation time, the data reception time, the first data volume, and the second data volume.
According to the data management system, when it is detected that the data processing system acquires original data, marked data is generated, the generated marked data is added to the data processing system, and a first data volume of the marked data is recorded. After receiving the marked data and the original data, the data processing system sends them together to the processing node, so that the processing node, while processing the original data, records data information upon receiving the marked data. The data management device can then acquire the second data volume of all the marked data received, the data receiving time of each piece of marked data, and the data generation time of the marked data, as recorded in the data processing system. Finally, the data generation time, the data receiving time, the first data volume and the second data volume are analyzed to obtain consistency information and time delay information of the marked data in the data processing system. Because the marked data and the original data pass through the processing nodes in the data processing system together, the consistency information and the time delay information of the marked data can be taken as the consistency information and the time delay information of the original data. This provides a basis for analyzing the consistency problems and time delay problems that arise when the processing nodes process the original data, and service personnel can then optimize the processing nodes based on the analysis result.
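The interaction between the two parts of the system can be sketched end to end in Python. This is a simulation under stated assumptions, not the embodiment itself: the function names, the dict-based item layout, the simulated clock that advances one unit per hop, and the `lose_at` switch used to provoke an inconsistency are all hypothetical. Marked data is mixed into the queue at random positions, every node records the marked data it sees, and the last node's records give a system-level result.

```python
import random
from typing import Any, Dict, List, Optional, Tuple

Item = Dict[str, Any]


def inject_markers(
    originals: List[Item], count: int, generated_at: float, rng: random.Random
) -> Tuple[List[Item], int]:
    """Mix `count` marked data items into the queue; return (queue, first data volume)."""
    queue = list(originals)
    for _ in range(count):
        marker = {"is_marker": True, "generated_at": generated_at}
        queue.insert(rng.randint(0, len(queue)), marker)
    return queue, count


def run_pipeline(
    queue: List[Item], node_names: List[str], lose_at: Optional[str] = None
) -> Dict[str, Dict[str, Any]]:
    """Pass each item through the nodes in order and collect per-node data information."""
    info = {name: {"second_volume": 0, "received": []} for name in node_names}
    clock = 0.0
    for item in queue:
        for name in node_names:
            clock += 1.0  # each hop costs one simulated time unit
            if not item.get("is_marker"):
                continue  # original data: real processing would happen here
            if name == lose_at:
                break  # marked data lost before recording; downstream never sees it
            info[name]["second_volume"] += 1
            info[name]["received"].append(clock)
    return info


def system_summary(
    info: Dict[str, Dict[str, Any]], last_node: str, first_volume: int, generated_at: float
) -> Dict[str, Any]:
    """System-level consistency and delay, taken from the last processing node."""
    last = info[last_node]
    delay = max(last["received"]) - generated_at if last["received"] else None
    return {"consistent": last["second_volume"] == first_volume, "delay": delay}
```

Comparing the per-node entries in `info` against the first data volume would additionally show at which node marked data went missing.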
The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced wholly or partially. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless manner (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (19)

1. A method for managing data, the method comprising:
upon detecting that a data processing system acquires original data, generating marked data, wherein the marked data records the data generation time of the marked data, the marked data is data which does not need to be processed by each processing node of the data processing system, and the original data is data which needs to be processed by each processing node;
adding the marked data into the data processing system, and recording a first data volume of the marked data, so that the data processing system sends the marked data and the original data to the processing nodes together, the processing nodes process the original data, and record a data receiving time when the marked data is received and a second data volume when the marked data is received;
acquiring data information of the marked data recorded by each processing node of the data processing system, wherein the data information comprises a second data volume for receiving all the marked data, data receiving time for receiving each marked data and data generating time of the marked data;
and determining consistency information and time delay information of the original data according to the data generation time, the data receiving time, the first data volume and the second data volume.
2. The method of claim 1, wherein generating the marking data comprises:
generating a preset amount of marking data; or
acquiring a third data volume of the original data, and generating marking data corresponding to the third data volume.
3. The method of claim 1, wherein determining the consistency information and the time delay information of the original data according to the data generation time, the data reception time, the first data amount, and the second data amount comprises:
selecting the maximum data receiving time recorded by the processing node aiming at the same processing node, and taking a first difference value between the maximum data receiving time and the data generating time as the time delay information of the original data;
and comparing the second data volume with the first data volume of the same processing node, determining that the consistency information of the original data is consistent when the second data volume is the same as the first data volume, and determining that the consistency information of the original data is inconsistent when the second data volume is different from the first data volume.
4. The method of claim 3, wherein the data information further comprises: a thread number, and the data information includes data information corresponding to the thread number;
the selecting, for the same processing node, a maximum data reception time recorded by the processing node, and taking a first difference between the maximum data reception time and the data generation time as the time delay information of the original data, includes:
selecting the maximum data receiving time corresponding to the thread number aiming at the same thread number in the same processing node, and using a second difference value between the maximum data receiving time corresponding to the thread number and the data generating time as the time delay information of the original data corresponding to the thread number;
the comparing, for the same processing node, the second data volume and the first data volume of the processing node includes:
and for the same processing node, summing the second data volumes corresponding to all the thread numbers of the processing node to obtain the second data volume of the processing node, and comparing the second data volume of the processing node with the first data volume.
5. The method of claim 1, wherein the obtaining data information of the marked data recorded by the data processing system comprises:
acquiring data information of the marked data recorded by the last processing node in the data processing system;
correspondingly, the determining the consistency information and the time delay information of the original data according to the data generating time, the data receiving time, the first data volume and the second data volume includes:
calculating a second difference value between the data receiving time and the data generating time of the last processing node, and taking the second difference value as system time delay information of the data processing system for processing the original data;
and comparing the second data volume of the last processing node with the first data volume, and taking a comparison result as system consistency information of the data processing system for processing the original data.
6. A data processing method, applied to a data processing system, the method comprising:
acquiring marked data and original data, wherein the marked data are data which do not need to be processed by each processing node of the data processing system, and the original data are data which need to be processed by each processing node;
sending the marked data and the original data to a processing node;
and when the processing node processes the original data, recording the second data volume of all the received marked data, the data receiving time of each marked data and the data generating time of the marked data as data information.
7. The method of claim 6, wherein sending the tagged data and raw data to a processing node comprises:
randomly adding a preset amount of the marking data into a data queue in the data processing system, and sending the marking data to the processing node through the data queue, wherein the data queue also comprises the original data;
or
adding marked data corresponding to a third data volume of the original data to the original data, and sending the original data with the marked data added to the processing node, wherein one piece of marked data is added to each piece of original data.
8. The method of claim 6, wherein each processing node in the data processing system comprises at least two threads, and wherein recording the second amount of data of all the tag data received, the time of data reception of each tag data, and the time of data generation of the tag data as data information comprises:
and recording the second data volume of all the mark data received by each thread, the data receiving time of each mark data and the data generating time of the mark data as data information corresponding to the thread number of the thread.
9. A data management apparatus, characterized in that the apparatus comprises:
a tag data generation module, configured to generate marked data upon detecting that a data processing system acquires original data, wherein the marked data records the data generation time of the marked data, the marked data is data which does not need to be processed by each processing node of the data processing system, and the original data is data which needs to be processed by each processing node;
the recording module is used for adding the marked data into the data processing system and recording a first data volume of the marked data so that the data processing system sends the marked data and the original data to the processing nodes together, the processing nodes process the original data and record a data receiving time when the marked data is received and a second data volume when the marked data is received;
a data information obtaining module, configured to obtain data information of the marked data recorded by each processing node of the data processing system, where the data information includes a second data volume for receiving all the marked data, a data receiving time for receiving each marked data, and a data generating time of the marked data;
and the calculation module is used for determining the consistency information and the time delay information of the original data according to the data generation time, the data receiving time, the first data volume and the second data volume.
10. The apparatus of claim 9, wherein the tag data generation module is specifically configured to:
generating a preset amount of marking data; or
acquiring a third data volume of the original data, and generating marking data corresponding to the third data volume.
11. The apparatus of claim 9, wherein the computing module comprises:
the time delay calculation submodule is used for selecting the maximum data receiving time recorded by the processing node aiming at the same processing node, and taking a first difference value between the maximum data receiving time and the data generating time as the time delay information of the original data;
and the consistency information determining submodule is used for comparing the second data volume of the processing node with the first data volume of the same processing node, determining that the consistency information of the original data is consistent when the second data volume is the same as the first data volume, and determining that the consistency information of the original data is inconsistent when the second data volume is different from the first data volume.
12. The apparatus of claim 11, wherein the data information further comprises: a thread number, and the data information includes data information corresponding to the thread number;
the delay calculation submodule is specifically configured to:
selecting the maximum data receiving time corresponding to the thread number aiming at the same thread number in the same processing node, and using a second difference value between the maximum data receiving time corresponding to the thread number and the data generating time as the time delay information of the original data corresponding to the thread number;
the consistency information determining submodule is specifically configured to:
and for the same processing node, summing the second data volumes corresponding to all the thread numbers of the processing node to obtain the second data volume of the processing node, and comparing the second data volume of the processing node with the first data volume.
13. The apparatus of claim 9, wherein the data information obtaining module is further configured to:
acquiring data information of the marked data recorded by the last processing node in the data processing system;
correspondingly, the computing module further includes:
a system delay calculation submodule, configured to calculate a second difference between the data receiving time and the data generating time of the last processing node, and use the second difference as system delay information for processing the original data by the data processing system;
and the system consistency determining submodule is used for comparing the second data volume with the first data volume of the last processing node and taking a comparison result as system consistency information of the data processing system for processing the original data.
14. A data processing system, characterized in that the system comprises:
an acquisition node, configured to acquire marked data and original data, wherein the marked data is data which does not need to be processed by each processing node of the data processing system, and the original data is data which needs to be processed by each processing node;
the sending node is used for sending the marking data and the original data to a processing node;
and the processing node is used for recording the second data volume of all the received marked data, the data receiving time of each marked data and the data generating time of the marked data as data information when the original data are processed.
15. The system according to claim 14, wherein the sending node is specifically configured to:
randomly adding a preset amount of the marking data into a data queue in the data processing system, and sending the marking data to the processing node through the data queue, wherein the data queue also comprises the original data;
or
adding marked data corresponding to a third data volume of the original data to the original data, and sending the original data with the marked data added to the processing node, wherein one piece of marked data is added to each piece of original data.
16. The system of claim 14, wherein each processing node in the data processing system comprises at least two threads, the processing node being configured to:
and recording the second data volume of all the mark data received by each thread, the data receiving time of each mark data and the data generating time of the mark data as data information corresponding to the thread number of the thread.
17. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any one of claims 1 to 5 when executing a program stored in the memory.
18. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 6 to 8 when executing a program stored in the memory.
19. A data management system, characterized in that the system comprises: a data management apparatus and a data processing system,
the data management device is configured to generate marked data upon detecting that the data processing system acquires original data, add the marked data to the data processing system, and record a first data volume of the marked data, so that the data processing system sends the marked data and the original data together to each processing node of the data processing system, and each processing node processes the original data and records a data receiving time at which the marked data is received and a second data volume of the marked data received, wherein the marked data is data which does not need to be processed by each processing node, and the original data is data which needs to be processed by each processing node;
the data processing system is used for acquiring marked data and original data and sending the marked data and the original data to a processing node;
the data processing system is further configured to record, as data information, a second data volume of all the received tagged data, a data receiving time of each tagged data, and a data generating time of the tagged data when the processing node processes the original data;
the data management device is further configured to obtain data information of the marked data recorded by each processing node of the data processing system, and determine consistency information and time delay information of the original data according to the data generation time, the data reception time, the first data volume, and the second data volume.
CN201810339347.7A 2018-04-16 2018-04-16 Data management method and device and electronic equipment Active CN108763291B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810339347.7A CN108763291B (en) 2018-04-16 2018-04-16 Data management method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN108763291A CN108763291A (en) 2018-11-06
CN108763291B true CN108763291B (en) 2021-04-30

Family

ID=64010764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810339347.7A Active CN108763291B (en) 2018-04-16 2018-04-16 Data management method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN108763291B (en)

Also Published As

Publication number Publication date
CN108763291A (en) 2018-11-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant