CN116521659A - Data management method and device, electronic equipment and storage medium - Google Patents

Data management method and device, electronic equipment and storage medium

Info

Publication number
CN116521659A
Authority
CN
China
Prior art keywords
task
data
wide
execution
table processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310409873.7A
Other languages
Chinese (zh)
Inventor
张榕深
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yunxi Xinchuang Network Technology Co ltd
Original Assignee
Shanghai Yunxi Xinchuang Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yunxi Xinchuang Network Technology Co ltd filed Critical Shanghai Yunxi Xinchuang Network Technology Co ltd
Priority to CN202310409873.7A priority Critical patent/CN116521659A/en
Publication of CN116521659A publication Critical patent/CN116521659A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data management method, a data management device, electronic equipment and a storage medium. The method comprises the following steps: acquiring a data demand index, and calling at least one wide-table processing task and at least one data application task based on the data demand index; acquiring data to be treated, and executing each wide-table processing task based on the data to be treated to obtain a corresponding wide-table task execution result; and executing each data application task based on the wide-table task execution results to obtain a corresponding application task execution result. This technical scheme improves data management efficiency while reducing development difficulty, and also enables automatic management of data tasks.

Description

Data management method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data analysis technologies, and in particular, to a data management method, a data management device, an electronic device, and a storage medium.
Background
With the rapid development of big data technology, more and more enterprises are paying attention to their own data problems and are adopting data management and control measures in enterprise data management and data planning, so as to govern big data and obtain governance results, based on which the relevant users can carry out subsequent data optimization and adjustment.
In the prior art, big data is generally governed in one of two ways. The first is to process the data with a heavyweight data management tool (such as Hadoop or Spark). The second is for a developer to write a data governance script in a programming language (e.g., Java or Python) and to process the data through that script.
However, for small and medium-sized systems, the first way may incur high data management costs, because such tools rely on a plurality of cooperating components to carry out the data management process. The second way requires the relevant developers to start the corresponding data governance scripts manually, which may increase developer workload and lower data governance efficiency.
Disclosure of Invention
The invention provides a data management method, a device, electronic equipment and a storage medium, which are used for realizing the effect of improving the data management efficiency on the premise of reducing the development difficulty and realizing the effect of automatically managing data tasks.
According to an aspect of the present invention, there is provided a data governance method comprising:
Acquiring a data demand index, and calling at least one wide-table processing task and at least one data application task based on the data demand index;
acquiring data to be treated, and executing each wide-table processing task based on the data to be treated to obtain a corresponding wide-table task execution result;
based on the execution results of the wide-table tasks, executing the data application tasks to obtain corresponding application task execution results;
and the application task execution result comprises an index value corresponding to the data demand index.
According to another aspect of the present invention, there is provided a data governance device comprising:
the task calling module is used for acquiring the data demand index and calling at least one wide-table processing task and at least one data application task based on the data demand index;
the wide-table processing task execution module is used for acquiring data to be treated, and executing each wide-table processing task based on the data to be treated to obtain a corresponding wide-table task execution result;
the data application task execution module is used for executing each data application task based on each wide-table task execution result to obtain a corresponding application task execution result;
And the application task execution result comprises an index value corresponding to the data demand index.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data governance method of any embodiment of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute a data governance method according to any of the embodiments of the present invention.
According to the technical scheme, a data demand index is obtained, and at least one wide-table processing task and at least one data application task are called based on the data demand index. Data to be treated is then obtained, and each wide-table processing task is executed based on the data to be treated to obtain a corresponding wide-table task execution result. Finally, each data application task is executed based on the wide-table task execution results to obtain a corresponding application task execution result. This addresses the prior-art problems of excessive data management cost, increased developer workload, and low data management efficiency; it improves data management efficiency while reducing development difficulty, and also enables automatic management of data tasks.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a data governance method according to a first embodiment of the present invention;
FIG. 2 is a flow chart of a data governance method according to a second embodiment of the present invention;
FIG. 3 is a flow chart of a data governance method according to a second embodiment of the present invention;
FIG. 4 is a schematic diagram of a data management device according to a third embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device implementing a data management method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a data governance method according to a first embodiment of the present invention, where the method may be performed by a data governance device, which may be implemented in hardware and/or software, and the data governance device may be configured in a terminal and/or a server, where the data governance device is adapted to process data to be governed based on data requirement indicators. As shown in fig. 1, the method includes:
S110, acquiring a data demand index, and calling at least one wide-table processing task and at least one data application task based on the data demand index.
In this embodiment, the data demand index may be a data application index determined from the user's data governance requirements, that is, an index according to which a user performs statistical analysis on any data to be treated. Illustratively, the data demand indexes may include indexes corresponding to user report applications, order report applications, user label applications, order label applications, user event applications, order event applications, and the like. A report application displays the treated data in the form of a report; a label application marks the treated data; an event application statistically analyzes the data to be treated based on a preset governance event to obtain a data set corresponding to that event.
In this embodiment, a wide-table processing task may be the data processing task used to process the data to be treated into a data wide table; that is, after the data to be treated is obtained, it may be organized into the corresponding data wide table based on at least one wide-table processing task. It should be understood by those skilled in the art that a wide table is a database table with a relatively large number of fields, and generally refers to a database table in which the indexes, dimensions, and attributes related to a business topic are associated together. Because different contents are stored in the same table, a wide table improves query performance and is convenient to use. In practical application, wide tables are widely used for data preparation before training a data mining model: placing the related fields in the same table can greatly improve the efficiency of iterative computation during model training. For example, if the data demand index is a user report application index, the wide-table processing tasks may include a registered-user marking task, a credit-approved-user marking task, a successfully-disbursed-user marking task, a marking task for users who clicked "apply now", and the like.
In this embodiment, a data application task may be a task that processes the data wide table, and may also be understood as the data processing task used when data is queried from the data wide table, retrieved from it, or aggregated over it. For example, if the data demand index is a user report application index, the data application tasks may include a task for counting registered users, a task for counting credit-approved users, a task for counting successfully disbursed users, and the like.
In practical application, the data demand indexes that different users have for the data to be treated can be collected in advance, and corresponding candidate wide-table processing tasks and candidate data application tasks can be created for these indexes and stored in a database. When any data demand index is detected, at least one wide-table processing task is called from the candidate wide-table processing tasks stored in the database based on that index, and at the same time at least one data application task is called from the stored candidate data application tasks. Alternatively, the historical governance process of the data to be treated can be statistically analyzed to obtain at least one historical data demand index together with its corresponding wide-table processing tasks and data application tasks, so that when a data demand index matching any historical data demand index is detected, the wide-table processing tasks and data application tasks corresponding to that historical index can be invoked directly.
It should be noted that, for the same data demand index, the corresponding wide-table processing tasks and data application tasks may include pairs that have a one-to-one correspondence and a dependency relationship, meaning that a data application task can be executed only after its corresponding wide-table processing task has finished executing; they may also include tasks that have no such correspondence or dependency.
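The lookup described above can be pictured as a pre-built registry keyed by data demand index. The following is a minimal sketch only; the registry contents, the `TaskSet` type, and the `call_tasks` function are hypothetical names introduced for illustration and do not appear in the original disclosure, and an in-memory dictionary stands in for the database.

```python
from dataclasses import dataclass, field


@dataclass
class TaskSet:
    """Candidate tasks stored for one data demand index (hypothetical structure)."""
    wide_table_tasks: list = field(default_factory=list)
    data_application_tasks: list = field(default_factory=list)


# Hypothetical registry, standing in for the database of pre-created candidate tasks.
TASK_REGISTRY = {
    "user_report": TaskSet(
        wide_table_tasks=["mark_registered_users", "mark_credit_approved_users"],
        data_application_tasks=["count_registered_users", "count_credit_approved_users"],
    ),
}


def call_tasks(demand_index: str) -> TaskSet:
    """Return the wide-table and data application tasks stored for a demand index."""
    task_set = TASK_REGISTRY.get(demand_index)
    if task_set is None:
        raise KeyError(f"no tasks registered for demand index {demand_index!r}")
    # The method requires at least one task of each kind.
    if not task_set.wide_table_tasks or not task_set.data_application_tasks:
        raise ValueError(f"incomplete task set for demand index {demand_index!r}")
    return task_set


tasks = call_tasks("user_report")
```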
S120, obtaining data to be treated, and executing each wide-table processing task based on the data to be treated to obtain a corresponding wide-table task execution result.
In this embodiment, the data to be treated may be data that requires governance, or data stored in the systems corresponding to a plurality of service systems. By way of example, the data to be treated may include business data, user behavior data, external file data, real-time message data, and the like. In practical application, the original data of the service systems that require data governance can be obtained first; each item of original data is then cleaned and stored in a unified data layer, and the data stored in that layer can be used as the data to be treated. A wide-table task execution result may be the result obtained after the data to be treated is processed according to the corresponding wide-table processing task. For example, if the wide-table processing task is a registered-user marking task, the wide-table task execution result may be the result obtained after marking the registered users included in the data to be treated; if the wide-table processing task is a credit-approved-user marking task, the result is obtained after marking the credit-approved users included in the data to be treated; if the wide-table processing task is a successfully-disbursed-user marking task, the result is obtained after marking the successfully disbursed users included in the data to be treated.
In practical application, after at least one wide-table processing task is determined, at least one data wide table can first be constructed based on the wide-table processing tasks; specifically, each wide-table processing task is used as a field in the corresponding data wide table. After the data to be treated is obtained, the data relevant to each wide-table processing task is extracted from it, processed according to that task, and stored in the pre-constructed data wide table. The result stored in the data wide table is the wide-table task execution result of the corresponding wide-table processing task, and once every wide-table processing task has been executed, the target data wide table that satisfies the data demand index is obtained.
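As a rough illustration of "each wide-table processing task is used as a field", the sketch below builds a small wide table in memory. The input records, the field names, and the `build_wide_table` helper are assumptions made for this example; the original does not specify a data model or storage engine.

```python
# Hypothetical cleaned records standing in for the data to be treated.
raw_users = [
    {"user_id": 1, "registered": True, "credit_status": "approved", "disbursed_amount": 500},
    {"user_id": 2, "registered": True, "credit_status": "rejected", "disbursed_amount": 0},
    {"user_id": 3, "registered": False, "credit_status": None, "disbursed_amount": 0},
]

# Each wide-table processing task becomes one field (column) of the wide table.
WIDE_TABLE_TASKS = {
    "is_registered": lambda row: bool(row["registered"]),
    "is_credit_approved": lambda row: row["credit_status"] == "approved",
    "is_disbursed": lambda row: row["disbursed_amount"] > 0,
}


def build_wide_table(rows, tasks):
    """Create one wide row per user, then let each task fill in its own field."""
    wide = {row["user_id"]: {"user_id": row["user_id"]} for row in rows}
    for field_name, task in tasks.items():
        for row in rows:
            wide[row["user_id"]][field_name] = task(row)
    return list(wide.values())


wide_table = build_wide_table(raw_users, WIDE_TABLE_TASKS)
```

Here the per-field values play the role of the wide-table task execution results, and `wide_table` plays the role of the target data wide table.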
It should be noted that, there is no execution dependency between the wide-table processing tasks, and the wide-table processing tasks may be executed concurrently.
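Because the wide-table processing tasks are independent of one another, they can be submitted to a worker pool. The sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor`; the sample rows, the task functions, and the `run_concurrently` helper are hypothetical, and a thread pool is only one of several possible ways to obtain concurrency.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data to be treated.
rows = [
    {"user_id": 1, "registered": True, "credit_status": "approved"},
    {"user_id": 2, "registered": True, "credit_status": "rejected"},
]

# Independent wide-table processing tasks, keyed by the field they produce.
wide_table_tasks = {
    "is_registered": lambda row: bool(row["registered"]),
    "is_credit_approved": lambda row: row["credit_status"] == "approved",
}


def run_concurrently(tasks, rows, max_workers=4):
    """Run every wide-table task in its own worker and collect one column per task."""
    def run_one(item):
        field_name, task = item
        return field_name, [task(row) for row in rows]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results pair up with their tasks.
        return dict(pool.map(run_one, tasks.items()))


columns = run_concurrently(wide_table_tasks, rows)
```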
S130, executing each data application task based on each wide-table task execution result to obtain a corresponding application task execution result.
The application task execution result comprises an index value corresponding to the data demand index.
In this embodiment, after the wide-table task execution results are obtained, each data application task may be executed based on them to obtain a corresponding application task execution result. An application task execution result may be the result obtained after the corresponding wide-table task execution result is processed according to the corresponding data application task. For example, if the data application task is a task for counting registered users, the application task execution result may be the count of registered users; if the data application task is a task for counting credit-approved users, the result may be the count of credit-approved users; if the data application task is a task for counting successfully disbursed users, the result may be the count of successfully disbursed users. The index value may be the specific data value obtained when the corresponding data demand index is used as a field.
In practical application, after the wide-table task execution results are obtained, each data application task may be executed based on them. Specifically, for each data application task, the corresponding data may be obtained from the execution result of the corresponding wide-table task. One way of obtaining the data is to generate a data query instruction based on the data application task and issue it against the corresponding wide-table task execution result, so that the corresponding data is retrieved according to the query instruction. When all data application tasks have been executed, the application task execution result of each data application task is available, and the index value corresponding to the data demand index can be determined from these results.
For each data application task, when the current data application task is to be executed, it may first be determined whether any of the wide-table processing tasks has an execution dependency relationship with it. If so, it may be determined whether those wide-table processing tasks have finished executing, and if they have, the current data application task may be executed.
In practical application, the wide-table processing tasks and the data application tasks can be executed by means of a message queue. Each wide-table processing task and each data application task has its own execution logic, which is followed during task execution. Therefore, when a wide-table processing task or a data application task is to be executed, the corresponding message queue can be acquired, the execution logic of the task can be determined from the task attribute information stored in the message queue in advance, and the task can be executed based on that execution logic.
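One way to picture "generating a data query instruction" is a SQL count issued against the wide table. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table name `user_wide`, its columns, and the task-to-column mapping are assumptions for illustration, since the original does not name a storage engine or schema.

```python
import sqlite3

# Hypothetical wide table produced by the wide-table processing tasks.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_wide ("
    "user_id INTEGER PRIMARY KEY, is_registered INTEGER, "
    "is_credit_approved INTEGER, is_disbursed INTEGER)"
)
conn.executemany(
    "INSERT INTO user_wide VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (2, 1, 0, 0), (3, 0, 0, 0)],
)

# Each data application task maps to the wide-table field it aggregates.
APPLICATION_TASKS = {
    "registered_user_count": "is_registered",
    "credit_approved_user_count": "is_credit_approved",
    "disbursed_user_count": "is_disbursed",
}


def run_application_task(connection, flag_column):
    """Generate and run the query instruction for one data application task."""
    # Column names come from the fixed mapping above, never from user input.
    query = f"SELECT COUNT(*) FROM user_wide WHERE {flag_column} = 1"
    return connection.execute(query).fetchone()[0]


index_values = {
    name: run_application_task(conn, column)
    for name, column in APPLICATION_TASKS.items()
}
```

In this sketch, `index_values` corresponds to the index values that make up the application task execution results.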
Optionally, the execution process of the wide-table processing tasks and the data application tasks includes: acquiring a task queue to be executed; and for any task to be executed in the task queue to be executed, executing the current task to be executed based on the execution logic corresponding to the current task to be executed.
In this embodiment, the task queue to be executed may be a message queue storing tasks to be executed. The tasks to be executed comprise wide-table processing tasks and data application tasks. The task queue to be executed may be middleware for storing tasks to be executed during task execution. The execution logic may be the task execution mode of the corresponding task to be executed. Optionally, the execution logic may include parallel execution of the wide-table processing tasks, and serial execution of a data application task after the wide-table processing tasks on which it depends, so that the wide-table processing tasks are executed earlier than the data application task.
In practical applications, the execution process of a wide-table processing task or a data application task may include the following steps. First, a task queue to be executed is obtained, which comprises wide-table processing tasks and data application tasks. Then, for any task to be executed in the queue, the task attribute information corresponding to the current task to be executed is determined from the queue, the execution logic corresponding to the current task is determined from that attribute information, and the current task is executed based on the execution logic.
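The attribute-driven dispatch in these steps might look like the following minimal sketch. The attribute keys (`type`, `depends_on`), the logic labels, and the `execution_logic_for` function are hypothetical; the original only states that execution logic is derived from task attribute information stored with the queue.

```python
from collections import deque


def execution_logic_for(attributes):
    """Map task attribute information to an execution logic label."""
    if attributes["type"] == "wide_table":
        # Wide-table processing tasks have no mutual dependencies.
        return "parallel"
    if attributes["type"] == "data_application":
        # Data application tasks run after the wide-table tasks they depend on.
        return "serial_after_dependencies"
    raise ValueError(f"unknown task type: {attributes['type']!r}")


# Hypothetical task queue to be executed; each message carries its attributes.
pending = deque([
    {"name": "mark_registered_users",
     "attributes": {"type": "wide_table", "depends_on": []}},
    {"name": "count_registered_users",
     "attributes": {"type": "data_application",
                    "depends_on": ["mark_registered_users"]}},
])

plan = [(msg["name"], execution_logic_for(msg["attributes"])) for msg in pending]
```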
According to the technical scheme, a data demand index is obtained, and at least one wide-table processing task and at least one data application task are called based on the data demand index. Data to be treated is then obtained, and each wide-table processing task is executed based on the data to be treated to obtain a corresponding wide-table task execution result. Finally, each data application task is executed based on the wide-table task execution results to obtain a corresponding application task execution result. This addresses the prior-art problems of excessive data management cost, increased developer workload, and low data management efficiency; it improves data management efficiency while reducing development difficulty, and also enables automatic management of data tasks.
Example two
Fig. 2 is a flowchart of a data management method according to a second embodiment of the present invention. On the basis of the foregoing embodiment, a task queue to be executed may be created in order to execute the wide-table processing tasks and the data application tasks. The specific implementation is described in the technical scheme of this embodiment. Technical terms identical or corresponding to those of the above embodiment are not explained again here.
As shown in fig. 2, the method includes:
S210, acquiring a data demand index, and calling at least one wide-table processing task and at least one data application task based on the data demand index.
S220, constructing a task queue to be executed based on at least one wide-table processing task and at least one data application task.
In practical application, after at least one wide-table processing task and at least one data application task are obtained, each wide-table processing task and corresponding task attribute information can be stored in a message queue, and meanwhile, each data application task and corresponding task attribute information are stored in the message queue, so that a task queue to be executed can be obtained.
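A minimal sketch of this construction step is shown below, using Python's standard `queue.Queue` in place of real message-queue middleware. The message layout, the attribute names, and the `build_pending_queue` helper are assumptions for illustration only.

```python
import queue


def build_pending_queue(wide_table_tasks, application_tasks):
    """Store every task together with its task attribute information."""
    pending = queue.Queue()
    for name in wide_table_tasks:
        pending.put({
            "name": name,
            "attributes": {"type": "wide_table", "depends_on": [],
                           "state": "not_executed"},
        })
    # application_tasks maps each data application task to the
    # wide-table processing tasks it depends on.
    for name, depends_on in application_tasks.items():
        pending.put({
            "name": name,
            "attributes": {"type": "data_application",
                           "depends_on": list(depends_on),
                           "state": "not_executed"},
        })
    return pending


pending = build_pending_queue(
    ["mark_registered_users", "mark_credit_approved_users"],
    {"count_registered_users": ["mark_registered_users"]},
)
queued_count = pending.qsize()
first_message = pending.get_nowait()
```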
It should be noted that constructing the task queue to be executed has the following advantage: automatic task distribution and management can be realized through the task distribution mechanism of the message queue.
S230, for any task to be executed in the task queue to be executed, determining corresponding execution logic based on whether the current task to be executed has execution dependency attributes, so as to execute the current task to be executed based on the execution logic, and obtaining a corresponding task execution result.
In this embodiment, the execution dependency attribute may be information indicating whether an execution dependency relationship exists between tasks. Optionally, the execution dependency attribute is that a data application task depends on at least one wide-table processing task, that is, whether the data application task can start executing depends on whether the associated wide-table processing tasks have finished executing. The execution logic may be the task execution mode of the corresponding task to be executed. Optionally, the execution logic may include parallel execution of the wide-table processing tasks, and serial execution of a data application task after the wide-table processing tasks on which it depends, so that the wide-table processing tasks are executed earlier than the data application task.
In practical application, after the task queue to be executed is obtained, for any task to be executed in the queue, the task attribute information corresponding to the current task to be executed can be obtained from the queue. Based on this attribute information, it can be determined whether the current task has an execution dependency relationship with other tasks to be executed, that is, whether the current task has an execution dependency attribute, and the corresponding execution logic can then be determined according to the result of this judgment.
In practical applications, the execution logic corresponding to the wide-table processing task and the execution logic corresponding to the data application task are different, so that when the task to be executed is executed based on the execution logic, the task execution process of the wide-table processing task is also different from the task execution process of the data application task, and the execution process of the wide-table processing task and the execution process of the data application can be respectively specifically described below.
Optionally, when the task to be executed is a wide-table processing task, the corresponding execution logic is to execute the wide-table processing task in parallel, and correspondingly, based on the execution logic, the current task to be executed is executed, including: updating the task execution state of the wide-table processing task, and executing the wide-table processing task based on the data to be treated; when the successful execution of the wide-table processing task is detected, the task execution state of the wide-table processing task is updated, task completion information is sent, and the wide-table processing task is placed in a pre-constructed task completion queue.
The task execution state may be an attribute that characterizes the current execution condition of the task. The task execution state may include task not executed, task executing, task execution ended, and the like. The task completion information may be information indicating that the task has been executed successfully. The task completion queue may be a message queue that stores tasks whose execution has completed.
In practical application, the execution logic of the wide-table processing tasks is parallel execution, so that multiple wide-table processing tasks can be executed simultaneously. For any wide-table processing task, the task execution state of the current wide-table processing task is first updated to "task executing". The corresponding data is then obtained from the data to be treated and processed according to the current wide-table processing task, and the resulting wide-table task execution result is stored in the pre-built data wide table. When successful execution of the current wide-table processing task is detected, its task execution state is updated to "task execution ended", task completion information is sent to the corresponding terminal in a preset information sending mode, and the current wide-table processing task is placed in the pre-built task completion queue.
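The life cycle of a single wide-table processing task described here (state update, processing, state update, notification, completion queue) can be sketched as follows. The task dictionary layout, the state labels, and the `notify` callback are hypothetical stand-ins; in particular, `notify` replaces whatever preset information sending mode a real system would use.

```python
from collections import deque


def execute_wide_table_task(task, rows, wide_table, completed_queue, notify):
    """Run one wide-table task and record its progress along the way."""
    task["state"] = "executing"
    for row in rows:
        # Store the task's result in its own field of the wide table.
        wide_table.setdefault(row["user_id"], {})[task["field"]] = task["func"](row)
    # Reached only if processing raised no exception, i.e. on success.
    task["state"] = "finished"
    notify(task["name"] + " completed")
    completed_queue.append(task["name"])


rows = [{"user_id": 1, "registered": True}, {"user_id": 2, "registered": False}]
task = {
    "name": "mark_registered_users",
    "field": "is_registered",
    "func": lambda row: bool(row["registered"]),
    "state": "not_executed",
}
wide_table, completed_queue, messages = {}, deque(), []
execute_wide_table_task(task, rows, wide_table, completed_queue, messages.append)
```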
Optionally, when the task to be executed is a data application task, the execution logic is to execute the data application task serially with the wide-table processing task it depends on. Correspondingly, executing the current task to be executed based on the execution logic includes: determining whether execution of the wide-table processing task on which the data application task depends has completed; if yes, updating the task execution state of the data application task, and executing the data application task based on the wide-table task execution result of the corresponding wide-table processing task; if not, returning the data application task to the task queue to be executed for re-queuing, and re-executing the data application task when completion of the dependent wide-table processing task is monitored.
In practical application, a data application task has an execution dependency attribute, so its execution logic is to execute the data application task serially with the wide-table processing task it depends on. For each data application task, it is first determined whether execution of the wide-table processing task on which the current data application task depends has completed. If yes, the current data application task can be executed: its task execution state is updated to task executing, and the corresponding data is acquired from the wide-table task execution result of the dependent wide-table processing task to obtain the application task execution result corresponding to the current data application task. When successful execution of the current data application task is detected, its task execution state is updated to task execution ended, task completion information is sent to the corresponding terminal in a preset information sending mode, and the current data application task is updated into the pre-constructed task completion queue. Conversely, if the wide-table processing task on which the current data application task depends has not completed, the current data application task is returned to the task queue to be executed for re-queuing. When completion of that wide-table processing task is monitored, the current data application task is taken out of the task queue to be executed and executed again.
It should be noted that each task to be executed in the task queue to be executed may monitor the task completion queue. If a data application task detects that execution of the wide-table processing task it depends on has completed, the data application task may be executed immediately.
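The dependency check and re-queuing behaviour can be sketched with a simple single-threaded loop. This is a simplification: the `completed` set stands in for monitoring the task completion queue, and the task names, the `deps` field, and the row counts are hypothetical.

```python
from collections import deque

completed = set()   # names of tasks seen in the task completion queue
wide_results = {}   # wide-table task execution results
app_results = {}    # application task execution results
log = []


def run_wide(name):
    wide_results[name] = {"rows": 3}
    completed.add(name)
    log.append(f"ran {name}")


def run_app(name, deps):
    # A data application task reads the results of the wide-table tasks it depends on.
    app_results[name] = sum(wide_results[d]["rows"] for d in deps)
    completed.add(name)
    log.append(f"ran {name}")


# The data application task is queued first on purpose, to force one re-queue.
pending = deque([
    {"name": "daily_report", "deps": ["wide_orders"]},
    {"name": "wide_orders", "deps": []},
])

# Note: this loop never ends if a dependency is missing from the queue altogether.
while pending:
    task = pending.popleft()
    if all(d in completed for d in task["deps"]):
        if task["deps"]:
            run_app(task["name"], task["deps"])
        else:
            run_wide(task["name"])
    else:
        log.append(f"requeued {task['name']}")
        pending.append(task)  # return to the task queue to be executed
```

After the loop, `log` shows the report being re-queued once, then the wide-table task running, then the report running.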
In practical application, potential safety hazards may arise if some tasks are not executed within a preset time. Therefore, if any task to be executed is detected as not having been executed within the preset time, a delay alarm judgment can be performed to determine, based on the judgment result, whether to send alarm information.
Based on the above, the above technical solution further includes: if execution of the wide-table processing task on which the data application task depends has not completed, determining whether to raise a delay alarm based on the task attribute of the data application task; if yes, sending alarm information.
In this embodiment, the task attributes may characterize various pieces of configuration information associated with the task. The alarm information may be information indicating that the corresponding task was not completed within a preset time.
In practical application, for each data application task, if execution of the wide-table processing task on which the current data application task depends has not completed, the task attribute corresponding to the current data application task is determined, and whether a delay alarm is required is decided based on that task attribute. If yes, alarm information is sent to the corresponding terminal device in a preset alarm mode. If not, the current data application task is returned to the task queue to be executed, so that it is re-executed when completion of the wide-table processing task on which it depends is monitored.
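A minimal sketch of the delay alarm judgment follows. The source only says the decision is based on the task attribute, so the attribute names `alarm_enabled`, `enqueued_at`, and `max_wait_seconds` are assumptions made for illustration, as is the alarm text.

```python
def should_alarm(task, now):
    """Return True if the task attributes request an alarm and the wait limit is exceeded."""
    if not task.get("alarm_enabled", False):
        return False
    waited = now - task["enqueued_at"]
    return waited > task["max_wait_seconds"]


def handle_blocked_task(task, pending, alarms, now):
    """Handle a data application task whose wide-table dependency has not completed."""
    if should_alarm(task, now):
        # Stand-in for sending alarm information to a terminal device.
        alarms.append(
            f"task {task['name']} not completed within {task['max_wait_seconds']}s"
        )
    else:
        pending.append(task)  # return to the task queue to be executed


alarms, pending = [], []
late = {"name": "report", "alarm_enabled": True, "enqueued_at": 0, "max_wait_seconds": 60}
quiet = {"name": "export", "alarm_enabled": False, "enqueued_at": 0, "max_wait_seconds": 60}
handle_blocked_task(late, pending, alarms, now=120)
handle_blocked_task(quiet, pending, alarms, now=120)
```

Passing `now` explicitly keeps the judgment deterministic and easy to test; a scheduler would typically supply `time.time()`.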
Optionally, the execution of the task to be executed supports multiple programming languages.
It should be noted that execution of the wide-table processing tasks and of the data application tasks may be implemented in a plurality of programming languages, for example at least one of the Java, Python, and SQL programming languages.
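One way to support several languages is to dispatch each task to a language-specific runner. The sketch below is an assumption about how this could look, not the mechanism in the application: the task layout (`language`, `body`), the `RUNNERS` table, and the in-memory SQLite database are hypothetical, and a Java runner (for example via a subprocess) is left out.

```python
import sqlite3


def run_python(task, conn):
    # A Python task body is a callable that receives the database connection.
    return task["body"](conn)


def run_sql(task, conn):
    # A SQL task body is a query string.
    return conn.execute(task["body"]).fetchall()


RUNNERS = {"python": run_python, "sql": run_sql}


def execute(task, conn):
    runner = RUNNERS.get(task["language"])
    if runner is None:
        raise ValueError(f"unsupported language: {task['language']}")
    return runner(task, conn)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (buyer TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [("a", 10), ("b", 5), ("a", 7)])

sql_task = {
    "language": "sql",
    "body": "SELECT buyer, SUM(amount) FROM orders GROUP BY buyer ORDER BY buyer",
}
py_task = {
    "language": "python",
    "body": lambda c: c.execute("SELECT COUNT(*) FROM orders").fetchone()[0],
}

print(execute(sql_task, conn))
print(execute(py_task, conn))
```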
By way of example, the execution of the wide-table processing tasks and the data application tasks may be described in conjunction with the flowchart shown in FIG. 3:
1. acquiring a data demand index;
2. determining the wide-table processing tasks and the data application tasks;
3. creating a task queue to be executed;
4. performing task routing on each task to be executed in the task queue to be executed;
5. for each task to be executed, determining whether the tasks on which the current task to be executed depends have been executed (note that a wide-table processing task depends on no other task, so its dependencies are treated as executed by default); if yes, executing step 6, otherwise executing step 11;
6. updating the task execution state;
7. executing the current task to be executed based on the corresponding task execution logic;
8. determining whether the current task to be executed is executed successfully; if yes, executing step 9, and if not, executing the step;
9. updating the task execution state;
10. sending task completion information, and updating the current task to be executed into the task completion queue;
11. determining whether a delay alarm is needed; if yes, executing step 12, and if not, executing step 13;
12. alarming;
13. returning the current task to be executed to the task queue to be executed.
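The per-task branch of this flow (steps 5 to 13) can be sketched as a function that returns the step numbers visited in one pass. The function name `route` and its callback parameters `alarm_due` and `run` are hypothetical. The text does not say which step follows a failed execution in step 8, so the sketch simply stops there.

```python
def route(task, completed, alarm_due, run):
    """Return the flow step numbers that one pass over `task` visits."""
    steps = [5]
    # A wide-table task has no dependencies, so all() over an empty list is True.
    if not all(d in completed for d in task["deps"]):
        steps.append(11)
        steps.append(12 if alarm_due(task) else 13)
        return steps
    steps += [6, 7, 8]
    if run(task):
        steps += [9, 10]
        completed.add(task["name"])
    # Assumption: on failure nothing further happens, since the source leaves it open.
    return steps


done = set()
wide = {"name": "wide_orders", "deps": []}
app = {"name": "report", "deps": ["wide_orders"]}

print(route(app, done, lambda t: False, lambda t: True))   # blocked, no alarm
print(route(wide, done, lambda t: False, lambda t: True))  # runs to completion
print(route(app, done, lambda t: False, lambda t: True))   # dependency now done
```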
According to the technical solution of this embodiment, a data demand index is acquired, and at least one wide-table processing task and at least one data application task are called based on the data demand index. Data to be treated is then acquired, and each wide-table processing task is executed based on the data to be treated to obtain a corresponding wide-table task execution result. Finally, each data application task is executed based on the wide-table task execution results to obtain a corresponding application task execution result. This solves the problems in the prior art that the data management cost is too high, the workload of developers is increased, and the data management efficiency is low. It improves data management efficiency while reducing development difficulty, and it also achieves automatic management of data tasks.
Example III
Fig. 4 is a schematic structural diagram of a data management device according to a third embodiment of the present invention. As shown in fig. 4, the apparatus includes: a task calling module 310, a wide-table processing task execution module 320, and a data application task execution module 330.
The task calling module 310 is configured to obtain a data requirement index, and call at least one wide-table processing task and at least one data application task based on the data requirement index;
The wide-table processing task execution module 320 is configured to obtain data to be treated, and execute each of the wide-table processing tasks based on the data to be treated, so as to obtain a corresponding wide-table task execution result;
The data application task execution module 330 is configured to execute each data application task based on each of the wide-table task execution results, to obtain a corresponding application task execution result;
wherein the application task execution result comprises an index value corresponding to the data demand index.
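The three modules can be sketched as three small classes chained together. This is only an illustration under stated assumptions: the `REGISTRY` mapping from a demand index to task names, the sample users and orders, and the join and sum logic are all hypothetical.

```python
class TaskCallModule:
    # Hypothetical registry: demand index -> (wide-table tasks, data application tasks).
    REGISTRY = {"total_sales": (["wide_orders"], ["sum_sales"])}

    def call(self, demand_index):
        return self.REGISTRY[demand_index]


class WideTableTaskExecutionModule:
    def execute(self, wide_tasks, data):
        # Build one wide row per order by attaching the buyer's region.
        regions = {u["id"]: u["region"] for u in data["users"]}
        rows = [{**o, "region": regions[o["user_id"]]} for o in data["orders"]]
        return {name: rows for name in wide_tasks}


class DataApplicationTaskExecutionModule:
    def execute(self, app_tasks, wide_results):
        rows = wide_results["wide_orders"]
        # The result holds the index value for the demand index.
        return {name: sum(r["amount"] for r in rows) for name in app_tasks}


data_to_be_treated = {
    "users": [{"id": 1, "region": "east"}, {"id": 2, "region": "west"}],
    "orders": [
        {"user_id": 1, "amount": 10},
        {"user_id": 2, "amount": 5},
        {"user_id": 1, "amount": 7},
    ],
}

wide_tasks, app_tasks = TaskCallModule().call("total_sales")
wide_results = WideTableTaskExecutionModule().execute(wide_tasks, data_to_be_treated)
app_results = DataApplicationTaskExecutionModule().execute(app_tasks, wide_results)
print(app_results)
```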
According to the technical solution of this embodiment, a data demand index is acquired, and at least one wide-table processing task and at least one data application task are called based on the data demand index. Data to be treated is then acquired, and each wide-table processing task is executed based on the data to be treated to obtain a corresponding wide-table task execution result. Finally, each data application task is executed based on the wide-table task execution results to obtain a corresponding application task execution result. This solves the problems in the prior art that the data management cost is too high, the workload of developers is increased, and the data management efficiency is low. It improves data management efficiency while reducing development difficulty, and it also achieves automatic management of data tasks.
Optionally, the apparatus further includes: a task queue construction module.
The task queue construction module is used for constructing a task queue to be executed based on the at least one wide-table processing task and the at least one data application task after the at least one wide-table processing task and the at least one data application task are called;
correspondingly, the wide-table processing task execution module 320 and the data application task execution module 330 are specifically configured to determine, for any task to be executed in the task queue to be executed, corresponding execution logic based on whether the current task to be executed has an execution dependency attribute, so as to execute the current task to be executed based on the execution logic and obtain a corresponding task execution result;
the tasks to be executed comprise a wide-table processing task and a data application task; the task execution results comprise a wide-table task execution result and an application task execution result.
Optionally, the execution dependency attribute is that the data application task depends on at least one wide-table processing task; the execution logic comprises parallel execution of the wide-table processing tasks and serial execution of the data application tasks and the dependent wide-table processing tasks, wherein the execution time sequence of the wide-table processing tasks is earlier than the execution time sequence of the data application tasks.
Optionally, the task to be executed includes a wide-table processing task, the execution logic includes parallel execution of the wide-table processing tasks, and correspondingly the module that executes the task to be executed includes: a wide-table processing task execution unit and a task completion information sending unit.
The wide-table processing task execution unit is used for updating the task execution state of the wide-table processing task and executing the wide-table processing task based on the data to be treated;
and the task completion information sending unit is used for updating the task execution state of the wide-table processing task when the successful execution of the wide-table processing task is detected, sending the task completion information and updating the wide-table processing task into a pre-constructed task completion queue.
Optionally, the task to be executed includes a data application task, the execution logic is to execute the data application task serially with the wide-table processing task it depends on, and correspondingly the module that executes the task to be executed includes: a task execution completion determining unit, a data application task execution unit, and a data application task returning unit.
The task execution completion determining unit is used for determining whether the wide-table processing task depending on the data application task is executed and completed;
The data application task execution unit is used for, if so, updating the task execution state of the data application task, executing the data application task based on the wide-table task execution result of the corresponding wide-table processing task, and, when successful execution of the data application task is detected, updating the task execution state, sending task completion information, and updating the data application task into a pre-constructed task completion queue;
and the data application task returning unit is used for, if not, returning the data application task to the task queue to be executed for re-queuing, and re-executing the data application task when completion of the dependent wide-table processing task is monitored.
Optionally, the apparatus further includes: a delay alarm determining module and an alarm information sending module.
The delay alarm determining module is used for determining whether to raise a delay alarm based on the task attribute of the data application task if execution of the wide-table processing task on which the data application task depends has not completed;
and the alarm information sending module is used for sending alarm information if yes.
Optionally, the execution process of the task to be executed supports multiple programming languages.
The data management device provided by the embodiment of the invention can execute the data management method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the invention described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the data governance method.
In some embodiments, the data governance method may be implemented as a computer program tangibly embodied on a computer readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more of the steps of the data governance method described above may be carried out. Alternatively, in other embodiments, processor 11 may be configured to perform the data governance method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer-readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer-readable storage medium may be a machine-readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A data governance method, comprising:
acquiring a data demand index, and calling at least one wide-table processing task and at least one data application task based on the data demand index;
acquiring data to be treated, and executing each wide-table processing task based on the data to be treated to obtain a corresponding wide-table task execution result;
based on the execution results of the wide-table tasks, executing the data application tasks to obtain corresponding application task execution results;
wherein the application task execution result comprises an index value corresponding to the data demand index.
2. The method of claim 1, further comprising, after said invoking at least one wide-table processing task and at least one data application task:
constructing a task queue to be executed based on the at least one wide-table processing task and the at least one data application task;
correspondingly, the execution process of the wide-table processing task and the data application task comprises the following steps:
for any task to be executed in the task queue to be executed, determining corresponding execution logic based on whether the current task to be executed has execution dependency attributes, so as to execute the current task to be executed based on the execution logic, and obtaining a corresponding task execution result; the tasks to be executed comprise a wide-table processing task and a data application task; the task execution results comprise a wide-table task execution result and an application task execution result.
3. The method of claim 2, wherein the execution dependency attribute is that the data application task depends on at least one wide-table processing task; the execution logic comprises parallel execution of the wide-table processing tasks and serial execution of the data application tasks and the dependent wide-table processing tasks, wherein the execution time sequence of the wide-table processing tasks is earlier than the execution time sequence of the data application tasks.
4. The method of claim 2, wherein the task to be performed comprises a wide-table processing task, the execution logic comprises parallel execution of the wide-table processing task, and the executing the current task to be performed based on the execution logic comprises:
updating the task execution state of the wide-table processing task, and executing the wide-table processing task based on the data to be treated;
when the successful execution of the wide-table processing task is detected, updating the task execution state corresponding to the wide-table processing task, sending task completion information, and updating the wide-table processing task into a pre-constructed task completion queue.
5. The method of claim 2, wherein the task to be performed comprises a data application task, the execution logic is configured to execute the data application task serially with the dependent wide-table processing task, and the executing the current task to be performed based on the execution logic comprises:
determining whether the wide-table processing task depending on the data application task is executed to be completed or not;
if yes, updating the task execution state of the data application task, executing the data application task based on a wide-table task execution result of a corresponding wide-table processing task, updating the task execution state when successful execution of the data application task is detected, sending task completion information, and updating the data application task to a pre-built task completion queue;
if not, returning the data application task to the task queue to be executed for re-queuing, and re-executing the data application task when completion of the dependent wide-table processing task is monitored.
6. The method as recited in claim 5, further comprising:
if execution of the wide-table processing task on which the data application task depends has not completed, determining whether to raise a delay alarm based on the task attribute of the data application task;
if yes, sending alarm information.
7. The method of claim 2, wherein the execution of the task to be performed supports multiple programming languages.
8. A data governance device, comprising:
the task calling module is used for acquiring the data demand index and calling at least one wide-table processing task and at least one data application task based on the data demand index;
the wide-table processing task execution module is used for acquiring data to be treated, and executing each wide-table processing task based on the data to be treated to obtain a corresponding wide-table task execution result;
the data application task execution module is used for executing each data application task based on each wide-table task execution result to obtain a corresponding application task execution result;
wherein the application task execution result comprises an index value corresponding to the data demand index.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data governance method of any of claims 1 to 7.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the data governance method of any of claims 1 to 7.
CN202310409873.7A 2023-04-17 2023-04-17 Data management method and device, electronic equipment and storage medium Pending CN116521659A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310409873.7A CN116521659A (en) 2023-04-17 2023-04-17 Data management method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310409873.7A CN116521659A (en) 2023-04-17 2023-04-17 Data management method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116521659A true CN116521659A (en) 2023-08-01

Family

ID=87396789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310409873.7A Pending CN116521659A (en) 2023-04-17 2023-04-17 Data management method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116521659A (en)

Similar Documents

Publication Publication Date Title
US20230020324A1 (en) Task Processing Method and Device, and Electronic Device
CN115525411A (en) Method, device, electronic equipment and computer readable medium for processing service request
CN112948081B (en) Method, device, equipment and storage medium for processing tasks in delayed mode
CN113656239A (en) Monitoring method and device for middleware and computer program product
CN116661960A (en) Batch task processing method, device, equipment and storage medium
CN116126719A (en) Interface testing method and device, electronic equipment and storage medium
CN116521659A (en) Data management method and device, electronic equipment and storage medium
CN115509714A (en) Task processing method and device, electronic equipment and storage medium
EP3832985B1 (en) Method and apparatus for processing local hot spot, electronic device and storage medium
CN114356713A (en) Thread pool monitoring method and device, electronic equipment and storage medium
CN114238069A (en) Web application firewall testing method and device, electronic equipment, medium and product
CN113722141A (en) Method and device for determining delay reason of data task, electronic equipment and medium
CN114168329A (en) Distributed batch optimization method, electronic device and computer-readable storage medium
CN113760568A (en) Data processing method and device
CN112925623A (en) Task processing method and device, electronic equipment and medium
CN114924806B (en) Dynamic synchronization method, device, equipment and medium for configuration information
CN113779098B (en) Data processing method, device, electronic equipment and storage medium
CN114816928A (en) Method, device and system for monitoring business data, electronic equipment and storage medium
CN117614998A (en) Current limiting method and device for micro-service system, electronic equipment and storage medium
CN117081939A (en) Traffic data processing method, device, equipment and storage medium
CN116795260A (en) Full selection method, device, equipment and medium based on elementUI
CN117272151A (en) Data processing method, device, equipment and storage medium
CN115983222A (en) EasyExcel-based file data reading method, device, equipment and medium
CN116071112A (en) Advertisement putting service processing method and device, electronic equipment and storage medium
CN117971872A (en) Database query method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 518000, Zone 2601A, China Energy Storage Building, No. 3099 Community Keyuan South Road, Yuehai Street, Nanshan District, Shenzhen, Guangdong Province

Applicant after: Shenzhen Yunxi Xinchuang Network Technology Co.,Ltd.

Address before: Room 5057, 5th Floor, No. 6, Lane 600, Yunling West Road, Putuo District, Shanghai, 200333

Applicant before: Shanghai Yunxi Xinchuang Network Technology Co.,Ltd.

Country or region before: China