CN112597233A - Batch processing method, device and equipment of data indexes and storage medium

Info

Publication number
CN112597233A
Authority
CN
China
Prior art keywords
data
processing
index
task
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011593546.4A
Other languages
Chinese (zh)
Other versions
CN112597233B (en
Inventor
张广智
梁海涛
邢远辉
邵骋
余祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd
Priority to CN202011593546.4A
Publication of CN112597233A
Application granted
Publication of CN112597233B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Quality & Reliability (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Tourism & Hospitality (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a batch processing method, device, equipment and storage medium for data indexes, and belongs to the field of data standardization. The method comprises the following steps: scanning an index processing task based on configured task parameters, the task parameters comprising a scanning mode, a trigger condition and a calling interface; when an index processing task meeting the trigger condition is scanned, calling the index processing task through the calling interface; and executing the index processing task based on a configured index processing flow to obtain a data index meeting the index requirement, the index processing flow comprising data source information and data processing logic information of the index processing task. According to the invention, the data processing logic is embedded into the nodes of the index processing flow in the form of configuration files, so that it is easy to update and replace, the time needed to change the logic is greatly shortened, and development and release efficiency is improved.

Description

Batch processing method, device and equipment of data indexes and storage medium
Technical Field
The invention relates to the field of data standardization, and in particular to a batch processing method, device, equipment and storage medium for data indexes.
Background
A data index is index information that reflects the business operation of an enterprise and is extracted by analyzing and summarizing the ODS (operational data store) data generated while the enterprise operates. Processing a data index generally comprises cleaning and warehousing the ODS data, refining the data index logic, and technically implementing the index logic. By analyzing data through data indexes, an enterprise can understand its business operation more clearly, make decisions faster and better, reduce decision risk, and seize market opportunities more easily.
Enterprise data index processing platforms in the industry generally adopt big data platform technology. The technologies in such a system are interrelated and interdependent, the overall processing chain is long, and the technology is difficult to export. Before the index processing logic is finalized, it must be debugged, refined and verified many times to confirm that the index is accurate and valid. During debugging, changing the index processing logic is a lengthy process, which makes development and debugging difficult. Moreover, the index processing logic differs between enterprise organizations, support for multiple organizations is weak, and supporting multiple organizations requires many changes.
Disclosure of Invention
The technical problem to be solved by the invention is that changing index processing logic takes a long time in the prior art; to this end, the invention provides a batch processing method, device, equipment and storage medium for data indexes.
The invention solves the technical problems through the following technical scheme:
a batch processing method of data indexes comprises the following steps:
scanning an index processing task based on the configured task parameters; the task parameters comprise trigger conditions and calling interfaces; when the index processing task meeting the trigger condition is scanned, calling the index processing task through the calling interface;
executing the index processing task based on the configured index processing flow to obtain a data index meeting the index requirement; and the nodes of the index processing flow comprise data source information and data processing logic information of the index processing task.
In this technical scheme, batch processing of data indexes is realized by combining a scheduled task with an index processing flow in which the processing logic of the index processing task is configured.
Preferably, the step of executing the index processing task based on the configured index processing flow includes:
extracting data to be processed from a corresponding database according to the data source information;
and processing the data to be processed according to the data processing logic information to obtain a data index meeting the index requirement.
In the technical scheme, the index processing task supports multiple databases, and can simultaneously process data indexes of data in the multiple databases.
Preferably, the step of extracting the data to be processed from the corresponding database according to the data source information includes:
intercepting data source information from the index processing flow, wherein the data source information comprises at least one database;
connecting to the corresponding databases through an abstract connection based on the data source information;
and extracting data to be processed from the connected databases and temporarily storing the data in a resource library, wherein the data to be processed carries a data flow identifier corresponding to the database.
In this technical scheme, the index processing task supports multiple databases, and data is isolated by means of the data flow identifier, which ensures data security.
Preferably, the data processing logic information at least includes a set of data processing rules, the data processing rules are configured in a property file, and the data processing rules are in one-to-one correspondence with the property file;
each database is at least provided with a set of data processing rules;
the step of processing the extracted data to be processed comprises the following steps:
intercepting the data processing logic information from the index processing flow, wherein the data processing logic information comprises a file name of at least one property file;
loading a corresponding property file into the resource library based on the data processing logic information;
and processing the data to be processed in the resource library according to the data processing rule configured in the property file.
In this technical scheme, the data processing rules are configured in and loaded from property files, which is simple and convenient; the index processing task can be adapted to various industry scenarios, the index processing rules corresponding to each scenario can be embedded into the processing nodes of the index processing flow, and the range of application is wide.
Preferably, before the step of intercepting the data processing logic information from the index processing flow, the method further includes:
traversing the property file;
and when the property file is updated, correspondingly updating the data processing logic information in the index processing flow.
In this technical scheme, the data processing rules are configured in property files and old logic is replaced by loading property files, which improves development efficiency and release efficiency.
Preferably, the data index is obtained by processing the corresponding data to be processed, and the data index carries the data flow identifier of the corresponding data to be processed;
after the step of obtaining the data index meeting the index requirement, the method further comprises the following steps:
and storing the data index into the database corresponding to the data flow identifier carried by the data index.
In this technical scheme, the data flow identifier ensures that processed data indexes are stored in the corresponding database, which prevents data leakage.
Preferably, the data processing rule comprises a data date parameter and a data processing logic;
the step of processing the data to be processed in the resource library according to the data processing rule configured in the property file includes:
intercepting data date parameters in the data processing rules;
according to the data date parameter, finding, from the data to be processed, the data generated on the date represented by the data date parameter;
and processing the found data according to the data processing logic.
In the technical scheme, data on any date can be processed through the data date parameter in the data processing rule, so that the data on the date can be processed into index data meeting the index requirement.
The invention also discloses a batch processing device of the data indexes, which comprises:
the task management module is used for scanning the index processing task based on the configured task parameters; the task parameters comprise a scanning mode, a triggering condition and a calling interface; when the index processing task meeting the trigger condition is scanned, calling the index processing task through the calling interface;
the data processing module is used for executing the index processing task based on the configured index processing flow so as to obtain a data index meeting the index requirement; and the nodes of the index processing flow comprise data source information and data processing logic information of the index processing task.
The invention also discloses computer equipment which comprises a memory and a processor, wherein the memory is stored with a computer program, and the computer program is executed by the processor to realize the steps of the batch processing method of the data indexes in any technical scheme.
The invention also discloses a computer readable storage medium, in which a computer program is stored, and the computer program can be executed by at least one processor to implement the steps of the batch processing method for data indexes in any one of the above technical solutions.
The positive effects of the invention are as follows: data indexes are processed as a scheduled task and are therefore executed regularly; the data processing logic is embedded in the nodes of the index processing flow in the form of configuration files, so that it is easy to update and replace, the time needed to change the logic is greatly shortened, and development and release efficiency is improved; and multiple data sources are configured in the nodes, so that the data of different organizations is isolated and data security is improved.
Drawings
FIG. 1 is a flow chart of a first embodiment of a batch processing method for data indicators according to the present invention;
FIG. 2 shows a detailed flowchart of step 2 in the first embodiment;
FIG. 3 shows a detailed flowchart of step 21 in the first embodiment;
FIG. 4 is a flowchart showing the detailed procedure of step 22 in the first embodiment;
FIG. 5 shows a detailed flowchart of step 225 in the first embodiment;
FIG. 6 is a block diagram illustrating a first embodiment of a batch processing apparatus for data indexes in accordance with the present invention;
FIG. 7 shows a hardware architecture diagram of an embodiment of the computer equipment of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Firstly, the invention provides a batch processing method of data indexes.
In an embodiment, as shown in fig. 1, the batch processing method of the data index includes the following steps:
Step 1: scanning an index processing task based on the configured task parameters; the task parameters comprise trigger conditions and calling interfaces; when an index processing task meeting the trigger condition is scanned, calling the index processing task through the calling interface.
Task monitoring and scanning are implemented by a task management platform; an existing open-source task management platform, TaskManager, may be used.
The index processing task can be configured in the task management platform as a scheduled task. To simplify configuration, the task parameters of the scheduled task are configured in the task management platform before the platform is used, generally through a visual interface or an API. The task parameters include the scanning mode of the task, the trigger condition, the calling interface of the system that executes the task (here, the calling interface of the ETL platform), and the judging mode. Several scanning modes are available, and any one of polling, random, consistent hashing, least frequently used, least recently used, failover and busy transfer may be selected. The trigger condition may be the scheduled execution time of each task. The judging mode is the way in which the platform decides whether the trigger condition is met; here it may consist of checking the current time to decide whether the execution time of the task has been reached.
Once the task parameters are configured, the task management platform continuously scans the scheduled tasks in the background according to the selected scanning mode and triggers those that meet their conditions, for example starting the index processing task at midnight every day. After the index processing task is triggered, the task management platform calls the ETL (Extract-Transform-Load) platform that executes the index processing task through the configured calling interface.
When an index processing task meeting the condition is scanned, the index processing task is called and executed.
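As an illustration of step 1 only, the following minimal sketch shows one polling pass that checks a daily execution time and calls a task through its configured interface. The names ScheduledTask, is_due, scan_once and fake_etl_interface are hypothetical and do not come from the patent or from any task management platform; a real platform would run such a pass repeatedly in the background.

```python
from dataclasses import dataclass
from datetime import date, datetime, time
from typing import Callable, List, Optional


@dataclass
class ScheduledTask:
    name: str
    run_at: time                      # trigger condition: daily execution time
    call: Callable[[str], str]        # calling interface of the executing system
    last_run: Optional[date] = None   # day on which the task last ran


def is_due(task: ScheduledTask, now: datetime) -> bool:
    """Judging mode: execution time reached and the task has not run today."""
    return now.time() >= task.run_at and task.last_run != now.date()


def scan_once(tasks: List[ScheduledTask], now: datetime) -> List[str]:
    """One polling pass over all tasks; due tasks are called and marked as run."""
    results = []
    for task in tasks:
        if is_due(task, now):
            results.append(task.call(task.name))
            task.last_run = now.date()
    return results


def fake_etl_interface(task_name: str) -> str:
    # Hypothetical stand-in for the ETL platform's calling interface.
    return f"started {task_name}"


tasks = [ScheduledTask("daily_index_job", time(0, 0), fake_etl_interface)]
first_pass = scan_once(tasks, datetime(2020, 12, 29, 0, 0, 5))
second_pass = scan_once(tasks, datetime(2020, 12, 29, 0, 1, 0))
next_day = scan_once(tasks, datetime(2020, 12, 30, 0, 0, 1))
```

Recording the last run date is a design choice of this sketch: it keeps a task from being triggered twice on the same day when several scanning passes fall after the execution time.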
Step 2: executing the index processing task based on the configured index processing flow to obtain a data index meeting the index requirement; the index processing flow comprises data source information and data processing logic information of the index processing task.
The index processing task may be implemented by an ETL platform, a tool that extracts, transforms and loads data from a source end to a destination end; specifically, Kettle (an ETL platform written in Java) may be used.
When the ETL platform is used, the index processing flow of the index processing task must be configured in the ETL platform in advance; information such as the data source and the data processing logic can be configured at each processing node of the index processing flow, which is simple and convenient. The corresponding operations are then executed by intercepting the data source information or the data processing logic information from the index processing flow. As shown in fig. 2, this specifically includes the following steps:
Step 21: extracting the data to be processed from the corresponding database according to the data source information.
Step 22: processing the data to be processed according to the data processing logic information to obtain a data index meeting the index requirement.
As shown in fig. 3, step 21 includes the following sub-steps:
Step 211: intercepting data source information from the index processing flow, wherein the data source information comprises at least one database.
The data source information can comprise a plurality of databases, so that the index processing task supports multiple databases; for example, the same index processing task can process the data of several different organizations, the data of each organization being stored in a different database.
Step 212: connecting to the corresponding databases through an abstract connection based on the data source information.
When the data source information contains a plurality of databases, each database needs to be accessed; with an abstract connection, the data in every database can be queried and acquired through the same function (method).
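The idea of querying every database through the same function can be sketched as follows. This is only an illustration under stated assumptions: in-memory SQLite databases stand in for the organizations' databases, and the names connect_all, fetch, data_sources and the account table are hypothetical.

```python
import sqlite3
from typing import Dict, List, Tuple


def connect_all(data_sources: Dict[str, str]) -> Dict[str, sqlite3.Connection]:
    """Open one connection per database named in the data source information."""
    return {name: sqlite3.connect(target) for name, target in data_sources.items()}


def fetch(conn: sqlite3.Connection, sql: str) -> List[Tuple]:
    """The single query function used for every connected database."""
    return conn.execute(sql).fetchall()


# Hypothetical data source information: two organizations, one database each.
data_sources = {"org_001": ":memory:", "org_002": ":memory:"}
connections = connect_all(data_sources)

# Seed each database with sample rows so the example is self-contained.
seed = {"org_001": [(100.0,), (250.0,)], "org_002": [(75.5,)]}
for name, conn in connections.items():
    conn.execute("CREATE TABLE account (balance REAL)")
    conn.executemany("INSERT INTO account (balance) VALUES (?)", seed[name])

# The same function and the same statement are applied to every database.
totals = {
    name: fetch(conn, "SELECT SUM(balance) FROM account")[0][0]
    for name, conn in connections.items()
}
```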
Step 213: extracting data to be processed from the connected databases and temporarily storing the data in a resource library, wherein the data to be processed carries a data flow identifier corresponding to the database.
For data security, data from different databases must be isolated, so the data to be processed carries a data flow identifier through which its source can be identified. For example, an organization number is assigned to the database of each organization, and the organization number can be used directly as the data flow identifier associated with the data acquired from that database, so that the source of each piece of data can be identified during subsequent warehousing, processing and other steps. Because of the correspondence between organization number and database, newly processed data can conveniently be stored separately simply by identifying the organization number, which avoids data leakage.
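A minimal sketch of tagging extracted rows with a data flow identifier is given below. It assumes that the organization number is used as the identifier, as in the example above; the field name flow_id, the sample rows and the helper functions are hypothetical, and a plain list stands in for the resource library.

```python
from typing import Dict, List

# Hypothetical rows as they might be extracted from each organization's database.
raw_rows_by_org: Dict[str, List[dict]] = {
    "org_001": [
        {"account": "A1", "balance": 100.0},
        {"account": "A2", "balance": 250.0},
    ],
    "org_002": [{"account": "B1", "balance": 75.5}],
}


def stage(rows_by_org: Dict[str, List[dict]]) -> List[dict]:
    """Copy rows into one staging list, tagging each with its data flow identifier."""
    staged = []
    for org_number, rows in rows_by_org.items():
        for row in rows:
            staged.append({**row, "flow_id": org_number})
    return staged


def rows_for(staged: List[dict], flow_id: str) -> List[dict]:
    """Return only the rows that originate from one organization."""
    return [row for row in staged if row["flow_id"] == flow_id]


resource_library = stage(raw_rows_by_org)
org_002_rows = rows_for(resource_library, "org_002")
```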
The data processing logic information comprises at least one set of data processing rules. The data processing rules are configured in property files, and sets of data processing rules correspond one-to-one with property files. Each database is configured with at least one set of data processing rules, and in practice a database often has several sets configured for different business requirements. Different sets of data processing rules are distinguished by dimension: the data processing rules of the same dimension are configured in one property file. That is, indexes of the same dimension are defined in the same table, each index is defined as a field of the table, and the data processing rule of an index is defined as the attribute value of that index; a data processing rule is in practice a simple SQL statement.
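The following sketch illustrates what such a property file might look like and how its rules could be applied. The file name, index names, SQL statements and the account table are all hypothetical, and parse_properties is a simplified key=value parser written for this example, not a full implementation of the Java .properties format.

```python
import sqlite3
from typing import Dict

PROPERTY_FILE_TEXT = """\
# account_dimension.properties (hypothetical file name and contents)
total_balance=SELECT SUM(balance) FROM account
large_accounts=SELECT COUNT(*) FROM account WHERE balance >= 200
"""


def parse_properties(text: str) -> Dict[str, str]:
    """Simplified parser: one index name and one SQL rule per key=value line."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        rules[key.strip()] = value.strip()
    return rules


# An in-memory SQLite database stands in for the staged data to be processed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (balance REAL)")
conn.executemany("INSERT INTO account VALUES (?)", [(100.0,), (250.0,), (300.0,)])

rules = parse_properties(PROPERTY_FILE_TEXT)
indexes = {name: conn.execute(sql).fetchone()[0] for name, sql in rules.items()}
```

Splitting on the first '=' only matters here because the SQL rule itself may contain comparison operators such as '>='.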
As shown in fig. 4, step 22 may specifically include the following sub-steps:
Step 221: traversing the property files.
Since the data processing rules are configured in the property file, and the property file is updatable, before the data processing rules are extracted from the index processing flow, all the property files need to be traversed to determine whether new data processing rules exist.
Step 222: when a property file is updated, correspondingly updating the data processing logic information in the index processing flow.
Updating here means that a new property file, that is, a new data processing rule, has been added. When a new property file is added, it must be registered in the index processing flow before the data processing logic information is intercepted from the flow; otherwise the new data processing rule cannot be executed. Replacing old logic by loading property files effectively improves development efficiency and release efficiency.
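Steps 221 and 222 can be sketched as a directory traversal that registers any property file not yet known to the flow. This is an illustration only: the directory layout, the ".properties" suffix, the file names and the functions find_new_property_files and refresh_flow are assumptions, and a set of file names stands in for the data processing logic information held by the index processing flow.

```python
import tempfile
from pathlib import Path
from typing import List, Set


def find_new_property_files(directory: Path, registered: Set[str]) -> List[str]:
    """Traverse the directory and return property files not yet in the flow."""
    present = sorted(p.name for p in directory.glob("*.properties"))
    return [name for name in present if name not in registered]


def refresh_flow(directory: Path, flow_files: Set[str]) -> List[str]:
    """Register newly added property files in the flow and report them."""
    added = find_new_property_files(directory, flow_files)
    flow_files.update(added)
    return added


with tempfile.TemporaryDirectory() as tmp:
    rule_dir = Path(tmp)
    (rule_dir / "account_dimension.properties").write_text("total=SELECT 1\n")
    flow_files: Set[str] = set()
    first_refresh = refresh_flow(rule_dir, flow_files)

    # A new rule file appears later; the next traversal picks it up.
    (rule_dir / "customer_dimension.properties").write_text("count=SELECT 2\n")
    second_refresh = refresh_flow(rule_dir, flow_files)
    third_refresh = refresh_flow(rule_dir, flow_files)
```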
Step 223: intercepting the data processing logic information from the index processing flow, wherein the data processing logic information comprises the file name of at least one property file.
Since the data processing logic information is embedded in the processing nodes of the index processing flow, it is obtained by interception.
Step 224: loading the corresponding property file into the resource library based on the data processing logic information.
The data processing logic information contains the file names of the property files, so once it has been intercepted, each property file named in it is loaded into the resource library.
Step 225: and processing the data to be processed in the resource library according to the data processing rule configured in the property file.
The data processing rule comprises data extraction, cleaning and processing, and the data to be processed can obtain data indexes meeting the index requirements after the data to be processed is extracted, cleaned and processed.
The data processing rule may include a data date parameter that specifies which day's data is to be processed. By further introducing a zipper table (a history table), and in combination with the data date parameter in the data processing rule, data of any date can be processed into index data meeting the index requirement. As shown in fig. 5, step 225 specifically includes the following sub-steps:
Step 2251: intercepting the data date parameter in the data processing rule;
Step 2252: according to the data date parameter, finding, from the data to be processed, the data generated on the date represented by the data date parameter;
Step 2253: processing the found data according to the data processing logic.
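The three sub-steps above can be sketched as follows. The rule layout (a dictionary with data_date and logic entries), the created field and the sample rows are hypothetical, and a Python function stands in for the SQL-based data processing logic.

```python
from datetime import date
from typing import Callable, Dict, List

# Hypothetical staged data; "created" is the date on which each row was generated.
pending_rows: List[Dict] = [
    {"amount": 100.0, "created": date(2020, 12, 27)},
    {"amount": 40.0, "created": date(2020, 12, 28)},
    {"amount": 60.0, "created": date(2020, 12, 28)},
]

# Hypothetical data processing rule: a data date parameter plus processing logic.
rule = {
    "data_date": "2020-12-28",
    "logic": lambda rows: sum(r["amount"] for r in rows),
}


def apply_rule(rule: Dict, rows: List[Dict]) -> float:
    data_date = date.fromisoformat(rule["data_date"])          # step 2251
    selected = [r for r in rows if r["created"] == data_date]  # step 2252
    return rule["logic"](selected)                             # step 2253


daily_amount = apply_rule(rule, pending_rows)
```

Changing only the data_date value reprocesses a different day without touching the processing logic.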
The data index is obtained by processing the corresponding data to be processed, so it carries the data flow identifier of that data. Therefore, after a data index meeting the index requirement is obtained, it can be stored in the database corresponding to the data flow identifier it carries. That is, when processed data indexes are warehoused, the target database is selected according to the data flow identifier, which further prevents data leakage.
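Routing processed indexes back to the database of their organization might look like the sketch below. Plain lists stand in for the target databases, and the names target_databases, computed_indexes, flow_id and store_indexes are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from data flow identifier to target database.
target_databases: Dict[str, List[Tuple[str, float]]] = {"org_001": [], "org_002": []}

# Hypothetical processed indexes, each still carrying its data flow identifier.
computed_indexes = [
    {"index": "total_balance", "value": 350.0, "flow_id": "org_001"},
    {"index": "total_balance", "value": 75.5, "flow_id": "org_002"},
]


def store_indexes(indexes: List[dict], databases: Dict[str, list]) -> None:
    """Store each index only in the database that matches its identifier."""
    for item in indexes:
        flow_id = item["flow_id"]
        if flow_id not in databases:
            # Refuse to store data whose destination cannot be identified.
            raise KeyError(f"no database registered for {flow_id}")
        databases[flow_id].append((item["index"], item["value"]))


store_indexes(computed_indexes, target_databases)
```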
In the first embodiment, data indexes are processed as a scheduled task and are therefore executed regularly; the data processing logic is embedded in the nodes of the index processing flow in the form of configuration files, so that it is easy to update and replace and development and release efficiency is improved; and multiple data sources are configured in the nodes, so that the data of different organizations is isolated and data security is improved.
Secondly, the present invention provides a batch processing apparatus for data indexes, wherein the apparatus 20 can be divided into one or more modules.
For example, FIG. 6 shows a block diagram of a first embodiment of the batch processing apparatus 20 for data indexes; in this embodiment, the apparatus 20 may be divided into a task management module 201 and a data processing module 202. The functions of the task management module 201 and the data processing module 202 are described below.
The task management module 201 is configured to perform scanning of an index processing task based on configured task parameters; the task parameters comprise a scanning mode, a triggering condition and a calling interface; and when the index processing task meeting the trigger condition is scanned, calling the index processing task through the calling interface.
Task monitoring and scanning are implemented by a task management platform; an existing open-source task management platform, TaskManager, may be used.
The index processing task can be configured in the task management platform as a scheduled task. To simplify configuration, the task parameters of the scheduled task are configured in the task management platform before the platform is used, generally through a visual interface or an API. The task parameters include the scanning mode of the task, the trigger condition, the calling interface of the system that executes the task (here, the calling interface of the ETL platform), and the judging mode. Several scanning modes are available, and any one of polling, random, consistent hashing, least frequently used, least recently used, failover and busy transfer may be selected. The trigger condition may be the scheduled execution time of each task. The judging mode is the way in which the platform decides whether the trigger condition is met; here it may consist of checking the current time to decide whether the execution time of the task has been reached.
Once the task parameters are configured, the task management platform continuously scans the scheduled tasks in the background according to the selected scanning mode and triggers those that meet their conditions, for example starting the index processing task at midnight every day. After the index processing task is triggered, the task management platform calls the ETL (Extract-Transform-Load) platform that executes the index processing task through the configured calling interface.
The data processing module 202 is configured to execute the index processing task based on a configured index processing flow to obtain a data index meeting an index requirement; the index processing flow comprises data source information and data processing logic information of the index processing task.
The index processing task may be implemented by an ETL platform, a tool that extracts, transforms and loads data from a source end to a destination end; specifically, Kettle (an ETL platform written in Java) may be used.
When the ETL platform is used, the index processing flow of the index processing task must be configured in the ETL platform in advance; information such as the data source and the data processing logic can be configured at each processing node of the index processing flow, which is simple and convenient. The corresponding operations are then executed by intercepting the data source information or the data processing logic information from the index processing flow. Specifically, the data processing module 202 may be further divided into a data extraction submodule and a processing submodule:
The data extraction submodule is used for extracting the data to be processed from the corresponding database according to the data source information.
The processing submodule is used for processing the data to be processed according to the data processing logic information so as to obtain a data index meeting the index requirement.
The data extraction submodule can be further divided into a data source interception unit, a database connection unit and a data extraction unit.
The data source intercepting unit is used for intercepting data source information from the index processing flow, and the data source information at least comprises a database.
The data source information can comprise a plurality of databases, so that the index processing task supports multiple databases, for example, the same index processing task is used for processing data of a plurality of different mechanisms, and naturally, the data of different mechanisms are stored in different data.
And the database connection unit is used for connecting the corresponding database in an abstract way based on the data source information.
When the data source information contains a plurality of databases, each database needs to be accessed, and the data in each database can be queried and acquired through the same function (method) by adopting the abstract connection.
The data extraction unit is used for extracting data to be processed from the connected database and temporarily storing the data in a resource library, and the data to be processed is provided with a data circulation identifier corresponding to the database.
For data security, the data of different databases must be isolated from one another. The data to be processed therefore carries a data flow identifier, by which the source of the data can be identified. For example, an organization number is assigned to the database of each organization, and the organization number can be used directly as the data flow identifier and associated with the data acquired from that database, so that the source of each piece of data can be identified during subsequent warehousing, processing and other steps. Because each organization number corresponds to one database, newly processed data can be stored separately simply by identifying its organization number, which avoids data leakage.
The data processing logic information comprises at least one set of data processing rules. The data processing rules are configured in property files, and the sets of data processing rules correspond one-to-one with the property files. Each database is configured with at least one set of data processing rules, but in practice one database is often configured with multiple sets to meet different business requirements. Different sets of data processing rules are distinguished by dimension, and the data processing rules of the same dimension are configured in one property file. In other words, the indexes of the same dimension are defined in the same table, each index is defined as a separate field of the table, and the data processing rule of an index is defined as the attribute value of that index. The data processing rules are in fact a string of simple SQL statements.
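As a hedged illustration of rules kept in a property file, the following Python sketch treats each key as an index name and each value as a simple SQL statement. The file content, the `staging` table, the `org_no` column (standing in for the data flow identifier), and the helper `parse_properties` are all assumptions made for this example; the patent does not specify the file layout.

```python
import sqlite3

# Hypothetical content of a property file for one dimension ("customer").
CUSTOMER_PROPERTIES = """
# index name = data processing rule (a simple SQL statement)
customer_count=SELECT COUNT(*) FROM staging WHERE org_no = ?
total_balance=SELECT SUM(balance) FROM staging WHERE org_no = ?
"""


def parse_properties(text):
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first '=' only, so '=' inside the SQL is preserved.
        key, _, value = line.partition("=")
        rules[key.strip()] = value.strip()
    return rules


# An in-memory table stands in for the resource library; every row carries
# the organization number of the database it was extracted from.
repo = sqlite3.connect(":memory:")
repo.execute("CREATE TABLE staging (org_no TEXT, balance INTEGER)")
repo.executemany("INSERT INTO staging VALUES (?, ?)",
                 [("001", 100), ("001", 50), ("002", 70)])

rules = parse_properties(CUSTOMER_PROPERTIES)
# Evaluate every rule for organization 001 only, keeping data isolated.
indexes_001 = {name: repo.execute(sql, ("001",)).fetchone()[0]
               for name, sql in rules.items()}
```

Because the rules live in a text file rather than in code, adding an index under these assumptions means adding one line to the file.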
The processing submodule can be further divided into a traversal unit, a logic updating unit, a logic interception unit, a logic loading unit and a data processing unit.
The traversal unit is used for traversing the property files.
Since the data processing rules are configured in property files, and the property files can be updated, all the property files need to be traversed before the data processing rules are extracted from the index processing flow, to determine whether new data processing rules exist.
The logic updating unit is used for correspondingly updating the data processing logic information in the index processing flow when a property file is updated.
Here, updating means that a new property file, that is, a new set of data processing rules, has been added. When a new property file is added, it must be updated into the index processing flow before the data processing logic information is intercepted from the index processing flow; otherwise the new data processing rules cannot be executed. Replacing old logic by loading property files can effectively improve development efficiency and release efficiency.
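The traversal and update steps might look like the following Python sketch. The function `refresh_flow`, the `.properties` file extension, and the dictionary used to represent a processing node are hypothetical; the patent only states that the property files are traversed and that new ones are updated into the index processing flow.

```python
import tempfile
from pathlib import Path


def refresh_flow(flow_node: dict, rule_dir: Path) -> list:
    """Register property files not yet referenced by the node; return the new names."""
    known = set(flow_node["property_files"])
    # Traverse every property file in the rule directory.
    found = sorted(p.name for p in rule_dir.glob("*.properties"))
    added = [name for name in found if name not in known]
    # Update the data processing logic information held by the node.
    flow_node["property_files"].extend(added)
    return added


with tempfile.TemporaryDirectory() as tmp:
    rule_dir = Path(tmp)
    (rule_dir / "customer.properties").write_text("customer_count=SELECT 1\n")
    node = {"property_files": ["customer.properties"]}

    first = refresh_flow(node, rule_dir)   # nothing new yet
    (rule_dir / "loan.properties").write_text("loan_total=SELECT 2\n")
    second = refresh_flow(node, rule_dir)  # picks up the added file
```

Running the refresh before the logic information is read ensures, in this sketch, that a newly added rule file is never skipped.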
The logic intercepting unit is used for intercepting the data processing logic information from the index processing flow, and the data processing logic information comprises a file name of at least one property file.
Since the data processing logic information is embedded in the processing node of the index processing flow, the data processing logic information is obtained by means of interception.
The logic loading unit is used for loading the corresponding property file into the resource library based on the data processing logic information.
The data processing logic information contains the file name of the property file, so after the data processing logic information is intercepted, the corresponding property file is loaded into the resource library according to the file name of each property file contained in the data processing logic information.
The data processing unit is used for processing the data to be processed in the resource library according to the data processing rule configured in the property file.
The data processing rule covers data extraction, cleaning and processing; after the data to be processed has been extracted, cleaned and processed, data indexes meeting the index requirements are obtained.
The data processing rule may include a data date parameter that specifies the date of the data to be processed. A zipper table (history table) is then introduced, and, combined with the data date parameter in the data processing rule, data of any date can be processed into index data meeting the index requirement. To achieve these functions, the data processing unit may be further divided into a date intercepting subunit, a data matching subunit, and a data processing subunit.
The date intercepting subunit is used for intercepting the data date parameter in the data processing rule.
The data matching subunit is used for finding, from the data to be processed, the data generated on the date represented by the data date parameter.
The data processing subunit is used for processing the found data according to the data processing logic.
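The three subunits can be sketched in a few lines of Python. The rule layout (a dictionary with `data_date` and `logic` keys), the `created` and `amount` fields, and the function `process_for_date` are assumptions for illustration only, and the sketch filters plain records rather than modelling a history table.

```python
from datetime import date


def process_for_date(rule, rows):
    # Date intercepting subunit: read the data date parameter from the rule.
    target = date.fromisoformat(rule["data_date"])
    # Data matching subunit: keep only the data generated on that date.
    matched = [row for row in rows if row["created"] == target]
    # Data processing subunit: apply the rule's processing logic.
    return rule["logic"](matched)


rule = {
    "data_date": "2020-12-28",
    "logic": lambda rows: sum(r["amount"] for r in rows),
}
rows = [
    {"created": date(2020, 12, 27), "amount": 10},
    {"created": date(2020, 12, 28), "amount": 20},
    {"created": date(2020, 12, 28), "amount": 5},
]
daily_total = process_for_date(rule, rows)
```

Changing only the `data_date` value reruns the same logic for another day, which is the effect the data date parameter is described as providing.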
A data index is obtained by processing the corresponding data to be processed, so the data index carries the data flow identifier of that data. Therefore, after the data indexes meeting the index requirements are obtained, each data index can be stored into the database corresponding to the data flow identifier it carries. That is, when the processed data indexes are warehoused, the target database is selected according to the data flow identifier, which further prevents data leakage.
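Routing each data index back to its own database can be sketched as follows. The `targets` mapping, the `data_index` table, and the `org_no` field used as the data flow identifier are hypothetical names; in-memory SQLite databases again stand in for the per-organization databases.

```python
import sqlite3

# One target database per organization number (the data flow identifier).
targets = {"001": sqlite3.connect(":memory:"), "002": sqlite3.connect(":memory:")}
for conn in targets.values():
    conn.execute("CREATE TABLE data_index (name TEXT, value INTEGER)")


def store_indexes(indexes, targets):
    """Write each index into the database selected by its data flow identifier."""
    for item in indexes:
        conn = targets[item["org_no"]]
        conn.execute("INSERT INTO data_index VALUES (?, ?)",
                     (item["name"], item["value"]))
        conn.commit()


store_indexes(
    [
        {"org_no": "001", "name": "total_balance", "value": 150},
        {"org_no": "002", "name": "total_balance", "value": 70},
    ],
    targets,
)

stored = {org: conn.execute("SELECT name, value FROM data_index").fetchall()
          for org, conn in targets.items()}
```

Each database ends up holding only the indexes tagged with its own identifier, which is how the sketch keeps the data of different organizations apart.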
According to the method and the device, the index processing logic is embedded, in the form of a configuration file, into the nodes of the index processing flow of the timed task, so that the index processing logic can be replaced quickly.
The invention further provides computer equipment.
Fig. 7 is a schematic diagram of the hardware architecture of an embodiment of the computer device according to the present invention. In this embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing in accordance with preset or stored instructions. For example, the computer device may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, or a tower server (including an independent server or a server cluster composed of a plurality of servers). As shown in the figure, the computer device 2 includes, but is not limited to, at least a memory 21, a processor 22, and a network interface 23 communicatively coupled to each other via a system bus. Wherein:
the memory 21 includes at least one type of computer-readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device 2. Of course, the memory 21 may also comprise both an internal storage unit of the computer device 2 and an external storage device thereof. In this embodiment, the memory 21 is generally used for storing an operating system and various types of application software installed in the computer device 2, such as a computer program for implementing the batch processing method of the data index. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 22 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 22 is generally configured to control the overall operation of the computer device 2, such as performing control and processing related to data interaction or communication of the computer device 2. In this embodiment, the processor 22 is configured to run program code stored in the memory 21 or to process data, for example, to run a computer program for implementing the batch processing method of the data index.
The network interface 23 may comprise a wireless network interface or a wired network interface, and the network interface 23 is typically used to establish a communication connection between the computer device 2 and other computer devices. For example, the network interface 23 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, or Wi-Fi.
It is noted that fig. 7 only shows the computer device 2 with components 21-23, but it is to be understood that not all of the shown components are required, and that more or fewer components may be implemented instead.
In this embodiment, the computer program stored in the memory 21 for implementing the batch processing method of the data index may be executed by one or more processors (in this embodiment, the processor 22) to perform the following steps:
Step 1: scanning an index processing task based on the configured task parameters; the task parameters comprise a scanning mode, a triggering condition and a calling interface; and when the index processing task meeting the trigger condition is scanned, calling the index processing task through the calling interface.
Step 2: executing the index processing task based on the configured index processing flow to obtain a data index meeting the index requirement; and the nodes of the index processing flow comprise data source information and data processing logic information of the index processing task.
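Step 1 above can be illustrated with a small Python sketch. The class `IndexTask`, the function `scan_once`, and the use of the current hour as the value tested by the trigger condition are assumptions for this example; the scanning mode is reduced here to a single pass over the task list.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class IndexTask:
    name: str
    trigger: Callable[[int], bool]  # trigger condition, tested against the hour
    call: Callable[[], str]         # calling interface that starts the task


def scan_once(tasks: List[IndexTask], hour: int) -> List[str]:
    """Scan all tasks once and call those whose trigger condition is met."""
    results = []
    for task in tasks:
        if task.trigger(hour):
            results.append(task.call())
    return results


tasks = [
    IndexTask("daily_balance", lambda hour: hour == 2, lambda: "daily_balance done"),
    IndexTask("loan_summary", lambda hour: hour == 6, lambda: "loan_summary done"),
]
called = scan_once(tasks, hour=2)
```

In a real deployment the calling interface would start the configured index processing flow of Step 2 rather than return a string.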
Furthermore, the present invention relates to a computer-readable storage medium, which is a non-volatile readable storage medium, and a computer program is stored in the computer-readable storage medium, and the computer program can be executed by at least one processor to implement the operations of the above-mentioned batch processing method or apparatus for data indexes.
The computer-readable storage medium includes, among others, a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the computer readable storage medium may be an internal storage unit of the computer device, such as a hard disk or a memory of the computer device. In other embodiments, the computer readable storage medium may be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device. Of course, the computer-readable storage medium may also include both internal and external storage devices of the computer device. In this embodiment, the computer-readable storage medium is generally used for storing an operating system and various types of application software installed in a computer device, such as the aforementioned computer program for implementing the batch processing method of the data index. Further, the computer-readable storage medium may also be used to temporarily store various types of data that have been output or are to be output.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (10)

1. A batch processing method of data indexes is characterized by comprising the following steps:
scanning an index processing task based on the configured task parameters; the task parameters comprise trigger conditions and calling interfaces; when the index processing task meeting the trigger condition is scanned, calling the index processing task through the calling interface;
executing the index processing task based on the configured index processing flow to obtain a data index meeting the index requirement; and the nodes of the index processing flow comprise data source information and data processing logic information of the index processing task.
2. The batch processing method of data indexes of claim 1, wherein the step of executing the index processing task based on the configured index processing flow comprises:
extracting data to be processed from a corresponding database according to the data source information;
and processing the data to be processed according to the data processing logic information to obtain a data index meeting the index requirement.
3. The batch processing method of data indexes of claim 2,
the step of extracting the data to be processed from the corresponding database according to the data source information comprises the following steps:
intercepting data source information from the index processing flow, wherein the data source information at least comprises a database;
abstract connecting corresponding databases based on the data source information;
and extracting data to be processed from the connected database and temporarily storing the data in a resource library, wherein the data to be processed carries a data flow identifier corresponding to the database.
4. The method of claim 3, wherein the data processing logic information comprises at least one set of data processing rules, the data processing rules are configured in a property file, and the data processing rules are in one-to-one correspondence with the property file;
each database is at least provided with a set of data processing rules;
the step of processing the data to be processed according to the data processing logic information comprises the following steps:
intercepting the data processing logic information from the index processing flow, wherein the data processing logic information comprises a file name of at least one property file;
loading a corresponding property file into the resource library based on the data processing logic information;
and processing the data to be processed in the resource library according to the data processing rule configured in the property file.
5. The batch processing method of data indexes of claim 4, further comprising, prior to the step of intercepting the data processing logic information from the index processing flow:
traversing the property file;
and when the property file is updated, correspondingly updating the data processing logic information in the index processing flow.
6. The batch processing method of data indexes of claim 4,
the data index is obtained after the corresponding data to be processed is processed, and the data index carries the data flow identifier of the corresponding data to be processed;
after the step of obtaining the data index meeting the index requirement, the method further comprises the following steps:
and storing the data index into the database corresponding to the data flow identifier, based on the data flow identifier carried by the data index.
7. The batch processing method of data indexes according to claim 4, wherein the data processing rules comprise data date parameters and data processing logic;
the step of processing the data to be processed in the resource library according to the data processing rule configured in the property file includes:
intercepting data date parameters in the data processing rules;
according to the data date parameters, finding out, from the data to be processed, the data generated on the date represented by the data date parameters;
and processing the found data according to the data processing logic.
8. An apparatus for batch processing of data indexes, comprising:
the task management module is used for scanning the index processing task based on the configured task parameters; the task parameters comprise a scanning mode, a triggering condition and a calling interface; when the index processing task meeting the trigger condition is scanned, calling the index processing task through the calling interface;
the data processing module is used for executing the index processing task based on the configured index processing flow so as to obtain a data index meeting the index requirement; and the nodes of the index processing flow comprise data source information and data processing logic information of the index processing task.
9. A computer device comprising a memory and a processor, characterized in that the memory has stored thereon a computer program which, when executed by the processor, carries out the steps of the batch processing method of data indexes according to any one of claims 1-7.
10. A computer-readable storage medium, in which a computer program is stored which is executable by at least one processor to implement the steps of the batch processing method of data indexes as claimed in any one of claims 1 to 7.
CN202011593546.4A 2020-12-29 2020-12-29 Batch processing method, device and equipment for data indexes and storage medium Active CN112597233B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011593546.4A CN112597233B (en) 2020-12-29 2020-12-29 Batch processing method, device and equipment for data indexes and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011593546.4A CN112597233B (en) 2020-12-29 2020-12-29 Batch processing method, device and equipment for data indexes and storage medium

Publications (2)

Publication Number Publication Date
CN112597233A true CN112597233A (en) 2021-04-02
CN112597233B CN112597233B (en) 2024-06-25

Family

ID=75203241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011593546.4A Active CN112597233B (en) 2020-12-29 2020-12-29 Batch processing method, device and equipment for data indexes and storage medium

Country Status (1)

Country Link
CN (1) CN112597233B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102890651A (en) * 2011-07-19 2013-01-23 阿里巴巴集团控股有限公司 Method and device for testing scene data
CN110245029A (en) * 2019-05-21 2019-09-17 中国平安财产保险股份有限公司 A kind of data processing method, device, storage medium and server

Also Published As

Publication number Publication date
CN112597233B (en) 2024-06-25

Similar Documents

Publication Publication Date Title
CN110069572B (en) HIVE task scheduling method, device, equipment and storage medium based on big data platform
CN110209652B (en) Data table migration method, device, computer equipment and storage medium
CN110309125B (en) Data verification method, electronic device and storage medium
US10275355B2 (en) Method and apparatus for cleaning files in a mobile terminal and associated mobile terminal
CN108255620B (en) Service logic processing method, device, service server and system
JP6996812B2 (en) How to process data blocks in a distributed database, programs, and devices
CN110515795B (en) Big data component monitoring method and device and electronic equipment
CN110674109B (en) Data importing method, system, computer equipment and computer readable storage medium
CN112764874B (en) Virtual machine server information acquisition method based on CMDB configuration management system
CN111737227B (en) Data modification method and system
CN110928802A (en) Test method, device, equipment and storage medium based on automatic generation of case
WO2019148657A1 (en) Method for testing associated environments, electronic device and computer readable storage medium
CN110333876A (en) A kind of data clearing method and control equipment
CN110737594A (en) Database standard conformance testing method and device for automatically generating test cases
WO2020010724A1 (en) Front-end static resource management method, apparatus, computer device and storage medium
CN110688378A (en) Migration method and system for database storage process
JP6282217B2 (en) Anti-malware system and anti-malware method
CN111124872A (en) Branch detection method and device based on difference code analysis and storage medium
CN113535677A (en) Data analysis query management method and device, computer equipment and storage medium
WO2021031583A1 (en) Method and apparatus for executing statements, server and storage medium
CN112416957A (en) Data increment updating method and device based on data model layer and computer equipment
CN112860412B (en) Service data processing method and device, electronic equipment and storage medium
US10761940B2 (en) Method, device and program product for reducing data recovery time of storage system
CN112597233B (en) Batch processing method, device and equipment for data indexes and storage medium
CN105302604A (en) Application version update method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant