CN112925772A - Data dynamic splitting method and device - Google Patents
Data dynamic splitting method and device
- Publication number
- CN112925772A (application CN201911242162.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- task
- batch number
- cleaned
- exporting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2308—Concurrency control
- G06F16/2315—Optimistic concurrency control
- G06F16/2322—Optimistic concurrency control using timestamps
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and a device for dynamically splitting data, and relates to the field of computer technology. A specific implementation of the method comprises: acquiring a service system data source, cleaning the data source based on a preset task condition, and incrementing a task counter each time a cleaned record is obtained; monitoring the task counter, and generating a corresponding batch number whenever the task counter exceeds the batch size, until the cleaned data has been fully divided into batches; and, based on the batch number, exporting the cleaned data to an intermediate table in batches according to a preset mapping relation. The method and the device can thereby solve the problem that the database becomes blocked by an excessive data volume when business data is exported to the intermediate table.
Description
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for dynamically splitting data.
Background
In an integration middleware system, data generated by a service system must be exported to intermediate tables accurately, completely, and promptly according to certain rules. Because there are many intermediate tables and each stores data of a different service type, processing too much data at once puts heavy pressure on the database and hinders data analysis.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
in the prior art, each integration middleware system extracts millions to tens of millions of business records from the service system, and when this business data is exported to an intermediate table, computing statistics over so much data at once causes the database to become blocked.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and a device for dynamically splitting data, which can solve the problem that the database becomes blocked by an excessive data volume when business data is exported to an intermediate table.
To achieve the above object, according to one aspect of the embodiments of the present invention, a method for dynamically splitting data is provided, including: acquiring a service system data source, cleaning the data source based on a preset task condition, and incrementing a task counter each time a cleaned record is obtained; monitoring the task counter, and generating a corresponding batch number whenever the task counter exceeds the batch size, until the cleaned data has been fully divided into batches; and, based on the batch number, exporting the cleaned data to an intermediate table in batches according to a preset mapping relation.
Optionally, before the data source is cleaned based on the preset task condition, the method further includes:
setting the task condition, the batch size, and the mapping relation in a configuration file, and loading the configuration file into memory.
Optionally, generating a corresponding batch number includes:
acquiring the current timestamp and concatenating it with the hash value of the task identification code to generate the corresponding batch number.
Optionally, exporting the cleaned data to the intermediate table in batches according to the preset mapping relation includes:
exporting the cleaned data to the intermediate table in batches according to the preset mapping relation by using a parallel-computation programming model for data sets, through a parser that converts the database into a distributed file system.
In addition, the invention also provides a data dynamic splitting device, including: an acquisition module for acquiring a service system data source, cleaning the data source based on a preset task condition, and incrementing a task counter each time a cleaned record is obtained; a dividing module for monitoring the task counter and generating a corresponding batch number whenever the task counter exceeds the batch size, until the cleaned data has been fully divided into batches; and an export module for exporting the cleaned data to an intermediate table in batches according to a preset mapping relation, based on the batch number.
Optionally, before cleaning the data source based on the preset task condition, the acquisition module is further used for:
setting the task condition, the batch size, and the mapping relation in a configuration file, and loading the configuration file into memory.
Optionally, the dividing module generating a corresponding batch number includes:
acquiring the current timestamp and concatenating it with the hash value of the task identification code to generate the corresponding batch number.
Optionally, the export module exporting the cleaned data to the intermediate table in batches according to the preset mapping relation includes:
exporting the cleaned data to the intermediate table in batches according to the preset mapping relation by using a parallel-computation programming model for data sets, through a parser that converts the database into a distributed file system.
One embodiment of the above invention has the following advantages or benefits: by acquiring a service system data source, cleaning the data source based on a preset task condition, and incrementing a task counter each time a cleaned record is obtained; monitoring the task counter and generating a corresponding batch number whenever the task counter exceeds the batch size, until the cleaned data has been fully divided into batches; and exporting the cleaned data to an intermediate table in batches according to a preset mapping relation based on the batch number, the invention overcomes the technical problem that the database becomes blocked by an excessive data volume when business data is exported to the intermediate table. It thereby achieves the technical effects of dynamically splitting the data into multiple batches according to the specified batch size, performing data integration and data analysis batch by batch, making data identification more intelligent, reducing unnecessary development workload, and improving working efficiency.
Further effects of the optional implementations above are described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main flow of a data dynamic splitting method according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of the main flow of a data dynamic splitting method according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of the main modules of a data dynamic splitting apparatus according to an embodiment of the present invention;
FIG. 4 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 5 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of the main flow of a dynamic data splitting method according to a first embodiment of the present invention. As shown in Fig. 1, the dynamic data splitting method includes:
Step S101: a service system data source is acquired, the data source is cleaned based on a preset task condition, and a task counter is incremented each time a cleaned record is obtained.
In some embodiments, the task condition, the batch size, and the mapping relation used by the dynamic data splitting method are all set in a configuration file in advance. Before step S101 is executed, the configuration file may be loaded into memory and then parsed to obtain the task condition, the batch size, and the mapping relation.
The task condition is a screening condition set according to the service requirement. The batch size is the amount of data in one batch and can be preset according to the actual situation. The mapping relation defines how field values of the service system are converted into field values of the intermediate table; both direct mapping and function mapping are supported.
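The three configured items can be pictured with a minimal sketch. The source does not give a file format, so the dictionary layout, the field names (`order_id`, `amount`, `status`), and the helper `apply_mapping` below are all hypothetical; the only point illustrated is that a mapping rule may be either a direct field reference or a function.

```python
# Hypothetical in-memory configuration; every key and field name is illustrative.
CONFIG = {
    # Task condition: screening predicate applied to each source record.
    "task_condition": lambda row: row["status"] == "PAID",
    # Batch size: number of cleaned records that make up one batch.
    "batch_size": 1000,
    # Mapping relation: intermediate-table field -> source field name (direct
    # mapping) or a callable that computes the value (function mapping).
    "mapping": {
        "ORDER_ID": "order_id",
        "AMOUNT_CENTS": lambda row: int(round(row["amount"] * 100)),
    },
}


def apply_mapping(row, mapping):
    """Convert one source record into an intermediate-table record."""
    out = {}
    for target_field, rule in mapping.items():
        out[target_field] = rule(row) if callable(rule) else row[rule]
    return out


source_row = {"order_id": "A-17", "amount": 12.5, "status": "PAID"}
mapped = apply_mapping(source_row, CONFIG["mapping"])
print(mapped)
```

Keeping the rules in data rather than in code is what lets a new intermediate table be supported by editing configuration only.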
It should be noted that the service system data source may come from various databases, such as SQL Server, MySQL, or Oracle.
Step S102: the task counter is monitored, and a corresponding batch number is generated whenever the task counter exceeds the batch size, until the cleaned data has been fully divided into batches.
In some embodiments, the batch number is generated by obtaining the current timestamp and concatenating it with the hash value of the task identification code (task ID).
In a further embodiment, combining the timestamp with the hash value of the task ID in this way ensures that each batch number is unique within the full set of cleaned data.
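As an illustration of the concatenation described above, the following sketch builds a batch number from a millisecond timestamp and a hash of the task ID. The source names neither the hash function nor the timestamp resolution, so MD5, the 8-character truncation, and the function name `make_batch_number` are assumptions made only for this example.

```python
import hashlib
import time
from typing import Optional


def make_batch_number(task_id: str, now_ms: Optional[int] = None) -> str:
    """Concatenate a timestamp with a hash of the task ID.

    MD5 and millisecond resolution are assumptions; the source only says
    "timestamp" and "hash value of the task identification code".
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    task_hash = hashlib.md5(task_id.encode("utf-8")).hexdigest()[:8]
    return f"{now_ms}{task_hash}"


batch_no = make_batch_number("task-1", now_ms=1575600000000)
print(batch_no)
```

In this sketch the timestamp separates batches produced by the same task and the hash separates tasks that run at the same instant. If a single task could start two batches within one clock tick, an extra sequence suffix would be needed; the source does not discuss that case.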
Note that if the first batch number is generated when the first cleaned record is obtained, each subsequent batch number is generated when the task counter is determined to exceed the batch size. If no batch number is generated when the first cleaned record is obtained, the first batch number is likewise generated once the task counter is determined to exceed the batch size.
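The counter-driven division can be sketched as a generator that tags each cleaned record with its batch number. This follows the first variant above (a batch number is issued with the first record). Resetting the counter to 1 when a new batch starts is an assumption, since the source does not say how the counter is reset, and `split_into_batches` and the `B1`, `B2` labels are hypothetical.

```python
from itertools import count


def split_into_batches(cleaned_rows, batch_size, new_batch_number):
    """Yield (batch_number, row) pairs for a stream of cleaned records."""
    counter = 0       # task counter, incremented once per cleaned record
    batch_no = None   # no batch number exists before the first record
    for row in cleaned_rows:
        counter += 1
        if batch_no is None or counter > batch_size:
            batch_no = new_batch_number()
            counter = 1  # assumption: the current record opens the new batch
        yield batch_no, row


_seq = count(1)
tagged = list(split_into_batches(range(7), 3, lambda: f"B{next(_seq)}"))
print(tagged)
```

With seven records and a batch size of three, the records fall into batches of three, three, and one, so the last batch is simply whatever remains when the data source is exhausted.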
Step S103: based on the batch number, the cleaned data is exported to an intermediate table in batches according to a preset mapping relation.
In some embodiments, the cleaned data may be exported to the intermediate table in batches according to the preset mapping relation by using a parallel-computation programming model for data sets, through a parser that converts the database into a distributed file system.
In a further embodiment, the dynamic data splitting method provided by the invention can be executed on the MapReduce framework, where MapReduce is a programming model for parallel computation over large-scale data sets. When step S103 is executed, the cleaned data may be exported to the intermediate table in batches through Sqoop according to the preset mapping relation, where Sqoop is the parser that converts the database into a distributed file system.
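The effect of exporting batch by batch, rather than in one large write, can be shown with a small stand-in. The sketch below does not use MapReduce or Sqoop: it uses Python's built-in `sqlite3` module as a substitute database, and the table `intermediate_orders`, its columns, and `export_in_batches` are hypothetical names chosen for the example.

```python
import sqlite3
from itertools import groupby


def export_in_batches(conn, tagged_rows):
    """Insert rows one batch at a time, one transaction per batch."""
    exported = []
    # groupby relies on rows of the same batch being adjacent, which holds
    # when batch numbers are assigned in arrival order.
    for batch_no, group in groupby(tagged_rows, key=lambda item: item[0]):
        rows = [(batch_no, order_id, amount) for _, (order_id, amount) in group]
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO intermediate_orders (batch_no, order_id, amount) "
                "VALUES (?, ?, ?)",
                rows,
            )
        exported.append((batch_no, len(rows)))
    return exported


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE intermediate_orders (batch_no TEXT, order_id TEXT, amount REAL)"
)
tagged = [("B1", ("A-1", 10.0)), ("B1", ("A-2", 5.5)), ("B2", ("A-3", 7.0))]
summary = export_in_batches(conn, tagged)
print(summary)
```

Because each batch is its own transaction, the size of any single write is bounded by the batch size, and the `batch_no` column lets later statistics be computed per batch.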
In summary, the dynamic data splitting method provided by the present invention can dynamically split data into batches according to the specified batch size, makes batch processing configurable in big data processing scenarios, reduces the complexity of program coding and the maintenance cost, and facilitates subsequent statistical analysis of the data.
Fig. 2 is a schematic diagram of a main flow of a dynamic data splitting method according to a second embodiment of the present invention, where the dynamic data splitting method may include:
Step S201: the task condition, the batch size, and the mapping relation are set in the configuration file, and the configuration file is loaded into memory.
In a preferred embodiment, the dynamic data splitting method is executed on the MapReduce framework, so the field mapping relation can be defined in a custom mapping file, and the mapping file can be loaded into memory during initialization.
Step S202: a service system data source is acquired, and the configuration file in memory is parsed.
Step S203: the data source is cleaned based on the preset task condition, and the task counter is incremented each time a cleaned record is obtained.
In a preferred embodiment, a where condition may be specified in the mapping file according to the service requirement, so that the cleaning of the data source is preset; this where condition is the preset task condition. Data that satisfies the where condition is valid data, and data that does not satisfy it is garbage data.
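A minimal sketch of the cleaning step, assuming the where condition has already been parsed into a Python predicate. The `status` field, the sample rows, and the function name `clean` are hypothetical.

```python
def clean(rows, where):
    """Keep rows that satisfy the where condition and count them."""
    task_counter = 0
    valid_rows = []
    for row in rows:
        if where(row):
            valid_rows.append(row)
            task_counter += 1  # only valid (cleaned) records are counted
        # rows failing the condition are garbage data and are dropped
    return valid_rows, task_counter


rows = [
    {"id": 1, "status": "PAID"},
    {"id": 2, "status": "CANCELLED"},
    {"id": 3, "status": "PAID"},
]
valid_rows, task_counter = clean(rows, lambda r: r["status"] == "PAID")
print(valid_rows, task_counter)
```

The counter advances only for records that pass the condition, so the batch size measures cleaned records rather than raw input.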
Step S204: the task counter is monitored to determine whether it exceeds the batch size; if so, step S205 is performed; otherwise, the flow returns to step S203.
In a preferred embodiment, under the MapReduce framework, a reduce task counter can be used. Each reduce task runs as a separate process, so the task counter is valid only for the current reduce task and is not global.
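The consequence of a counter that is local to each reduce task can be simulated without Hadoop. In the sketch below, two plain Python objects stand in for two reduce tasks; the class `ReduceTask`, the task IDs `r0` and `r1`, and the label format are hypothetical, and the reset-to-1 rule is the same assumption as in a single-task sketch.

```python
class ReduceTask:
    """Stand-in for one reduce task with its own private counter."""

    def __init__(self, task_id, batch_size):
        self.task_id = task_id
        self.batch_size = batch_size
        self.counter = 0          # local to this task, never shared
        self.batches_started = 0

    def accept(self, row):
        """Count one cleaned record and return the batch label it falls in."""
        self.counter += 1
        if self.batches_started == 0 or self.counter > self.batch_size:
            self.batches_started += 1
            self.counter = 1      # assumption: the record opens the new batch
        return f"{self.task_id}-{self.batches_started}"


task_a = ReduceTask("r0", batch_size=2)
task_b = ReduceTask("r1", batch_size=2)
labels_a = [task_a.accept(i) for i in range(3)]
labels_b = [task_b.accept(i) for i in range(1)]
print(labels_a, labels_b)
```

Since the tasks never see each other's counters, the batch size bounds batches per reduce task, which is why the task ID has to take part in the batch number to keep batches from different tasks distinct.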
Step S205: the current timestamp is acquired and concatenated with the hash value of the task identification code to generate the corresponding batch number.
Step S206: it is determined whether any service system data source remains to be cleaned; if so, the flow returns to step S203; otherwise, step S207 is performed.
Step S207: based on the batch number, the cleaned data is exported to the intermediate table in batches according to the preset mapping relation by using a parallel-computation programming model for data sets, through a parser that converts the database into a distributed file system.
In an embodiment, the dynamic data splitting method of the present invention may be applied to an EBS integration middleware system, that is, data generated by a service system is exported to an EBS intermediate table. Wherein EBS refers to a middleware system of the financial system.
Fig. 3 is a schematic diagram of the main modules of a dynamic data splitting device according to an embodiment of the present invention. As shown in Fig. 3, the dynamic data splitting device 300 includes an acquisition module 301, a dividing module 302, and an export module 303. The acquisition module 301 acquires a service system data source, cleans the data source based on a preset task condition, and increments a task counter each time a cleaned record is obtained. The dividing module 302 monitors the task counter and generates a corresponding batch number whenever the task counter exceeds the batch size, until the cleaned data has been fully divided into batches. The export module 303 exports the cleaned data to the intermediate table in batches according to a preset mapping relation based on the batch number.
In some embodiments, before cleaning the data source based on the preset task condition, the acquisition module 301 is further used for:
setting the task condition, the batch size, and the mapping relation in a configuration file, and loading the configuration file into memory.
As another embodiment, the dividing module 302 generating the corresponding batch number includes:
acquiring the current timestamp and concatenating it with the hash value of the task identification code to generate the corresponding batch number.
It should be noted that the dynamic data splitting device of the present invention uses a parallel-computation programming model for data sets and exports the cleaned data to the intermediate table in batches according to the preset mapping relation through a parser that converts the database into a distributed file system.
It should be noted that the dynamic data splitting method and the dynamic data splitting apparatus of the present invention have corresponding relationships in the specific implementation contents, and therefore, the repeated contents are not described again.
Fig. 4 shows an exemplary system architecture 400 to which the dynamic data splitting method or the dynamic data splitting apparatus according to the embodiment of the present invention can be applied.
As shown in fig. 4, the system architecture 400 may include terminal devices 401, 402, 403, a network 404, and a server 405. The network 404 serves as a medium for providing communication links between the terminal devices 401, 402, 403 and the server 405. The network 404 may include various types of connections, such as wired or wireless communication links, or fiber optic cables.
A user may use terminal devices 401, 402, 403 to interact with a server 405 over a network 404 to receive or send messages or the like. The terminal devices 401, 402, 403 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 401, 402, 403 may be various electronic devices that have a display screen and support web browsing, including but not limited to smart phones, tablets, laptop computers, desktop computers, and the like.
The server 405 may be a server providing various services, such as a background management server (for example only) that supports shopping websites browsed by users using the terminal devices 401, 402, 403. The background management server may analyze and otherwise process received data, such as a product information query request, and feed back a processing result (for example, target push information or product information, by way of example only) to the terminal device.
It should be noted that the dynamic data splitting method provided by the embodiment of the present invention is generally executed by the server 405, and accordingly, the dynamic data splitting device is generally disposed in the server 405.
It should be understood that the number of terminal devices, networks, and servers in fig. 4 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 5, shown is a block diagram of a computer system 500 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU)501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the RAM503, various programs and data necessary for the operation of the system 500 are also stored. The CPU501, ROM502, and RAM503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as necessary, so that a computer program read from it is installed into the storage section 508 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 501.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor including an acquisition module, a dividing module, and an export module. The names of the modules do not, in some cases, constitute a limitation of the modules themselves.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the device described in the above embodiments, or may exist separately without being incorporated into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: acquire a service system data source, clean the data source based on a preset task condition, and increment a task counter each time a cleaned record is obtained; monitor the task counter, and generate a corresponding batch number whenever the task counter exceeds the batch size, until the cleaned data has been fully divided into batches; and, based on the batch number, export the cleaned data to an intermediate table in batches according to a preset mapping relation.
According to the technical scheme of the embodiment of the invention, the problem that the database becomes blocked by an excessive data volume when business data is exported to the intermediate table can be solved.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A dynamic data splitting method is characterized by comprising the following steps:
acquiring a service system data source, cleaning the data source based on a preset task condition, and incrementing a task counter each time a cleaned record is obtained;
monitoring the task counter, and generating a corresponding batch number whenever the task counter exceeds the batch size, until the cleaned data has been fully divided into batches;
and, based on the batch number, exporting the cleaned data to an intermediate table in batches according to a preset mapping relation.
2. The method of claim 1, wherein prior to cleansing the data source based on the pre-set task conditions, comprising:
and setting task conditions, batch times and mapping relations in the configuration file, and initializing the configuration file into a memory.
3. The method of claim 1, wherein generating a corresponding batch number comprises:
acquiring the current timestamp, and generating the corresponding batch number by concatenating the timestamp with the hash value of the task identification code.
4. The method according to any one of claims 1 to 3, wherein exporting the cleaned data to the intermediate table in batches according to the preset mapping relation comprises:
exporting the cleaned data to the intermediate table in batches according to the preset mapping relation by using a parallel operation programming model for data sets and an analyzer for converting the database into a distributed file system.
5. A dynamic data splitting device, characterized by comprising:
an acquisition module for acquiring a service system data source, cleaning the data source based on preset task conditions, and incrementing a task counter each time a cleaned data item is obtained;
a dividing module for monitoring the task counter and, if the task counter is determined to exceed the preset batch threshold, generating a corresponding batch number, until the batch division of the cleaned data is completed;
and an export module for exporting, based on the batch number, the cleaned data to an intermediate table in batches according to a preset mapping relation.
6. The device of claim 5, wherein, before the acquisition module cleans the data source based on the preset task conditions, the device is further used for:
setting the task conditions, the batch threshold (batch times), and the mapping relation in a configuration file, and loading the configuration file into memory.
7. The device of claim 5, wherein the dividing module generating a corresponding batch number comprises:
acquiring the current timestamp, and generating the corresponding batch number by concatenating the timestamp with the hash value of the task identification code.
8. The device according to any one of claims 5 to 7, wherein the export module exporting the cleaned data to the intermediate table in batches according to the preset mapping relation comprises:
exporting the cleaned data to the intermediate table in batches according to the preset mapping relation by using a parallel operation programming model for data sets and an analyzer for converting the database into a distributed file system.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-4.
10. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-4.
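As an illustration of the batch number generation in claim 3 and the mapped export in claim 1, the following is a minimal sketch under stated assumptions, not the claimed implementation. The function names `make_batch_number` and `export_batch`, the `batch_no` column, the use of SHA-256 truncated to eight hexadecimal characters, and the in-process SQLite database standing in for the intermediate table are all hypothetical choices; the claims do not specify a hash function, a number format, or a target database.

```python
import hashlib
import sqlite3
import time
from typing import Dict, List, Optional


def make_batch_number(task_id: str, timestamp_ms: Optional[int] = None) -> str:
    """Concatenate a millisecond timestamp with a short hash of the task ID."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    # Assumption: SHA-256, truncated to 8 hex characters, stands in for
    # "the hash value of the task identification code".
    task_hash = hashlib.sha256(task_id.encode("utf-8")).hexdigest()[:8]
    return f"{timestamp_ms}{task_hash}"


def export_batch(
    conn: sqlite3.Connection,
    table: str,
    mapping: Dict[str, str],
    batch_number: str,
    rows: List[Dict[str, object]],
) -> int:
    """Insert one batch into the intermediate table and return the row count.

    mapping maps source field names to intermediate-table column names.
    Table and column names are assumed to come from trusted configuration,
    because they are interpolated into the SQL text.
    """
    columns = list(mapping.values()) + ["batch_no"]
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    params = [tuple(row[src] for src in mapping) + (batch_number,) for row in rows]
    with conn:  # one transaction per batch: commit on success, roll back on error
        conn.executemany(sql, params)
    return len(params)
```

Writing each batch in its own transaction keeps every write to the intermediate table bounded in size, and the stored batch number lets a failed batch be identified and re-exported on its own.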
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911242162.5A CN112925772A (en) | 2019-12-06 | 2019-12-06 | Data dynamic splitting method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112925772A (en) | 2021-06-08 |
Family
ID=76161586
Family Applications (1)
Application Number | Status | Publication | Title | Priority Date | Filing Date |
---|---|---|---|---|---|
CN201911242162.5A | Pending | CN112925772A (en) | Data dynamic splitting method and device | 2019-12-06 | 2019-12-06 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112925772A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115964155A (en) * | 2023-03-16 | 2023-04-14 | 燧原智能科技(成都)有限公司 | On-chip data processing hardware, on-chip data processing method and AI platform |
Citations (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040098719A1 (en) * | 2002-11-15 | 2004-05-20 | International Business Machines Corporation | Auto-commit processing in an IMS batch application |
CN101697166A (en) * | 2009-10-28 | 2010-04-21 | 浪潮电子信息产业股份有限公司 | Method for accelerating data integration of heterogeneous platform |
CN103581296A (en) * | 2013-09-27 | 2014-02-12 | 深圳中科金证科技有限公司 | Data pushing method and device of POS system and regional health information platform |
CN103699495A (en) * | 2013-12-27 | 2014-04-02 | 乐视网信息技术(北京)股份有限公司 | Transmission device and transmission system for splitting data |
CN104731791A (en) * | 2013-12-18 | 2015-06-24 | 东阳艾维德广告传媒有限公司 | Marketing analysis data market system |
CN104820670A (en) * | 2015-03-13 | 2015-08-05 | 国家电网公司 | Method for acquiring and storing big data of power information |
CN105119763A (en) * | 2015-09-24 | 2015-12-02 | 烽火通信科技股份有限公司 | RIA-based Web network management client big data rapid export method and system |
CN105512237A (en) * | 2015-11-30 | 2016-04-20 | 用友网络科技股份有限公司 | Data introduction system with complex structure |
CN106951442A (en) * | 2017-02-15 | 2017-07-14 | 中国保险信息技术管理有限责任公司 | Data interactive method and device between a kind of heterogeneous database |
CN107451861A (en) * | 2017-07-27 | 2017-12-08 | 中兴软创科技股份有限公司 | A kind of method of user's online feature recognition under big data |
CN107688592A (en) * | 2017-04-06 | 2018-02-13 | 平安科技(深圳)有限公司 | The method and terminal of data cleansing |
CN108038126A (en) * | 2017-11-08 | 2018-05-15 | 中国平安人寿保险股份有限公司 | Method, apparatus, terminal device and storage medium derived from a kind of data |
CN108846076A (en) * | 2018-06-08 | 2018-11-20 | 山大地纬软件股份有限公司 | The massive multi-source ETL process method and system of supporting interface adaptation |
CN109034988A (en) * | 2018-07-26 | 2018-12-18 | 北京京东金融科技控股有限公司 | A kind of accounting entry generation method and device |
CN109241040A (en) * | 2017-07-10 | 2019-01-18 | 北京京东尚科信息技术有限公司 | The method and apparatus of data cleansing |
CN109583941A (en) * | 2018-11-06 | 2019-04-05 | 汪浩 | A kind of advertisement delivery system |
CN110134430A (en) * | 2019-04-12 | 2019-08-16 | 中国平安财产保险股份有限公司 | A kind of data packing method, device, storage medium and server |
CN110275918A (en) * | 2019-06-17 | 2019-09-24 | 浙江百应科技有限公司 | A kind of million rank excel data quick and stable import systems |
CN110377651A (en) * | 2019-06-20 | 2019-10-25 | 平安科技(深圳)有限公司 | Processing method, device, equipment and the storage medium of batch data |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111190888A (en) | Method and device for managing graph database cluster | |
CN112527649A (en) | Test case generation method and device | |
CN110795315A (en) | Method and device for monitoring service | |
CN110706093A (en) | Accounting processing method and device | |
CN110737676A (en) | Data query method and device | |
CN112947919A (en) | Method and device for constructing service model and processing service request | |
CN111858706A (en) | Data processing method and device | |
CN111062572A (en) | Task allocation method and device | |
CN110928594A (en) | Service development method and platform | |
CN116450622B (en) | Method, apparatus, device and computer readable medium for data warehouse entry | |
CN113326305A (en) | Method and device for processing data | |
CN112925772A (en) | Data dynamic splitting method and device | |
CN113760982A (en) | Data processing method and device | |
CN112685481A (en) | Data processing method and device | |
CN111241048A (en) | Web terminal log management method, device, medium and electronic equipment | |
CN116204428A (en) | Test case generation method and device | |
CN111949678A (en) | Method and device for processing non-accumulation indexes across time windows | |
CN114237765B (en) | Functional component processing method, device, electronic equipment and medium | |
CN111026629A (en) | Method and device for automatically generating test script | |
CN113515306B (en) | System transplanting method and device | |
CN112988857A (en) | Service data processing method and device | |
CN114490050A (en) | Data synchronization method and device | |
CN114817297A (en) | Method and device for processing data | |
CN113779018A (en) | Data processing method and device | |
CN113282455A (en) | Monitoring processing method and device |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |