CN109284335A - Method and apparatus for cross-database batch data transfer
- Publication number
- CN109284335A (application number CN201811050114.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- database
- memory
- slicer
- integration
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and apparatus for cross-database batch data transfer, comprising: segmenting the data to be transferred in a first database and sequentially generating multiple data slices; exporting each data slice to memory immediately after it is generated; and, when the data slices present in the memory reach a specified quantity, importing the data slices into a second database concurrently. The technical solution can perform cross-database batch transfer for different data or different types of data, improving batch data transfer efficiency and reducing transfer time.
Description
Technical field
The present invention relates to the field of data transmission, and more particularly to a method and apparatus for cross-database batch data transfer.
Background
The most common big-data search engine in the prior art is ElasticSearch (ES); data loaded into ES can be searched and queried quickly. One source of ES data is a relational database (such as the K-DB database). The prior art lacks a scheme for loading data from K-DB into ES rapidly and in batches. In the prior art, data is queried from the K-DB database and written to a data file (so-called file landing), and the landed data file is then imported into ES. Because this import is a full import, import efficiency is low when the data volume is large.
No effective solution has yet been proposed for the lack of a fast cross-database batch data transfer method in the prior art.
Summary of the invention
In view of this, the purpose of the embodiments of the present invention is to propose a method and apparatus for cross-database batch data transfer that can perform cross-database batch transfer for different data or different types of data, improve batch data transfer efficiency, and reduce transfer time.
Based on the above purpose, one aspect of the embodiments of the present invention provides a method for cross-database batch data transfer, comprising the following steps:
Segmenting the data to be transferred in a first database, and sequentially generating multiple data slices;
Exporting each data slice to memory immediately after it is generated;
When the data slices present in the memory reach a specified quantity, importing the data slices into a second database concurrently.
In some embodiments, importing the data slices into the second database concurrently comprises: importing the specified quantity of data slices into the second database concurrently.
In some embodiments, segmenting the data to be transferred in the first database comprises: obtaining a data slice threshold, and dividing the data to be transferred into multiple data slices of a specified size according to the data slice threshold.
In some embodiments, the data slices are stored in a buffer area of the memory, and no corresponding slice file is generated for the data slices.
In some embodiments, importing the data slices into the second database concurrently comprises: importing all data slices in the memory into the second database concurrently.
In some embodiments, when the first database sequentially generates multiple data slices, there is a generation time interval between every two adjacent data slices.
In some embodiments, the first database is the relational database K-DB, and the second database is the database of the ElasticSearch search engine.
In some embodiments, the data to be transferred in the K-DB database is segmented using a shell script, and the multiple data slices in the memory are imported into the database of the ElasticSearch search engine using bulk mode.
Another aspect of the embodiments of the present invention provides an apparatus for cross-database batch data transfer, comprising:
A first database;
A second database;
Main memory;
A storage device storing executable program code;
At least one processor which, when running the program code stored in the storage device, executes the above method for cross-database batch data transfer, transferring data from the first database to the second database via the main memory.
Another aspect of the embodiments of the present invention provides a database, comprising:
A storage device storing executable program code;
At least one processor which, when running the program code stored in the storage device, can transfer data in batches by executing the above method for cross-database batch data transfer.
The present invention has the following beneficial technical effects: in the method and apparatus for cross-database batch data transfer provided by the embodiments of the present invention, the data to be transferred in the first database is segmented and multiple data slices are generated sequentially; each data slice is exported to memory immediately after it is generated; and when the data slices present in the memory reach a specified quantity, the data slices are imported into the second database concurrently. This technical solution can perform cross-database batch transfer for different data or different types of data, improving batch data transfer efficiency and reducing transfer time.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a flow diagram of the method for cross-database batch data transfer provided by the present invention;
Fig. 2 is a data transfer schematic diagram of an embodiment of the method for cross-database batch data transfer provided by the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are further described below in conjunction with specific embodiments and with reference to the drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two non-identical entities or non-identical parameters that share the same name. "First" and "second" are used only for convenience of expression and should not be understood as limiting the embodiments of the present invention; subsequent embodiments do not repeat this explanation.
Based on the above purpose, a first aspect of the embodiments of the present invention proposes an embodiment of a method that can perform cross-database batch transfer for different data or different types of data. Fig. 1 shows a flow diagram of an embodiment of the method for cross-database batch data transfer provided by the present invention.
The method for cross-database batch data transfer comprises the following steps:
Step S101: segment the data to be transferred in the first database, and sequentially generate multiple data slices;
Step S103: export each data slice to memory immediately after it is generated;
Step S105: when the data slices present in the memory reach a specified quantity, import the data slices into the second database concurrently.
The present invention exports data from the first database to memory via JDBC, slices the data while it is being exported, and at the same time imports batches of sliced data into the second database in parallel, so that data is transferred quickly. JDBC is a Java application programming interface (API) for executing SQL statements; it consists of classes and interfaces written in the Java language and provides unified access to a variety of relational databases.
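The idea of slicing data while it is being exported can be illustrated with a minimal sketch. The patent uses JDBC against K-DB; the sketch below instead assumes Python's standard `sqlite3` module as a stand-in relational source, and the table name `records`, the function `export_slices`, and the slice size are hypothetical names chosen only for illustration.

```python
import sqlite3
from typing import Iterator, List, Tuple


def export_slices(conn: sqlite3.Connection, query: str,
                  slice_size: int) -> Iterator[List[Tuple]]:
    """Yield slices of at most slice_size rows while the result set is still being read."""
    cursor = conn.execute(query)
    while True:
        rows = cursor.fetchmany(slice_size)  # read the next slice from the open cursor
        if not rows:
            break
        yield rows


# Stand-in for the first database: an in-memory SQLite table with 25 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO records (id, payload) VALUES (?, ?)",
                 [(i, f"row-{i}") for i in range(25)])

slices = list(export_slices(conn, "SELECT id, payload FROM records ORDER BY id",
                            slice_size=10))
slice_lengths = [len(s) for s in slices]
```

Because `export_slices` is a generator, each slice becomes available to the caller as soon as it has been read, without waiting for the full query result.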
Those of ordinary skill in the art will appreciate that all or part of the processes in the above method embodiments can be completed by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, can include the processes of the above method embodiments. The storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like. The embodiments of the computer program can achieve effects identical or similar to those of any corresponding method embodiment described above.
In some embodiments, when multiple data slices exist in the memory, the multiple data slices are imported into the second database concurrently.
In some embodiments, segmenting the data to be transferred in the first database comprises: obtaining a data slice threshold, and dividing the data to be transferred into multiple data slices of a specified size according to the data slice threshold. The data slice threshold can be a number of records; for example, every 10,000 records can form one data slice.
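Threshold-based slicing by record count can be sketched as follows. This is only an illustration under the assumption that the threshold is a record count, as in the 10,000-record example; the function name `slice_records` is hypothetical and does not come from the patent.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def slice_records(records: Iterable[T], threshold: int) -> Iterator[List[T]]:
    """Split records into consecutive slices of `threshold` items; the last slice may be shorter."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    iterator = iter(records)
    while True:
        data_slice = list(islice(iterator, threshold))
        if not data_slice:
            return
        yield data_slice


# 25,000 records with a threshold of 10,000 give two full slices and one partial slice.
sizes = [len(s) for s in slice_records(range(25_000), threshold=10_000)]
```

With N records and a threshold T, this produces ceil(N / T) slices.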
The method disclosed according to the embodiments of the present invention can also be implemented as a computer program executed by a CPU, and the computer program can be stored in a computer-readable storage medium. When the computer program is executed by the CPU, it performs the above functions defined in the method disclosed in the embodiments of the present invention. The above method steps can also be implemented using a controller and a computer-readable storage medium storing a computer program that causes the controller to implement the above steps.
In some embodiments, the data slices are stored in a buffer area of the memory, and no corresponding slice file is generated for the data slices. The data slices are stored in memory in JSON format, which is the default data format of ElasticSearch.
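A minimal sketch of keeping slices in an in-memory buffer as JSON, with no slice file written, might look like this. The column names, the `memory_buffer` list, and the function `slice_to_json_lines` are assumptions made for illustration only.

```python
import json
from typing import List, Sequence


def slice_to_json_lines(columns: Sequence[str],
                        rows: Sequence[Sequence]) -> List[str]:
    """Serialize each row of a slice to one JSON document string."""
    return [json.dumps(dict(zip(columns, row)), ensure_ascii=False) for row in rows]


# Hypothetical in-memory buffer: slices live only in this list, never on disk.
memory_buffer: List[List[str]] = []
memory_buffer.append(
    slice_to_json_lines(["id", "name"], [(1, "alice"), (2, "bob")])
)

first_doc = json.loads(memory_buffer[0][0])
```

Holding the serialized documents in memory avoids the file landing step described in the background section.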
The computer-readable storage medium described herein (such as the buffer area of the memory) can be volatile memory or non-volatile memory, or can include both volatile and non-volatile memory. By way of example and not limitation, non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM), which can serve as external cache memory. By way of example and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to include, without being limited to, these and other suitable types of memory.
In some embodiments, importing the data slices into the second database concurrently comprises: importing the specified quantity of data slices into the second database concurrently. In other embodiments, importing the data slices into the second database concurrently comprises: importing all data slices in the memory into the second database concurrently.
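The two variants (import exactly the specified quantity, or import everything currently in memory) can be contrasted in a small sketch. The function `take_for_import` and its `take_all` flag are hypothetical names; the patent does not prescribe how the selection is implemented.

```python
from collections import deque
from typing import Deque, List


def take_for_import(buffer: Deque[list], specified_quantity: int,
                    take_all: bool = False) -> List[list]:
    """Remove and return the slices to import now; return [] while below the threshold."""
    if len(buffer) < specified_quantity:
        return []
    count = len(buffer) if take_all else specified_quantity
    return [buffer.popleft() for _ in range(count)]


buffer: Deque[list] = deque([["a"], ["b"], ["c"], ["d"], ["e"]])

# Variant 1: import exactly the specified quantity (2 slices), leaving 3 in memory.
batch_fixed = take_for_import(buffer, specified_quantity=2)

# Variant 2: once the threshold is met, import every slice currently in memory.
batch_all = take_for_import(buffer, specified_quantity=2, take_all=True)
```

The first variant keeps each concurrent import the same size; the second drains the buffer whenever the threshold is reached.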
In some embodiments, when the first database sequentially generates multiple data slices, there is a generation time interval between every two adjacent data slices. This time interval is determined by the processing speed on the first database side.
The steps of the method or algorithm described in connection with the disclosure herein can be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module can reside in RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. In an alternative, the storage medium can be integrated with the processor. The processor and the storage medium can reside in an ASIC, and the ASIC can reside in a user terminal. In an alternative, the processor and the storage medium can reside in a user terminal as discrete components.
In some embodiments, the first database is the relational database K-DB, and the second database is the database of the ElasticSearch search engine. K-DB is a domestic relational database independently developed by Inspur, functionally similar to the Oracle database. ElasticSearch is an enterprise search engine developed in Java and used for real-time search in cloud computing; it is stable, reliable, fast, and easy to install and use.
In some embodiments, the data to be transferred in the K-DB database is segmented using a shell script, and the multiple data slices in the memory are imported into the database of the ElasticSearch search engine using bulk mode. A shell script is a set of operating-system commands written in the command language of the Linux operating system. Bulk mode is a batch mode: data is imported multiple records at a time rather than one record at a time.
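As a rough illustration of bulk mode, the sketch below builds a request body in the newline-delimited JSON layout accepted by the ElasticSearch `_bulk` endpoint, where each document is preceded by an action line. The index name `records` and the function `build_bulk_body` are assumptions; the sketch only builds the payload and does not send it.

```python
import json
from typing import Dict, Iterable


def build_bulk_body(index_name: str, documents: Iterable[Dict]) -> str:
    """Build a newline-delimited JSON payload: one action line, then one source line, per document."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index_name}}))  # action line
        lines.append(json.dumps(doc, ensure_ascii=False))            # document source
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


body = build_bulk_body("records", [{"id": 1}, {"id": 2}])
```

A single request carrying this body imports both documents at once, which is the difference from importing one record per request.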
As shown in Fig. 2, the memory side connects to the K-DB database via JDBC and executes SQL statements to query the required data and determine the data to be sent. While the data is being determined, a shell script is executed to generate data slices according to a given requirement (for example, every 10,000 records). Once the number or total size of the pending data slices reaches a certain threshold, the data is imported in batches using bulk mode, and the data of the data slices is imported into ES in parallel. Generating data slices, exporting data slices, and importing data slices proceed simultaneously.
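The overlap of slice generation and concurrent import can be sketched with a producer thread, an in-memory queue, and a thread pool. This is a simplified model under stated assumptions, not the patented implementation: `fake_import` stands in for a bulk import into ES, and `SLICE_SIZE`, `BATCH_SLICES`, and all function names are hypothetical.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List

SLICE_SIZE = 3     # assumed records per slice (the text uses 10,000)
BATCH_SLICES = 2   # assumed "specified quantity" of slices per concurrent import
_DONE = object()   # sentinel marking the end of the export

imported: List[int] = []
_lock = threading.Lock()


def fake_import(data_slice: List[int]) -> int:
    """Stand-in for a bulk import of one slice into the second database."""
    with _lock:
        imported.extend(data_slice)
    return len(data_slice)


def producer(records: List[int], buffer: "queue.Queue") -> None:
    """Generate slices sequentially and put each one in memory as soon as it exists."""
    for start in range(0, len(records), SLICE_SIZE):
        buffer.put(records[start:start + SLICE_SIZE])
    buffer.put(_DONE)


def consumer(buffer: "queue.Queue") -> int:
    """Collect slices and import them concurrently once enough have accumulated."""
    total = 0
    pending: List[List[int]] = []
    with ThreadPoolExecutor(max_workers=BATCH_SLICES) as pool:
        while True:
            item = buffer.get()
            if item is not _DONE:
                pending.append(item)
            if len(pending) >= BATCH_SLICES or (item is _DONE and pending):
                total += sum(pool.map(fake_import, pending))  # concurrent import
                pending = []
            if item is _DONE:
                return total


def run_pipeline(records: List[int]) -> int:
    buffer: "queue.Queue" = queue.Queue()
    thread = threading.Thread(target=producer, args=(records, buffer))
    thread.start()
    total = consumer(buffer)
    thread.join()
    return total


total_imported = run_pipeline(list(range(10)))
```

In this sketch the producer keeps generating slices while the consumer is busy importing a batch, so generation and import overlap in time; the order in which records arrive in `imported` is not guaranteed.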
The various exemplary databases described in connection with the disclosure herein can be implemented as electronic hardware, computer software, or a combination of both. To illustrate this interchangeability of hardware and software clearly, the functions of the various exemplary databases have been described in general terms. Whether such a function is implemented as software or as hardware depends on the specific application and on the design constraints imposed on the overall system. Those skilled in the art can implement the function in various ways for each specific application, but such implementation decisions should not be interpreted as departing from the scope disclosed by the embodiments of the present invention.
As can be seen from the above embodiments, in the method for cross-database batch data transfer provided by the embodiments of the present invention, the data to be transferred in the first database is segmented and multiple data slices are generated sequentially; each data slice is exported to memory immediately after it is generated; and when the data slices present in the memory reach a specified quantity, the data slices are imported into the second database concurrently. This technical solution can perform cross-database batch transfer for different data or different types of data, improving batch data transfer efficiency and reducing transfer time.
It should be particularly pointed out that the steps in the embodiments of the above method for cross-database batch data transfer can be interleaved, replaced, added, or deleted. Therefore, methods for cross-database batch data transfer resulting from such reasonable permutations, combinations, and transformations should also fall within the protection scope of the present invention, and the protection scope of the present invention should not be limited to the described embodiments.
Based on the above purpose, a second aspect of the embodiments of the present invention proposes an embodiment of an apparatus that can perform cross-database batch transfer for different data or different types of data. The apparatus comprises:
A first database;
A second database;
Main memory;
A storage device storing executable program code;
At least one processor which, when running the program code stored in the storage device, executes the above method for cross-database batch data transfer to transfer data from the first database to the second database via the main memory.
The apparatus, devices, and the like disclosed in the embodiments of the present invention can be various electronic terminal devices, such as a mobile phone, a personal digital assistant (PDA), a tablet computer (PAD), or a smart TV, or can be large terminal devices such as servers. Therefore, the protection scope disclosed by the embodiments of the present invention should not be limited to a certain specific type of apparatus or device. The client disclosed in the embodiments of the present invention can be applied to any of the above electronic terminal devices in the form of electronic hardware, computer software, or a combination of both.
Based on the above purpose, a third aspect of the embodiments of the present invention proposes an embodiment of a database that can perform cross-database batch transfer for different data or different types of data. The database comprises a storage device storing executable program code and at least one processor; when running the program code stored in the storage device, the processor can transfer data in batches by executing the above method for cross-database batch data transfer.
The various exemplary databases described in connection with the disclosure herein can be implemented or executed using the following components designed to perform the functions described here: a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these components. The general-purpose processor can be a microprocessor, but alternatively the processor can be any conventional processor, controller, microcontroller, or state machine. The processor can also be implemented as a combination of computing devices, for example a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors combined with a DSP, and/or any other such configuration.
As can be seen from the above embodiments, in the apparatus and database for cross-database batch data transfer provided by the embodiments of the present invention, the data to be transferred in the first database is segmented and multiple data slices are generated sequentially; each data slice is exported to memory immediately after it is generated; and when the data slices present in the memory reach a specified quantity, the data slices are imported into the second database concurrently. This technical solution can perform cross-database batch transfer for different data or different types of data, improving batch data transfer efficiency and reducing transfer time.
It should be particularly pointed out that the embodiments of the above apparatus and database for cross-database batch data transfer use the embodiments of the method for cross-database batch data transfer to illustrate the working process of each module, and those skilled in the art can readily conceive of applying these modules to other embodiments of the method. Of course, since the steps in the embodiments of the method can be interleaved, replaced, added, or deleted, apparatuses and databases for cross-database batch data transfer resulting from such reasonable permutations, combinations, and transformations should also fall within the protection scope of the present invention, and the protection scope of the present invention should not be limited to the described embodiments.
The above are exemplary embodiments disclosed by the present invention. It should be noted that many changes and modifications can be made without departing from the scope of the embodiments of the present invention as defined by the claims. The functions, steps, and/or actions of the method claims according to the disclosed embodiments described herein need not be performed in any particular order. In addition, although elements disclosed in the embodiments of the present invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular form "a" is intended to include the plural form as well, unless the context clearly supports an exception. It should further be understood that "and/or" as used herein refers to any and all possible combinations of one or more of the associated listed items. The sequence numbers of the embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
Those of ordinary skill in the art should understand that the discussion of any of the above embodiments is exemplary only and is not intended to imply that the scope disclosed by the embodiments of the present invention (including the claims) is limited to these examples. Within the idea of the embodiments of the present invention, the technical features in the above embodiments or in different embodiments can also be combined, and many other variations of the different aspects of the embodiments of the present invention exist as described above; for brevity, they are not provided in detail. Therefore, any omission, modification, equivalent replacement, or improvement made within the spirit and principles of the embodiments of the present invention should be included within the protection scope of the embodiments of the present invention.
Claims (10)
1. A method for cross-database batch data transfer, characterized by comprising the following steps:
Segmenting the data to be transferred in a first database, and sequentially generating multiple data slices;
Exporting each data slice to memory immediately after it is generated;
When the data slices present in the memory reach a specified quantity, importing the data slices into a second database concurrently.
2. The method according to claim 1, characterized in that importing the data slices into the second database concurrently comprises: importing the specified quantity of the data slices into the second database concurrently.
3. The method according to claim 1, characterized in that segmenting the data to be transferred in the first database comprises: obtaining a data slice threshold, and dividing the data to be transferred into multiple data slices of a specified size according to the data slice threshold.
4. The method according to claim 1, characterized in that the data slices are stored in a buffer area of the memory, and no corresponding slice file is generated for the data slices.
5. The method according to claim 1, characterized in that importing the data slices into the second database concurrently comprises: importing all data slices in the memory into the second database concurrently.
6. The method according to claim 1, characterized in that, when the first database sequentially generates multiple data slices, there is a generation time interval between every two adjacent data slices.
7. The method according to any one of claims 1-6, characterized in that the first database is the relational database K-DB, and the second database is the database of the ElasticSearch search engine.
8. The method according to claim 7, characterized in that the data to be transferred in the K-DB database is segmented using a shell script, and the multiple data slices in the memory are imported into the database of the ElasticSearch search engine using bulk mode.
9. An apparatus for cross-database batch data transfer, characterized by comprising:
A first database;
A second database;
Main memory;
A storage device storing executable program code;
At least one processor which, when running the program code stored in the storage device, executes the method for cross-database batch data transfer according to any one of claims 1-8, transferring data from the first database to the second database via the main memory.
10. A database, characterized by comprising:
A storage device storing executable program code;
At least one processor which, when running the program code stored in the storage device, can transfer data in batches by executing the method for cross-database batch data transfer according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811050114.1A CN109284335A (en) | 2018-09-10 | 2018-09-10 | Method and apparatus for cross-database batch data transfer |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109284335A (en) | 2019-01-29 |
Family
ID=65181038
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811050114.1A Pending CN109284335A (en) | Method and apparatus for cross-database batch data transfer | 2018-09-10 | 2018-09-10 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109284335A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101515291A (en) * | 2009-03-26 | 2009-08-26 | 北京泰合佳通信息技术有限公司 | Method for leading data into database in a batch way and system thereof |
CN103699638A (en) * | 2013-12-23 | 2014-04-02 | 国云科技股份有限公司 | Method for realizing cross-database type synchronous data based on configuration parameters |
CN104598563A (en) * | 2015-01-08 | 2015-05-06 | 北京京东尚科信息技术有限公司 | High concurrency data storage method and device |
CN105808577A (en) * | 2014-12-29 | 2016-07-27 | 北京神州泰岳软件股份有限公司 | HBase database-based data batch loading method and device |
CN105843933A (en) * | 2016-03-30 | 2016-08-10 | 电子科技大学 | Index building method for distributed memory columnar database |
CN105843955A (en) * | 2016-04-13 | 2016-08-10 | 曙光信息产业(北京)有限公司 | Data migration system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190129 |