CN109284335A - Method and apparatus for cross-database batch data transfer - Google Patents

Method and apparatus for cross-database batch data transfer

Info

Publication number
CN109284335A
CN109284335A CN201811050114.1A CN201811050114A CN109284335A CN 109284335 A CN109284335 A CN 109284335A CN 201811050114 A CN201811050114 A CN 201811050114A CN 109284335 A CN109284335 A CN 109284335A
Authority
CN
China
Prior art keywords
data
database
memory
slicer
integration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811050114.1A
Other languages
Chinese (zh)
Inventor
魏本帅
杜彦魁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201811050114.1A
Publication of CN109284335A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and apparatus for cross-database batch data transfer, comprising: segmenting the data to be transferred in a first database to generate multiple data slices in sequence; exporting each data slice to memory immediately after it is generated; and, when the number of data slices in the memory reaches a specified quantity, importing the data slices into a second database concurrently. The technical solution of the invention can transfer different data, or data of different types, across databases in batches, improving batch data transfer efficiency and reducing transfer time.

Description

Method and apparatus for cross-database batch data transfer
Technical field
The present invention relates to the field of data transmission, and more particularly to a method and apparatus for cross-database batch data transfer.
Background
In the prior art, the most commonly used big-data search engine is ElasticSearch (ES); data loaded into ES can be searched and queried quickly. One source of ES data is a relational database (such as the K-DB database). The prior art lacks a scheme for loading data from K-DB into ES quickly and in batches. Existing approaches query the data out of the K-DB database and write it to a data file (so-called file landing), then import the landed data file into ES. Because this is a full import, import efficiency is low when the data volume is large.
No effective solution has yet been proposed for the lack of a fast cross-database batch data transfer method in the prior art.
Summary of the invention
In view of this, an object of the embodiments of the present invention is to provide a method and apparatus for cross-database batch data transfer that can transfer different data, or data of different types, across databases in batches, improving batch data transfer efficiency and reducing transfer time.
To this end, one aspect of the embodiments of the present invention provides a method for cross-database batch data transfer, comprising the following steps:
Segmenting the data to be transferred in a first database to generate multiple data slices in sequence;
Exporting each data slice to memory immediately after it is generated;
When the number of data slices in the memory reaches a specified quantity, importing the data slices into a second database concurrently.
In some embodiments, importing the data slices into the second database concurrently comprises: importing the specified quantity of data slices into the second database concurrently.
In some embodiments, segmenting the data to be transferred in the first database comprises: obtaining a data slice threshold, and dividing the data to be transferred into multiple data slices of a specified size according to the data slice threshold.
In some embodiments, the data slices are stored in a buffer area of the memory, and no corresponding slice file is generated for the data slices.
In some embodiments, importing the data slices into the second database concurrently comprises: importing all data slices in the memory into the second database concurrently.
In some embodiments, when the first database generates multiple data slices in sequence, there is a generation time interval between every two adjacent data slices.
In some embodiments, the first database is the relational database K-DB, and the second database is the database of the ElasticSearch search engine.
In some embodiments, the data to be transferred in the K-DB database are segmented using a shell script, and the multiple data slices in the memory are imported into the database of the ElasticSearch search engine using bulk mode.
Another aspect of the embodiments of the present invention further provides an apparatus for cross-database batch data transfer, comprising:
A first database;
A second database;
Memory;
A storage device storing executable program code;
At least one processor that, when running the program code stored in the storage device, executes the above method for cross-database batch data transfer to transfer data from the first database to the second database via the memory.
Another aspect of the embodiments of the present invention further provides a database, comprising:
A storage device storing executable program code;
At least one processor that, when running the program code stored in the storage device, can transfer data in batches by executing the above method for cross-database batch data transfer.
The present invention has the following advantageous technical effects: the method and apparatus for cross-database batch data transfer provided by the embodiments of the present invention segment the data to be transferred in the first database to generate multiple data slices in sequence, export each data slice to memory immediately after it is generated, and import the data slices into the second database concurrently when the number of data slices in the memory reaches a specified quantity. This technical solution can transfer different data, or data of different types, across databases in batches, improving batch data transfer efficiency and reducing transfer time.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of the method for cross-database batch data transfer provided by the present invention;
Fig. 2 is a data transmission diagram of an embodiment of the method for cross-database batch data transfer provided by the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below in conjunction with specific embodiments and with reference to the drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two different entities or parameters that share the same name. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; subsequent embodiments do not repeat this explanation.
To this end, a first aspect of the embodiments of the present invention provides an embodiment of a method capable of cross-database batch transfer for different data or data of different types. Fig. 1 shows a flow diagram of an embodiment of the method for cross-database batch data transfer provided by the present invention.
The method for cross-database batch data transfer comprises the following steps:
Step S101: segmenting the data to be transferred in a first database to generate multiple data slices in sequence;
Step S103: exporting each data slice to memory immediately after it is generated;
Step S105: when the number of data slices in the memory reaches a specified quantity, importing the data slices into a second database concurrently.
The present invention exports data from the first database to memory via JDBC, slices the data while it is being exported, and at the same time imports the sliced batches into the second database in parallel, thereby achieving fast data transfer. JDBC is a Java application programming interface (API) for executing SQL statements; it consists of classes and interfaces written in Java and provides unified access to a variety of relational databases.
Those of ordinary skill in the art will appreciate that all or part of the processes in the above method embodiments can be carried out by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like. An embodiment of the computer program can achieve effects identical or similar to those of any corresponding method embodiment described above.
In some embodiments, when multiple data slices are present in the memory, the multiple data slices are imported into the second database concurrently.
In some embodiments, segmenting the data to be transferred in the first database comprises: obtaining a data slice threshold, and dividing the data to be transferred into multiple data slices of a specified size according to the data slice threshold. The data slice threshold may be a number of records; for example, every 10,000 records may form one data slice.
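The threshold-based segmentation can be illustrated with a minimal Python sketch. The function name `slice_records` and the in-memory `range` used as input are assumptions made for illustration only; the disclosure itself performs this step against K-DB with a shell script.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def slice_records(records: Iterable[T], slice_threshold: int) -> Iterator[List[T]]:
    """Yield consecutive slices holding at most `slice_threshold` records each."""
    if slice_threshold <= 0:
        raise ValueError("slice_threshold must be positive")
    it = iter(records)
    while True:
        chunk = list(islice(it, slice_threshold))
        if not chunk:
            return
        yield chunk


# 25,000 records with a threshold of 10,000 (the example value from the text).
sizes = [len(s) for s in slice_records(range(25_000), 10_000)]
```

Because the function is a generator, each slice becomes available as soon as it is filled, which matches the requirement that a slice be exported immediately after it is generated.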
The method disclosed in the embodiments of the present invention may also be implemented as a computer program executed by a CPU, and the computer program may be stored in a computer-readable storage medium. When the computer program is executed by the CPU, it performs the functions defined in the method disclosed in the embodiments of the present invention. The above method steps may also be implemented using a controller and a computer-readable storage medium storing a computer program that causes the controller to carry out the steps.
In some embodiments, the data slices are stored in a buffer area of the memory, and no corresponding slice file is generated for the data slices. The data slices are stored in memory in JSON format, which is the default data format of ElasticSearch.
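A minimal sketch of holding one slice in memory as JSON without creating a slice file might look as follows. The helper name `slice_to_json_buffer` and the sample rows are hypothetical; the choice of one JSON object per line is an assumption, since the text only states that slices are kept in JSON format.

```python
import io
import json
from typing import Dict, List


def slice_to_json_buffer(rows: List[Dict[str, object]]) -> io.StringIO:
    """Serialize one slice as newline-delimited JSON in an in-memory buffer."""
    buffer = io.StringIO()  # lives in RAM; nothing is written to disk
    for row in rows:
        buffer.write(json.dumps(row, ensure_ascii=False))
        buffer.write("\n")
    buffer.seek(0)  # rewind so the importer can read from the start
    return buffer


rows = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]
buf = slice_to_json_buffer(rows)
decoded = [json.loads(line) for line in buf]
```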
The computer-readable storage medium described herein (such as the buffer area of the memory) may be volatile memory or non-volatile memory, or may include both. By way of example and not limitation, non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM), which can act as external cache memory. By way of example and not limitation, RAM is available in many forms, such as synchronous RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to include, without being limited to, these and other suitable types of memory.
In some embodiments, importing the data slices into the second database concurrently comprises: importing the specified quantity of data slices into the second database concurrently. In other embodiments, importing the data slices into the second database concurrently comprises: importing all data slices in the memory into the second database concurrently.
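The two import policies can be contrasted in a short Python sketch. The thread pool, the function names, and the use of `len` as a stand-in importer are all illustrative assumptions; the text does not specify how concurrency is achieved.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Slice = List[dict]


def import_concurrently(slices: List[Slice], import_slice: Callable[[Slice], int]) -> int:
    """Import every slice in `slices` in parallel; return the total records imported."""
    if not slices:
        return 0
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        return sum(pool.map(import_slice, slices))


def flush_specified(buffer: List[Slice], quantity: int,
                    import_slice: Callable[[Slice], int]) -> int:
    """Policy 1: import exactly `quantity` slices once that many are buffered."""
    if len(buffer) < quantity:
        return 0
    batch = buffer[:quantity]
    del buffer[:quantity]
    return import_concurrently(batch, import_slice)


def flush_all(buffer: List[Slice], import_slice: Callable[[Slice], int]) -> int:
    """Policy 2: import whatever is currently held in memory."""
    batch = buffer[:]
    buffer.clear()
    return import_concurrently(batch, import_slice)


buffer = [[{"id": i}] * 3 for i in range(5)]      # five slices of three records each
imported_first = flush_specified(buffer, 4, len)  # `len` stands in for a real importer
remaining = len(buffer)
imported_rest = flush_all(buffer, len)
```

Under the first policy one slice stays in memory until more arrive; the second policy drains the buffer regardless of how many slices it holds.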
In some embodiments, when the first database generates multiple data slices in sequence, there is a generation time interval between every two adjacent data slices. This time interval is determined by the processing speed on the first-database side.
The steps of the method or algorithm described in connection with the disclosure herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integrated with the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a user terminal. Alternatively, the processor and the storage medium may reside in the user terminal as discrete components.
In some embodiments, the first database is the relational database K-DB, and the second database is the database of the ElasticSearch search engine. K-DB is a domestic relational database developed in-house by Inspur and is functionally similar to the Oracle database. ElasticSearch is an enterprise search engine developed in Java and used for real-time search in cloud computing; it is stable, reliable, fast, and easy to install and use.
In some embodiments, the data to be transferred in the K-DB database are segmented using a shell script, and the multiple data slices in the memory are imported into the database of the ElasticSearch search engine using bulk mode. Shell is the command language of the Linux operating system, that is, a collection of operating-system commands. Bulk mode is a batch mode: data are not imported one record at a time; instead, many records are imported at once.
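As a hedged illustration of bulk mode, the sketch below builds a request body in the newline-delimited JSON layout accepted by the Elasticsearch bulk endpoint: one `index` action line followed by one source line per record. The index name `kdb_export` and the helper `build_bulk_body` are hypothetical, no request is actually sent, and older Elasticsearch versions may additionally require a document type.

```python
import json
from typing import Dict, Iterable


def build_bulk_body(index_name: str, rows: Iterable[Dict[str, object]]) -> str:
    """Build a bulk body: one action line plus one source line per row."""
    lines = []
    for row in rows:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(row, ensure_ascii=False))
    if not lines:
        return ""
    # The bulk endpoint expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body("kdb_export", [{"id": 1}, {"id": 2}])
body_lines = body.splitlines()
```

One such body carries a whole data slice, so a slice of 10,000 records becomes a single request instead of 10,000 separate ones.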
As shown in Fig. 2, the memory connects to the K-DB database via JDBC and executes an SQL statement to query the required data and thereby determine the data to be sent. While the data are being determined, the shell script is executed to generate data slices according to a given requirement (for example, every 10,000 records). Once the number or total size of the pending data slices reaches a certain threshold, the data of the slices are imported into ES in parallel using bulk mode. Generating data slices, exporting data slices, and importing data slices are carried out simultaneously.
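The overlap of slice generation, export, and import described for Fig. 2 can be sketched as a producer/consumer pipeline. Everything concrete here is an assumption for illustration: an in-memory SQLite table stands in for K-DB, a Python list stands in for ES, a `queue.Queue` plays the role of the memory buffer, and the constants are arbitrary.

```python
import queue
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

SLICE_SIZE = 1000   # records per slice (assumed value)
BATCH_COUNT = 3     # slices buffered before a concurrent import (assumed value)
DONE = object()     # sentinel marking the end of the export


def importer(slices_q: queue.Queue, second_db: list, lock: threading.Lock) -> None:
    """Collect slices from memory; import them in parallel once BATCH_COUNT is reached."""
    def import_slice(rows):
        with lock:  # stand-in for one bulk request to the second database
            second_db.extend(rows)

    pending = []
    with ThreadPoolExecutor(max_workers=BATCH_COUNT) as pool:
        while True:
            item = slices_q.get()
            if item is not DONE:
                pending.append(item)
            if len(pending) >= BATCH_COUNT or (item is DONE and pending):
                list(pool.map(import_slice, pending))  # wait for this batch to finish
                pending = []
            if item is DONE:
                return


# First database: an in-memory SQLite table stands in for K-DB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO t (id, name) VALUES (?, ?)",
                 [(i, f"row-{i}") for i in range(3500)])

slices_q: queue.Queue = queue.Queue()
second_db: list = []
worker = threading.Thread(target=importer, args=(slices_q, second_db, threading.Lock()))
worker.start()  # importing runs while slices are still being generated

cursor = conn.execute("SELECT id, name FROM t ORDER BY id")
while True:
    rows = cursor.fetchmany(SLICE_SIZE)  # generate one slice
    if not rows:
        break
    slices_q.put(rows)                   # export it to memory immediately
slices_q.put(DONE)
worker.join()
conn.close()

imported_ids = sorted(r[0] for r in second_db)
```

The importer thread starts before the first slice exists, so importing earlier slices proceeds while later slices are still being read from the source.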
The various exemplary databases described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the functions of the various exemplary databases have been described in general terms. Whether such functionality is implemented as software or as hardware depends on the specific application and on the design constraints imposed on the overall system. Those skilled in the art may implement the functionality in various ways for each specific application, but such implementation decisions should not be interpreted as departing from the scope disclosed by the embodiments of the present invention.
As can be seen from the above embodiments, the method for cross-database batch data transfer provided by the embodiments of the present invention segments the data to be transferred in the first database to generate multiple data slices in sequence, exports each data slice to memory immediately after it is generated, and imports the data slices into the second database concurrently when the number of data slices in the memory reaches a specified quantity. This technical solution can transfer different data, or data of different types, across databases in batches, improving batch data transfer efficiency and reducing transfer time.
It should be particularly noted that the steps in the embodiments of the above method for cross-database batch data transfer may be interleaved, replaced, added, or deleted. Methods for cross-database batch data transfer obtained through such reasonable permutations and combinations therefore also fall within the scope of protection of the present invention, which should not be limited to the described embodiments.
To this end, a second aspect of the embodiments of the present invention provides an embodiment of an apparatus capable of cross-database batch transfer for different data or data of different types. The apparatus comprises:
A first database;
A second database;
Memory;
A storage device storing executable program code;
At least one processor that, when running the program code stored in the storage device, executes the above method for cross-database batch data transfer to transfer data from the first database to the second database via the memory.
The apparatus, devices, and the like disclosed in the embodiments of the present invention may be various electronic terminal devices, such as mobile phones, personal digital assistants (PDA), tablet computers (PAD), and smart televisions, or large terminal devices such as servers; the scope of protection disclosed in the embodiments of the present invention should therefore not be limited to a particular type of apparatus or device. The client disclosed in the embodiments of the present invention may be applied to any of the above electronic terminal devices in the form of electronic hardware, computer software, or a combination of both.
To this end, a third aspect of the embodiments of the present invention provides an embodiment of a database capable of cross-database batch transfer for different data or data of different types. The database comprises a storage device storing executable program code and at least one processor; when running the program code stored in the storage device, the processor can transfer data in batches by executing the above method for cross-database batch data transfer.
The various exemplary databases described in connection with the disclosure herein may be implemented or executed using the following components designed to perform the functions described herein: a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these components. The general-purpose processor may be a microprocessor; alternatively, the processor may be any conventional processor, controller, microcontroller, or state machine. The processor may also be implemented as a combination of computing devices, for example a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a DSP, and/or any other such configuration.
As can be seen from the above embodiments, the apparatus and database for cross-database batch data transfer provided by the embodiments of the present invention segment the data to be transferred in the first database to generate multiple data slices in sequence, export each data slice to memory immediately after it is generated, and import the data slices into the second database concurrently when the number of data slices in the memory reaches a specified quantity. This technical solution can transfer different data, or data of different types, across databases in batches, improving batch data transfer efficiency and reducing transfer time.
It should be particularly noted that the embodiments of the above apparatus and database for cross-database batch data transfer use the embodiments of the method for cross-database batch data transfer to describe the working process of each module; those skilled in the art can readily conceive of applying these modules to other embodiments of the method. Since the steps in the embodiments of the method may be interleaved, replaced, added, or deleted, apparatuses and databases for cross-database batch data transfer obtained through such reasonable permutations and combinations also fall within the scope of protection of the present invention, which should not be limited to the described embodiments.
The foregoing are exemplary embodiments disclosed by the present invention. It should be noted that many changes and modifications may be made without departing from the scope of the disclosed embodiments as defined by the claims. The functions, steps, and/or actions of the method claims according to the disclosed embodiments described herein need not be performed in any particular order. In addition, although elements disclosed in the embodiments of the present invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular form "a" is intended to include the plural form as well, unless the context clearly indicates otherwise. It should further be understood that "and/or" as used herein refers to any and all possible combinations of one or more of the associated listed items. The serial numbers of the embodiments disclosed herein are for description only and do not indicate the relative merits of the embodiments.
Those of ordinary skill in the art should understand that the discussion of any of the above embodiments is exemplary only and is not intended to imply that the scope disclosed by the embodiments of the present invention (including the claims) is limited to these examples. Within the spirit of the embodiments of the present invention, the technical features of the above embodiments or of different embodiments may also be combined, and there exist many other variations of the different aspects of the embodiments as described above, which are not provided in detail for the sake of brevity. Therefore, any omission, modification, equivalent replacement, or improvement made within the spirit and principles of the embodiments of the present invention shall fall within the scope of protection of the embodiments of the present invention.

Claims (10)

1. A method for cross-database batch data transfer, characterized by comprising the following steps:
Segmenting the data to be transferred in a first database to generate multiple data slices in sequence;
Exporting each data slice to memory immediately after the data slice is generated;
When the number of data slices in the memory reaches a specified quantity, importing the data slices into a second database concurrently.
2. The method according to claim 1, characterized in that importing the data slices into the second database concurrently comprises: importing the specified quantity of data slices into the second database concurrently.
3. The method according to claim 1, characterized in that segmenting the data to be transferred in the first database comprises: obtaining a data slice threshold, and dividing the data to be transferred into multiple data slices of a specified size according to the data slice threshold.
4. The method according to claim 1, characterized in that the data slices are stored in a buffer area of the memory, and no corresponding slice file is generated for the data slices.
5. The method according to claim 1, characterized in that importing the data slices into the second database concurrently comprises: importing all data slices in the memory into the second database concurrently.
6. The method according to claim 1, characterized in that when the first database generates multiple data slices in sequence, there is a generation time interval between every two adjacent data slices.
7. The method according to any one of claims 1-6, characterized in that the first database is the relational database K-DB, and the second database is the database of the ElasticSearch search engine.
8. The method according to claim 7, characterized in that the data to be transferred in the K-DB database are segmented using a shell script, and the multiple data slices in the memory are imported into the database of the ElasticSearch search engine using bulk mode.
9. An apparatus for cross-database batch data transfer, characterized by comprising:
A first database;
A second database;
Memory;
A storage device storing executable program code;
At least one processor that, when running the program code stored in the storage device, executes the method for cross-database batch data transfer according to any one of claims 1-8 to transfer data from the first database to the second database via the memory.
10. A database, characterized by comprising:
A storage device storing executable program code;
At least one processor that, when running the program code stored in the storage device, can transfer data in batches by executing the method for cross-database batch data transfer according to any one of claims 1-8.
CN201811050114.1A 2018-09-10 2018-09-10 Method and apparatus for cross-database batch data transfer Pending CN109284335A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811050114.1A CN109284335A (en) 2018-09-10 2018-09-10 A kind of method and apparatus of integration across database batch conduct data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811050114.1A CN109284335A (en) 2018-09-10 2018-09-10 A kind of method and apparatus of integration across database batch conduct data

Publications (1)

Publication Number Publication Date
CN109284335A (en) 2019-01-29

Family

ID=65181038

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811050114.1A Pending CN109284335A (en) 2018-09-10 2018-09-10 Method and apparatus for cross-database batch data transfer

Country Status (1)

Country Link
CN (1) CN109284335A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190129