CN107844581A - A kind of multi-resources Heterogeneous data fusion platform - Google Patents
- Publication number
- CN107844581A (application CN201711113864.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- database
- modules
- fusion platform
- foreign key
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a multi-source heterogeneous data fusion platform, characterized in that the platform comprises a Metadata module, a data reading module, a Transformer conversion module, and a Foreign Key repair module. The framework plays a key role in real projects: even if a third-party database drops its constraints to simplify data synchronization and therefore supplies a low-quality database containing invalid data, the framework can still convert it into a high-quality database with sound constraints, correct relations, and strict logic that a production environment can access directly. This underlying support is what makes the development of complex upper-layer applications feasible.
Description
Technical field
The present invention relates to the field of data fusion, and in particular to a multi-source heterogeneous data fusion platform.
Background art
Relational databases support several kinds of constraints, including foreign key constraints, which guarantee
referential integrity. Two records can be linked as parent and child: the child record provides a
foreign key column that stores the id of its parent record, and the user declares a foreign key
constraint on that column. The constraint guarantees that the foreign key of a record is either
null, meaning it does not yet point to any parent record, or the valid id of an existing parent
record. Under this strict rule the relations between records are always intact: a child record can
never hold an invalid foreign key that fails to point to an existing parent record and so forms an
invalid, meaningless relation. This is why relational databases are called relational, and it is
one of the important differences between a database and the general-purpose spreadsheets used by
clerical staff.
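The rule described above can be illustrated with a minimal sketch. It uses Python's standard sqlite3 module purely for illustration; the table names `parent` and `child` and the column `parent_id` are assumptions and do not come from the patent. A child record is accepted when its foreign key is null or names an existing parent, and rejected otherwise.

```python
import sqlite3


def demo_foreign_key():
    """Show which child rows a foreign key constraint accepts or rejects."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        "id INTEGER PRIMARY KEY, "
        "parent_id INTEGER REFERENCES parent(id))"
    )
    conn.execute("INSERT INTO parent (id) VALUES (1)")
    # Accepted: the foreign key names an existing parent.
    conn.execute("INSERT INTO child (id, parent_id) VALUES (10, 1)")
    # Accepted: a null foreign key points at no parent yet.
    conn.execute("INSERT INTO child (id, parent_id) VALUES (11, NULL)")
    try:
        # Rejected: there is no parent with id 99.
        conn.execute("INSERT INTO child (id, parent_id) VALUES (12, 99)")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    stored = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]
    conn.close()
    return rejected, stored


result = demo_foreign_key()
```

Here `result` records whether the invalid insert was rejected and how many child rows were stored.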
In real projects it is impossible to supply and maintain all data oneself; many third-party data
services inevitably have to be used. Ideally, a third-party data service provider would supply a
database with sound constraints and synchronize data changes continuously. In practice, some
providers supply an unconstrained database, especially purely business-driven suppliers with weak
technical capability. Once the database lacks constraints, a child record may be synchronized to
the customer's database before its parent record has been pushed. The child's foreign key, which
should only ever hold the id of a valid parent record, then holds the id of a parent record that
has not yet been synchronized and does not yet exist, which eventually causes a series of errors.
This approach ignores the parent-child relations between records and treats all data as isolated,
unrelated records, so changes can be synchronized mechanically in any order, which greatly reduces
the difficulty of providing the data service. For the data consumer, however, a database that
contains a large number of invalid relations cannot be used directly. Over time, later
synchronization repairs earlier problems but also creates new ones, so the database is permanently
in an invalid state.
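The failure mode described above can be made concrete with a small sketch, again using Python's sqlite3 module only for illustration. The schema and the helper `find_orphans` are hypothetical. Because the child table has no constraint, a child row that arrives before its parent is stored with a dangling reference, and a LEFT JOIN exposes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
# No foreign key constraint: any parent_id value is accepted.
conn.execute("CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER)")

# Children arrive first; only parent 1 has been pushed so far.
conn.executemany(
    "INSERT INTO child (id, parent_id) VALUES (?, ?)",
    [(10, 1), (11, 1), (12, 2)],
)
conn.execute("INSERT INTO parent (id) VALUES (1)")


def find_orphans(connection):
    """Return ids of child rows whose parent_id matches no parent row."""
    rows = connection.execute(
        "SELECT c.id FROM child AS c "
        "LEFT JOIN parent AS p ON p.id = c.parent_id "
        "WHERE c.parent_id IS NOT NULL AND p.id IS NULL "
        "ORDER BY c.id"
    ).fetchall()
    return [row[0] for row in rows]


orphans = find_orphans(conn)
```

In this sketch child 12 refers to parent 2, which has not arrived, so it is the only orphan reported.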
Summary of the invention
To address the above deficiency, the present invention provides a multi-source heterogeneous data fusion
platform. The framework plays a key role in real projects: even if a third-party database drops its
constraints to simplify data synchronization and therefore supplies a low-quality database
containing invalid data, the framework can still convert it into a high-quality database with sound
constraints, correct relations, and strict logic that a production environment can access directly.
This underlying support is what makes the development of complex upper-layer applications feasible.
The present invention is realized by constructing a multi-source heterogeneous data fusion platform,
characterized in that the platform comprises a Metadata module, a data reading module, a
Transformer conversion module, and a Foreign Key repair module;
wherein the Metadata module analyzes the user's code structure and automatically generates the SQL
statements that create tables, constraints, and indexes, producing the data structure the user
expects in the target database;
wherein the data reading module reads all data directly from the low-quality source database and,
in addition, continuously reads data changes from the source database's change log;
wherein the Transformer conversion module obtains data changes from the event queue, calls user
code, and converts the old data into the new data required by the project's specific business needs;
at the same time, the Transformer conversion module writes the processed new data into the
high-quality target database;
in addition, the Transformer conversion module notifies the Foreign Key repair module that new data
has arrived and that foreign keys may need to be repaired;
and the Foreign Key repair module repairs every foreign key in the target database that has become
valid because of the newly arrived data.
In the multi-source heterogeneous data fusion platform according to the present invention, the data
reading module reads all data directly from the source database. This process is very slow and
often takes several days, so it is performed only once, before the system first goes live.
Afterwards no full update is performed and incremental updates are used instead.
In the multi-source heterogeneous data fusion platform according to the present invention, whichever
way the data reading module reads data, the data is placed into a downstream event queue. The event
queue exists to provide buffering, because the data reading module and the downstream conversion
module may process data at different speeds.
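The buffering role of the event queue can be sketched with Python's standard `queue` and `threading` modules. This is an illustration under assumptions, not the platform's implementation: the function `run_pipeline`, the queue capacity, and the upper-casing "conversion" are all hypothetical stand-ins.

```python
import queue
import threading


def run_pipeline(changes, capacity=2):
    """Pass change events from a reader thread to a transformer thread."""
    events = queue.Queue(maxsize=capacity)  # bounded buffer between the stages
    processed = []
    sentinel = object()  # marks the end of the stream

    def reader():
        for change in changes:
            # put() blocks while the queue is full, so a fast reader
            # cannot run arbitrarily far ahead of a slow transformer.
            events.put(change)
        events.put(sentinel)

    def transformer():
        while True:
            item = events.get()  # blocks while the queue is empty
            if item is sentinel:
                break
            processed.append(item.upper())  # stand-in for user conversion code

    threads = [threading.Thread(target=reader), threading.Thread(target=transformer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed


out = run_pipeline(["insert a", "update b", "delete c"])
```

A bounded queue is one possible design choice: it absorbs short bursts while applying back-pressure to the reader when the transformer falls behind.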
In the multi-source heterogeneous data fusion platform according to the present invention, after the
conversion module takes a source-database change event from the event queue, it does more than
convert the data according to business needs and save it to the target database. It also checks
whether the record being pushed contains foreign keys (Foreign Key). If the value of a foreign key
cannot yet refer to valid data in the target database, the foreign key of that record is
temporarily set to null in the target database, and its possible future value is kept in an
additional temporary marker field. This state persists until some later event has been processed
and the record that the temporary marker field refers to has also been pushed. Only then is the
foreign key of the original record set, so that it references the newly pushed data.
The framework lets developers configure the relational structure of the target database, including
foreign key constraints. Suppose a developer declares a foreign key named fk on some table. The
framework actually generates two fields in the target table: fk and hidden$fk. fk is the real
foreign key and carries a foreign key constraint, so it is either null or the id of an existing
parent record. hidden$fk is a foreign key in name only: it carries no foreign key constraint and
can hold any value, including invalid ones.
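As a rough sketch of the two-field layout, the snippet below generates a CREATE TABLE statement with a constrained column and an unconstrained shadow column, then checks it against SQLite. The helper `build_child_ddl`, the `hidden$` prefix spelling, and the integer `id` key are assumptions made for illustration; the patent does not give the actual generated SQL.

```python
import sqlite3


def build_child_ddl(table, fk_column, parent_table):
    """Build DDL with a real foreign key plus an unconstrained shadow column."""
    shadow = f"hidden${fk_column}"  # assumed naming scheme for the shadow column
    return (
        f'CREATE TABLE "{table}" ('
        f'"id" INTEGER PRIMARY KEY, '
        f'"{fk_column}" INTEGER REFERENCES "{parent_table}"("id"), '
        f'"{shadow}" INTEGER)'
    )


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute('CREATE TABLE "parent" ("id" INTEGER PRIMARY KEY)')
conn.execute(build_child_ddl("child", "fk", "parent"))

# Column names as reported by SQLite, in declaration order.
columns = [row[1] for row in conn.execute('PRAGMA table_info("child")')]

# The shadow column accepts a value that matches no parent,
# while the real foreign key stays null.
conn.execute('INSERT INTO "child" ("id", "fk", "hidden$fk") VALUES (1, NULL, 99)')
stored = conn.execute('SELECT "fk", "hidden$fk" FROM "child"').fetchone()
```

The identifiers are double-quoted because the shadow column name contains a `$` character.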
The framework provides a mechanism that continuously synchronizes low-quality, unconstrained data
changes into the target database from the database that receives the third-party data supplier's
synchronized data (a consumer from the third-party provider's point of view and a producer from the
framework's point of view; hereinafter the third-party database). When a child record is
synchronized from the third-party database to the target database by the framework's data
synchronization module, the value of the column that should be a foreign key, but that the third
party did not declare as one, is first copied into the nominal foreign key hidden$fk of the target
database rather than into the real foreign key fk. hidden$fk thus acts as a temporary marker that
records the future, not the current, value of the real foreign key. If the parent record that
hidden$fk refers to already exists, hidden$fk is copied into the real foreign key fk and the
relation is established. Otherwise the real foreign key stays null, in the expectation that later
synchronization will assign it the correct value at the appropriate time. During later
synchronization, once the parent record finally arrives, the framework finds all child records
whose hidden$fk equals the id of the parent record just synchronized and, for each of them, copies
the value of hidden$fk into fk.
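The mechanism above can be sketched end to end in a few lines. This is a minimal illustration using Python's sqlite3 module, not the platform's Java implementation; the functions `open_target`, `sync_child`, and `sync_parent` and the table layout are hypothetical.

```python
import sqlite3


def open_target():
    """Create a target database whose child table has fk and hidden$fk."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        "id INTEGER PRIMARY KEY, "
        "fk INTEGER REFERENCES parent(id), "
        '"hidden$fk" INTEGER)'
    )
    return conn


def sync_child(conn, child_id, parent_ref):
    """Store a child; set the real fk only if its parent already exists."""
    exists = parent_ref is not None and conn.execute(
        "SELECT 1 FROM parent WHERE id = ?", (parent_ref,)
    ).fetchone() is not None
    conn.execute(
        'INSERT INTO child (id, fk, "hidden$fk") VALUES (?, ?, ?)',
        (child_id, parent_ref if exists else None, parent_ref),
    )


def sync_parent(conn, parent_id):
    """Store a parent, then repair every child that was waiting for it."""
    conn.execute("INSERT INTO parent (id) VALUES (?)", (parent_id,))
    conn.execute(
        'UPDATE child SET fk = "hidden$fk" WHERE "hidden$fk" = ?',
        (parent_id,),
    )


conn = open_target()
sync_child(conn, 10, 1)  # parent 1 has not arrived yet
sync_child(conn, 11, 1)
before = conn.execute("SELECT id, fk FROM child ORDER BY id").fetchall()
sync_parent(conn, 1)     # the parent arrives; waiting children are repaired
sync_child(conn, 12, 1)  # a late child links immediately
after = conn.execute("SELECT id, fk FROM child ORDER BY id").fetchall()
```

At no point does the constrained column hold an invalid value, so the foreign key constraint on `fk` can stay enabled throughout synchronization.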
All of these steps are a black box to developers. Apart from knowing that every column in the
database whose name begins with hidden is a technical column that assists data fusion and is
unrelated to the business, the user does not need to care about any internal details.
As a data fusion platform, the framework naturally has many other functions, such as applying a
business-meaningful transformation to the source data before storing it in the target database.
These functions are similar to those of other data integration products on the market and are not
specific to this platform, so they are not described further.
The multi-source heterogeneous data fusion platform of the present invention is a framework
developed in the Java language. Developers can build on it with simple secondary development to
continuously synchronize the data of multiple different source databases into one target database.
In this process, the most important sub-function is converting low-quality data with loose or even
no constraints into high-quality data with complete constraints, as the developer intends.
The invention has the following advantages. It provides a multi-source heterogeneous data fusion
platform whose framework plays a key role in real projects. Even if a third-party database drops
its constraints to simplify data synchronization and therefore supplies a low-quality database
containing invalid data, the framework can still convert it into a high-quality database with sound
constraints, correct relations, and strict logic that a production environment can access directly.
This underlying support is what makes the development of complex upper-layer applications feasible.
Brief description of the drawings
Fig. 1 is a structural and functional block diagram of the multi-source heterogeneous data fusion platform of the present invention;
Fig. 2 and Fig. 3 compare the corresponding databases during foreign key constraint recovery according to the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to Fig. 1 and Fig. 2, and the
technical solutions in the embodiments of the present invention are described clearly and
completely. The described embodiments are only some of the embodiments of the present invention,
not all of them. All other embodiments obtained by a person of ordinary skill in the art on the
basis of these embodiments without creative effort fall within the scope of protection of the
present invention.
The present invention provides, by way of improvement, a multi-source heterogeneous data fusion
platform, which can be implemented as follows.
The platform comprises a Metadata module, a data reading module, a Transformer conversion module,
and a Foreign Key repair module, as shown in Fig. 1:
The Metadata module analyzes the user's code structure, automatically generates the SQL statements
that create tables, constraints, and indexes, and produces the data structure the user expects in
the target database (see the corresponding arrow in Fig. 1).
The data reading module reads all data directly from the source database. This process is very
slow and often takes several days, so it is performed only once, before the system first goes
live, and never again. This is the full update (see the arrow labeled 2 in Fig. 1).
In addition, the data reading module continuously reads data changes from the source database's
change log. This is the incremental update (see the corresponding arrow in Fig. 1).
Whichever way the data is read, it is placed into the downstream event queue (see the
corresponding arrow in Fig. 1). The event queue exists to provide buffering, because the data
reading module and the downstream conversion module may process data at different speeds.
The Transformer conversion module obtains data changes from the event queue, calls user code, and
converts the old data into the new data required by the project's specific business needs (see the
corresponding arrow in Fig. 1).
At the same time, the Transformer conversion module writes the processed new data into the target
database (see the arrow labeled 6 in Fig. 1).
The Transformer conversion module notifies the Foreign Key repair module that new data has arrived
and that foreign keys may need to be repaired (see the corresponding arrow in Fig. 1).
Finally, the Foreign Key repair module repairs every foreign key field of existing records in the
target database that has become valid because of the newly arrived data (see the corresponding
arrow in Fig. 1).
The foreign key constraint is restored as follows:
(1) The third-party service inserts a number of records into the source database, but the parent
record they all depend on has not yet been inserted. After synchronization by this platform, the
source database and the target database compare as shown in Fig. 2;
(2) After a long wait, the third-party service finally inserts the parent record of these records
into the source database. After synchronization by this platform, the source database and the
target database compare as shown in Fig. 3.
The foregoing description of the disclosed embodiments enables those skilled in the art to make or
use the present invention. Various modifications to these embodiments will be apparent to those
skilled in the art, and the general principles defined herein may be implemented in other
embodiments without departing from the spirit or scope of the present invention. Therefore, the
present invention is not limited to the embodiments shown herein but is to be accorded the widest
scope consistent with the principles and novel features disclosed herein.
Claims (4)
- 1. A multi-source heterogeneous data fusion platform, characterized in that: the platform comprises a Metadata module, a data reading module, a Transformer conversion module, and a Foreign Key repair module; wherein the Metadata module analyzes the user's code structure and automatically generates the SQL statements that create tables, constraints, and indexes, producing the data structure the user expects in the target database; wherein the data reading module reads all data directly from the low-quality source database and, in addition, continuously reads data changes from the source database's change log; wherein the Transformer conversion module obtains data changes from the event queue, calls user code, and converts the old data into the new data required by the project's specific business needs; at the same time, the Transformer conversion module writes the processed new data into the high-quality target database; in addition, the Transformer conversion module notifies the Foreign Key repair module that new data has arrived; and the Foreign Key repair module repairs every foreign key in the target database that has become valid because of the newly arrived data.
- 2. The multi-source heterogeneous data fusion platform according to claim 1, characterized in that: the data reading module reads all data directly from the source database; this process is very slow and often takes several days, so it is performed only once, before the system first goes live; afterwards no full update is performed and incremental updates are used instead.
- 3. The multi-source heterogeneous data fusion platform according to claim 1, characterized in that: whichever way the data reading module reads data, the data is placed into a downstream event queue; the event queue exists to provide buffering, because the data reading module and the downstream conversion module may process data at different speeds.
- 4. The multi-source heterogeneous data fusion platform according to claim 1, characterized in that: after the conversion module takes a source-database change event from the event queue, it does more than convert the data according to business needs and save it to the target database; it also checks whether the record being pushed contains foreign keys (Foreign Key); if the value of a foreign key cannot yet refer to valid data in the target database, the foreign key of that record is temporarily set to null in the target database and its possible future value is kept in an additional temporary marker field; this state persists until some later event has been processed and the record that the temporary marker field refers to has also been pushed, at which point the foreign key of the original record is set so that it references the newly pushed data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711113864.4A CN107844581A (en) | 2017-11-13 | 2017-11-13 | A kind of multi-resources Heterogeneous data fusion platform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711113864.4A CN107844581A (en) | 2017-11-13 | 2017-11-13 | A kind of multi-resources Heterogeneous data fusion platform |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107844581A true CN107844581A (en) | 2018-03-27 |
Family
ID=61681050
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711113864.4A Pending CN107844581A (en) | 2017-11-13 | 2017-11-13 | A kind of multi-resources Heterogeneous data fusion platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107844581A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110888924A (en) * | 2018-09-10 | 2020-03-17 | 深圳市从晶科技有限公司 | Data acquisition system |
CN112817990A (en) * | 2021-01-28 | 2021-05-18 | 北京百度网讯科技有限公司 | Data processing method and device, electronic equipment and readable storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110060719A1 (en) * | 2009-09-05 | 2011-03-10 | Vivek Kapoor | Method for Transforming Setup Data in Business Applications |
CN102495916A (en) * | 2011-11-07 | 2012-06-13 | 中国南方电网有限责任公司 | Multi-application-system panoramic modeling method based on object matching |
CN103441988A (en) * | 2013-08-02 | 2013-12-11 | 广东电网公司电力科学研究院 | Data migration method crossing GIS platforms |
CN105808553A (en) * | 2014-09-26 | 2016-07-27 | 三星Sds株式会社 | Database migration method and device thereof |
CN106547853A (en) * | 2016-10-19 | 2017-03-29 | 北京航天泰坦科技股份有限公司 | Forestry big data building method based on a figure |
CN106933859A (en) * | 2015-12-30 | 2017-07-07 | ***通信集团公司 | The moving method and device of a kind of medical data |
-
2017
- 2017-11-13 CN CN201711113864.4A patent/CN107844581A/en active Pending
Non-Patent Citations (2)
Title |
---|
MOZHGAN MEMARI ET AL.: ""SQL Data Profiling of Foreign Keys"", 《INTERNATIONAL CONFERENCE ON CONCEPTUAL MODELING》 * |
马伟: ""一种基于XML的异构数据源集成***的研究"", 《万方数据知识服务平台》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110888924A (en) * | 2018-09-10 | 2020-03-17 | 深圳市从晶科技有限公司 | Data acquisition system |
CN112817990A (en) * | 2021-01-28 | 2021-05-18 | 北京百度网讯科技有限公司 | Data processing method and device, electronic equipment and readable storage medium |
CN112817990B (en) * | 2021-01-28 | 2024-03-08 | 北京百度网讯科技有限公司 | Data processing method, device, electronic equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
AD01 | Patent right deemed abandoned | Effective date of abandoning: 20220614 |