CN109299068A - Data flow migration method from a relational database to an HBase database
- Publication number
- CN109299068A (application number CN201811012560.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a data flow migration method from a relational database to an HBase database, comprising the following steps: setting the connection properties of the ExecuteSQL processor of an ETL tool and creating a connection service between the ExecuteSQL processor and the relational database; converting the data table in Avro format into a data table in JSON format with the ConvertAvroToJSON processor; converting the data table in JSON format into a data table in standard relational database format with the ConvertJSONToSQL processor; converting the data table in standard relational database format into a data table in HBase database format with the ReplaceText processor; and storing the data table in HBase database format in the HBase database with the PutSQL processor. The method is operated visually, the process is highly controllable, and the error rate of data flow migration is low.
Description
Technical field
The present invention relates to the field of computer application technology, and in particular to a data flow migration method from a relational database to an HBase database.
Background art
Applications today commonly need to handle heterogeneous data from multiple databases. The prior art mainly addresses data migration between several common relational databases, such as MySQL, Oracle, and Microsoft SQL Server. With the emergence and development of non-relational databases, more and more applications use relational and non-relational databases at the same time, so the problem of migrating data between non-relational and relational databases needs to be solved.
HBase is a distributed, column-oriented, non-relational database. More and more Web applications need to rebuild their data on an HBase database, so how to migrate data from a relational database to an HBase database has become an urgent problem.
Summary of the invention
In view of the problems in the prior art, the present invention provides a data flow migration method from a relational database to an HBase database. The method is operated visually, the process is highly controllable, and the error rate of data flow migration is low.
The invention adopts the following technical solution:
A data flow migration method from a relational database to an HBase database comprises the following steps:
S1, setting the connection properties of the ExecuteSQL processor of an ETL tool and creating a connection service between the ExecuteSQL processor and the relational database, whereupon the ExecuteSQL processor queries the relational database and obtains a data table in Avro format;
S2, converting the data table in Avro format into a data table in JSON format with the ConvertAvroToJSON processor;
S3, converting the data table in JSON format into a data table in standard relational database format with the ConvertJSONToSQL processor;
S4, converting the data table in standard relational database format into a data table in HBase database format with the ReplaceText processor;
S5, storing the data table in HBase database format in the HBase database with the PutSQL processor.
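The chain of formats in steps S1 to S5 can be sketched as a small Python model. This is only an illustration, not part of the claimed method: the `Stage` type and the format labels "SQL INSERT" and "SQL UPSERT" are assumed shorthand for the "standard relational database format" and "HBase database format" named in the steps.

```python
from typing import NamedTuple


class Stage(NamedTuple):
    step: str
    processor: str
    consumes: str   # format the processor reads
    produces: str   # format the processor emits


# Format labels are illustrative shorthand, not terms from the patent.
PIPELINE = [
    Stage("S1", "ExecuteSQL", "relational table", "Avro"),
    Stage("S2", "ConvertAvroToJSON", "Avro", "JSON"),
    Stage("S3", "ConvertJSONToSQL", "JSON", "SQL INSERT"),
    Stage("S4", "ReplaceText", "SQL INSERT", "SQL UPSERT"),
    Stage("S5", "PutSQL", "SQL UPSERT", "HBase table"),
]


def check_chain(stages):
    """Return True when each stage consumes what the previous stage produces."""
    return all(a.produces == b.consumes for a, b in zip(stages, stages[1:]))


for stage in PIPELINE:
    print(f"{stage.step}: {stage.processor}: {stage.consumes} -> {stage.produces}")
print("chain is consistent:", check_chain(PIPELINE))
```

Reordering or dropping a stage makes `check_chain` return False, which mirrors why the five processors have to be connected in this order.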
Preferably, in step S1, setting the connection properties of the ExecuteSQL processor includes setting the values of the properties Database Connection URL, Database Driver Class Name, Database Driver Location(s), Database User, and Password. The value of Database Connection URL is set to the URL of the relational database; the value of Database Driver Class Name is set to the name of the relational database driver class; the value of Database Driver Location(s) is set to the absolute path of the relational database driver file; the value of Database User is set to the user name used to access the relational database; and the value of Password is set to the password corresponding to that user name. After the property values of the ExecuteSQL processor have been set, the ExecuteSQL processor is executed, completing the query and retrieval of the data table in Avro format.
More preferably, in step S1, creating the connection service between the ExecuteSQL processor and the relational database means setting the parameter SQL select query of the ExecuteSQL processor. When SQL select query is set to select * from X, the ExecuteSQL processor queries and retrieves the data table X named in the query.
Still more preferably, step S1 further includes setting the timed task scheduling of the ExecuteSQL processor, that is, setting the value of the parameter Max wait time. When Max wait time is set to t, then after the ExecuteSQL processor has successfully created a connection with the relational database, it waits t seconds before querying and retrieving the data table in the relational database.
Preferably, in step S2, converting the data table in Avro format into a data table in JSON format includes setting the values of the properties JSON container options and Wrap single Record of the ConvertAvroToJSON processor. The value of JSON container options specifies how the data table in JSON format is represented; it is set to array, meaning the data table is represented as a JSON array. The value of Wrap single Record specifies whether the data flow is processed record by record: if so, it is set to true; if not, it is set to false, in which case the data flow is processed as a single packaged whole. After the property values of the ConvertAvroToJSON processor have been set, the ConvertAvroToJSON processor is executed, completing the conversion of the data table in Avro format into a data table in JSON format.
Preferably, in step S3, converting the data table in JSON format into a data table in standard relational database format includes setting the values of the properties JDBC Connection Pool, Statement Type, Table Name, Translate Field Names, Unmatched Field Behavior, Unmatched Column Behavior, Quote Column Identifiers, and Quote Table Identifiers of the ConvertJSONToSQL processor. The value of JDBC Connection Pool specifies the type of database to connect to; it is set to Hbase, meaning the HBase database is connected. The value of Statement Type specifies the kind of operation performed on the HBase database; it is set to INSERT, meaning an insert operation. The value of Table Name specifies the name of the data table in the HBase database; it is set to KK_PASS, meaning the KK_PASS table of the HBase database is operated on. The value of Translate Field Names specifies whether the data in the data table is processed; it is set to true, meaning the data is processed. The value of Unmatched Field Behavior specifies how unmatched fields in the data table are handled; it is set to Ignore Unmatched Fields, meaning unmatched fields are ignored. The value of Unmatched Column Behavior specifies how unmatched columns in the data table are handled; it is set to Ignore Unmatched Columns, meaning unmatched columns are ignored. The value of Quote Column Identifiers specifies whether the column names in the data table are modified; it is set to false, meaning the column names are left unmodified. The value of Quote Table Identifiers specifies whether the name of the data table is modified; it is set to false, meaning the table name is left unmodified. After the properties of the ConvertJSONToSQL processor have been set, the ConvertJSONToSQL processor is executed, completing the conversion of the data table in JSON format into a data table in standard relational database format.
Preferably, in step S4, converting the data table in standard relational database format into a data table in HBase database format includes setting the values of the properties Search Value, Replacement Value, Character Set, Maximum Buffer Size, Replacement Strategy, and Evaluation Mode of the ReplaceText processor. The value of Search Value specifies the text to search for; it is set to INSERT, meaning the keyword INSERT is searched for. The value of Replacement Value specifies the text that replaces the search text; it is set to UPSERT, meaning every INSERT is replaced with UPSERT. The value of Character Set specifies the character encoding used for Chinese text; it is set to UTF-8. The value of Maximum Buffer Size specifies the maximum buffer space; it is set to 1MB. The value of Replacement Strategy specifies the replacement strategy; it is set to Regex Replace, meaning the text is matched by regular expression. The value of Evaluation Mode specifies the scope over which the replacement is evaluated; it is set to Entire text, meaning the entire text is processed. After the properties of the ReplaceText processor have been set, the ReplaceText processor is executed, completing the conversion of the data table in standard relational database format into a data table in HBase database format.
Preferably, in step S5, storing the data table in HBase database format in the HBase database includes setting the values of the properties JDBC Connection Pool, Support Fragmented Transactions, Transaction Timeout, and Batch Size of the PutSQL processor. The value of JDBC Connection Pool specifies the type of database to connect to; it is set to Hbase, meaning the HBase database is connected. The value of Support Fragmented Transactions specifies whether block-wise transmission of the data flow is supported; it is set to true, meaning block-wise transmission is supported. The value of Transaction Timeout specifies the delay time; it is left empty, meaning the transmission is not delayed and is executed immediately. The value of Batch Size specifies the capacity of one processing unit; it is set to 100, meaning processing is done in units of 100 MB. After the properties of the PutSQL processor have been set, the PutSQL processor is executed, completing the storage of the data table in HBase database format in the HBase database.
The advantages and beneficial effects of the present invention are as follows:
1) The present invention is an ETL-tool-based method for migrating a data flow from a relational database to an HBase database. The ETL tool includes the ExecuteSQL, ConvertAvroToJSON, ConvertJSONToSQL, ReplaceText, and PutSQL processors. By setting the property values of these processors, the data flow is migrated from the relational database to the HBase database, and the data table is finally stored in the HBase database. Throughout the migration, the property values of each processor are set visually and the operating process is highly controllable, so the error rate of data flow migration is greatly reduced.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the present invention.
Fig. 2 is the first diagram of the property value settings of the ExecuteSQL processor of the present invention.
Fig. 3 is the second diagram of the property value settings of the ExecuteSQL processor of the present invention.
Fig. 4 is a diagram of the property value settings of the ConvertAvroToJSON processor of the present invention.
Fig. 5 is a diagram of the property value settings of the ConvertJSONToSQL processor of the present invention.
Fig. 6 is a diagram of the property value settings of the ReplaceText processor of the present invention.
Fig. 7 is a diagram of the property value settings of the PutSQL processor of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
The ETL tool includes the ExecuteSQL, ConvertAvroToJSON, ConvertJSONToSQL, ReplaceText, and PutSQL processors. As shown in Fig. 1, the data flow migration method from a relational database to an HBase database comprises the following steps:
S1, setting the connection properties of the ExecuteSQL processor of the ETL tool and creating a connection service between the ExecuteSQL processor and the relational database, whereupon the ExecuteSQL processor queries the relational database and obtains a data table in Avro format;
Specifically, setting the connection properties of the ExecuteSQL processor includes setting the values of the properties Database Connection URL, Database Driver Class Name, Database Driver Location(s), Database User, and Password. The value of Database Connection URL is set to the URL of the relational database; the value of Database Driver Class Name is set to the name of the relational database driver class; the value of Database Driver Location(s) is set to the absolute path of the relational database driver file; the value of Database User is set to the user name used to access the relational database; and the value of Password is set to the password corresponding to that user name. After the property values of the ExecuteSQL processor have been set, the ExecuteSQL processor is executed, completing the query and retrieval of the data table in Avro format.
Creating the connection service between the ExecuteSQL processor and the relational database means setting the parameter SQL select query of the ExecuteSQL processor. When SQL select query is set to select * from X, the ExecuteSQL processor queries and retrieves the data table X named in the query.
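What a `select * from X` query hands to the next stage can be illustrated with Python's built-in sqlite3 module. This is a stand-in sketch only: the in-memory SQLite database replaces the relational source, the columns ID and NAME are hypothetical, and the rows are returned as dictionaries rather than in Avro format.

```python
import sqlite3

# In-memory SQLite database standing in for the relational source (assumption).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name

# Hypothetical table X with two illustrative columns.
conn.execute("CREATE TABLE X (ID INTEGER, NAME TEXT)")
conn.executemany("INSERT INTO X VALUES (?, ?)", [(1, "a"), (2, "b")])

# The query text mirrors the value given for SQL select query.
rows = [dict(row) for row in conn.execute("select * from X")]
conn.close()

for row in rows:
    print(row)
```

Each row carries its column names with it, which is the property the later JSON and SQL conversion steps rely on.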
Specifically, step S1 further includes setting the timed task scheduling of the ExecuteSQL processor, that is, setting the value of the parameter Max wait time. When Max wait time is set to t, then after the ExecuteSQL processor has successfully created a connection with the relational database, it waits t seconds before querying and retrieving the data table in the relational database.
S2, converting the data table in Avro format into a data table in JSON format with the ConvertAvroToJSON processor;
Specifically, converting the data table in Avro format into a data table in JSON format includes setting the values of the properties JSON container options and Wrap single Record of the ConvertAvroToJSON processor. The value of JSON container options specifies how the data table in JSON format is represented; it is set to array, meaning the data table is represented as a JSON array. The value of Wrap single Record specifies whether the data flow is processed record by record: if so, it is set to true; if not, it is set to false, in which case the data flow is processed as a single packaged whole. After the property values of the ConvertAvroToJSON processor have been set, the ConvertAvroToJSON processor is executed, completing the conversion of the data table in Avro format into a data table in JSON format.
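The effect of the two properties on the JSON output can be sketched as follows. This is a hedged illustration, not the processor's implementation: the records are assumed to be already decoded from Avro into dictionaries, the function name `to_json` is hypothetical, and the handling of a single record and of the "none" container follows one reading of the NiFi documentation.

```python
import json


def to_json(records, container="array", wrap_single_record=False):
    """Serialise decoded records according to the two property values.

    container="array": emit one JSON array holding all records.
    container="none":  emit one JSON object per line (assumed behaviour).
    wrap_single_record: when False, a lone record is emitted bare
    instead of inside a one-element array (assumed behaviour).
    """
    if container == "array":
        if len(records) == 1 and not wrap_single_record:
            return json.dumps(records[0])
        return json.dumps(records)
    return "\n".join(json.dumps(record) for record in records)


# Hypothetical column name ID.
print(to_json([{"ID": 1}, {"ID": 2}]))
print(to_json([{"ID": 1}], wrap_single_record=True))
```

With array and Wrap single Record set to true, as in the settings described above, the output is always a JSON array, so the downstream step can treat every flow file the same way.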
S3, converting the data table in JSON format into a data table in standard relational database format with the ConvertJSONToSQL processor;
Specifically, converting the data table in JSON format into a data table in standard relational database format includes setting the values of the properties JDBC Connection Pool, Statement Type, Table Name, Translate Field Names, Unmatched Field Behavior, Unmatched Column Behavior, Quote Column Identifiers, and Quote Table Identifiers of the ConvertJSONToSQL processor. The value of JDBC Connection Pool specifies the type of database to connect to; it is set to Hbase, meaning the HBase database is connected. The value of Statement Type specifies the kind of operation performed on the HBase database; it is set to INSERT, meaning an insert operation. The value of Table Name specifies the name of the data table in the HBase database; it is set to KK_PASS, meaning the KK_PASS table of the HBase database is operated on. The value of Translate Field Names specifies whether the data in the data table is processed; it is set to true, meaning the data is processed. The value of Unmatched Field Behavior specifies how unmatched fields in the data table are handled; it is set to Ignore Unmatched Fields, meaning unmatched fields are ignored. The value of Unmatched Column Behavior specifies how unmatched columns in the data table are handled; it is set to Ignore Unmatched Columns, meaning unmatched columns are ignored. The value of Quote Column Identifiers specifies whether the column names in the data table are modified; it is set to false, meaning the column names are left unmodified. The value of Quote Table Identifiers specifies whether the name of the data table is modified; it is set to false, meaning the table name is left unmodified. After the properties of the ConvertJSONToSQL processor have been set, the ConvertJSONToSQL processor is executed, completing the conversion of the data table in JSON format into a data table in standard relational database format.
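The JSON-to-INSERT conversion with unmatched fields ignored can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the function `json_to_insert` and the columns ID and PLATE are hypothetical, and Translate Field Names is modelled as case-insensitive matching of JSON field names to column names, which is an interpretation rather than something the text states.

```python
import json


def json_to_insert(json_text, table, table_columns):
    """Build a parameterised INSERT statement from one JSON record.

    JSON fields with no matching column are dropped, modelling
    Unmatched Field Behavior = Ignore Unmatched Fields.
    """
    record = json.loads(json_text)
    # Case-insensitive lookup from field name to column name (assumption).
    lookup = {column.upper(): column for column in table_columns}
    matched = [
        (lookup[field.upper()], value)
        for field, value in record.items()
        if field.upper() in lookup
    ]
    columns = ", ".join(column for column, _ in matched)
    placeholders = ", ".join("?" for _ in matched)
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return sql, [value for _, value in matched]


# Hypothetical record; "extra" has no matching column and is ignored.
sql, params = json_to_insert(
    '{"id": 7, "plate": "A123", "extra": 1}', "KK_PASS", ["ID", "PLATE"]
)
print(sql)
print(params)
```

Keeping the values in a separate parameter list, rather than inlining them into the SQL text, means a later textual rewrite of the statement cannot touch the data itself.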
S4, converting the data table in standard relational database format into a data table in HBase database format with the ReplaceText processor;
Specifically, converting the data table in standard relational database format into a data table in HBase database format includes setting the values of the properties Search Value, Replacement Value, Character Set, Maximum Buffer Size, Replacement Strategy, and Evaluation Mode of the ReplaceText processor. The value of Search Value specifies the text to search for; it is set to INSERT, meaning the keyword INSERT is searched for. The value of Replacement Value specifies the text that replaces the search text; it is set to UPSERT, meaning every INSERT is replaced with UPSERT. The value of Character Set specifies the character encoding used for Chinese text; it is set to UTF-8. The value of Maximum Buffer Size specifies the maximum buffer space; it is set to 1MB. The value of Replacement Strategy specifies the replacement strategy; it is set to Regex Replace, meaning the text is matched by regular expression. The value of Evaluation Mode specifies the scope over which the replacement is evaluated; it is set to Entire text, meaning the entire text is processed. After the properties of the ReplaceText processor have been set, the ReplaceText processor is executed, completing the conversion of the data table in standard relational database format into a data table in HBase database format.
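The INSERT-to-UPSERT rewrite amounts to a regular-expression substitution over the whole statement text. A minimal sketch follows, assuming the content arrives as UTF-8 bytes; the function name `replace_text` and the choice to raise an error for content above the buffer limit are illustrative assumptions, not documented processor behaviour.

```python
import re

MAX_BUFFER_SIZE = 1024 * 1024  # 1 MB, mirroring Maximum Buffer Size


def replace_text(content, search_value="INSERT", replacement_value="UPSERT"):
    """Replace every regex match in the entire text and return UTF-8 bytes."""
    if len(content) > MAX_BUFFER_SIZE:
        # Assumed handling: refuse content that exceeds the buffer.
        raise ValueError("content exceeds the maximum buffer size")
    text = content.decode("utf-8")  # Character Set = UTF-8
    return re.sub(search_value, replacement_value, text).encode("utf-8")


statement = "INSERT INTO KK_PASS (ID) VALUES (?)".encode("utf-8")
print(replace_text(statement).decode("utf-8"))
```

Because the pattern is applied to the entire text, any literal occurrence of INSERT is rewritten, so this approach is safest when the data values are carried as separate parameters rather than embedded in the statement.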
S5, storing the data table in HBase database format in the HBase database with the PutSQL processor.
Specifically, storing the data table in HBase database format in the HBase database includes setting the values of the properties JDBC Connection Pool, Support Fragmented Transactions, Transaction Timeout, and Batch Size of the PutSQL processor. The value of JDBC Connection Pool specifies the type of database to connect to; it is set to Hbase, meaning the HBase database is connected. The value of Support Fragmented Transactions specifies whether block-wise transmission of the data flow is supported; it is set to true, meaning block-wise transmission is supported. The value of Transaction Timeout specifies the delay time; it is left empty, meaning the transmission is not delayed and is executed immediately. The value of Batch Size specifies the capacity of one processing unit; it is set to 100, meaning processing is done in units of 100 MB. After the properties of the PutSQL processor have been set, the PutSQL processor is executed, completing the storage of the data table in HBase database format in the HBase database.
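Batched writing can be sketched as grouping statements into fixed-size chunks. Note the assumption: the sketch counts statements per batch, following the NiFi documentation's description of Batch Size as a preferred number of FlowFiles per transaction, whereas the text above describes the value as a capacity; the helper name `batches` is hypothetical.

```python
from itertools import islice


def batches(items, batch_size=100):
    """Yield successive lists of at most batch_size items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


# 250 hypothetical statements are written in three batches.
statements = [f"UPSERT INTO KK_PASS (ID) VALUES ({i})" for i in range(250)]
for number, chunk in enumerate(batches(statements), start=1):
    print(f"batch {number}: {len(chunk)} statements")
```

Grouping statements this way trades a little latency for fewer round trips to the database, which is the usual motivation for a batch size setting.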
The method of the present invention is described in detail below with reference to an embodiment and the drawings.
Embodiment:
1. Set the connection properties of the ExecuteSQL processor of the ETL tool.
As shown in Fig. 2, the property values of the ExecuteSQL processor are set as follows:
Database Connection URL: jdbc:mysql://192.168.99.140:3306/test;
Database Driver Class Name: com.mysql.jdbc.Driver;
Database Driver Location(s): /nifi/mysql-jdbc.jar;
Database User: the user name for accessing the database;
Password: the password corresponding to the user name;
In addition, as shown in Fig. 3, the connection service between the ExecuteSQL processor and the relational database is created and the timed task scheduling of the ExecuteSQL processor is set as follows:
SQL select query: select * from YW_MYSQL;
Max wait time: seconds.
After the property values of the ExecuteSQL processor have been set, the ExecuteSQL processor is executed, completing the query and retrieval of the data table by the ExecuteSQL processor.
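The parts of the Database Connection URL value can be checked with a short Python helper before the processor is started. This is an illustrative sketch: `parse_jdbc_url` is a hypothetical helper built on the standard library's `urlsplit`, and it handles only the simple `jdbc:<subprotocol>://host:port/database` shape used in this embodiment.

```python
from urllib.parse import urlsplit


def parse_jdbc_url(url):
    """Split a simple JDBC URL into subprotocol, host, port and database."""
    prefix = "jdbc:"
    if not url.startswith(prefix):
        raise ValueError("not a JDBC URL")
    parts = urlsplit(url[len(prefix):])
    if parts.hostname is None:
        raise ValueError("JDBC URL has no host part")
    return {
        "subprotocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


print(parse_jdbc_url("jdbc:mysql://192.168.99.140:3306/test"))
```

For the URL in this embodiment the helper reports a MySQL server at 192.168.99.140, port 3306, and the database test.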
2. Convert the data table in Avro format into a data table in JSON format. As shown in Fig. 4, the ConvertAvroToJSON processor is set as follows:
JSON container options: array;
Wrap single Record: true;
After the property values of the ConvertAvroToJSON processor have been set, the ConvertAvroToJSON processor is executed, completing the conversion of the data table in Avro format into a data table in JSON format.
3. Convert the data table in JSON format into a data table in standard relational database format. As shown in Fig. 5, the ConvertJSONToSQL processor is set as follows:
JDBC Connection Pool: Hbase;
Statement Type: INSERT;
Table Name: KK_PASS;
Translate Field Names: true;
Unmatched Field Behavior: Ignore Unmatched Fields;
Unmatched Column Behavior: Ignore Unmatched Columns;
Quote Column Identifiers: false;
Quote Table Identifiers: false;
After the properties of the ConvertJSONToSQL processor have been set, the ConvertJSONToSQL processor is executed, completing the conversion of the data table in JSON format into a data table in standard relational database format.
4. Convert the data table in standard relational database format into a data table in HBase database format. As shown in Fig. 6, the ReplaceText processor is set as follows:
Search Value: INSERT;
Replacement Value: UPSERT;
Character Set: UTF-8;
Maximum Buffer Size: 1MB;
Replacement Strategy: Regex Replace;
Evaluation Mode: Entire text;
After the properties of the ReplaceText processor have been set, the ReplaceText processor is executed, completing the conversion of the data table in standard relational database format into a data table in HBase database format.
5. Store the data table in HBase database format in the HBase database. As shown in Fig. 7, the PutSQL processor is set as follows:
JDBC Connection Pool: Hbase;
Support Fragmented Transactions: true;
Transaction Timeout: empty;
Batch Size: 100;
After the properties of the PutSQL processor have been set, the PutSQL processor is executed, completing the storage of the data table in HBase database format in the HBase database.
In summary, the present invention provides a data flow migration method from a relational database to an HBase database that is operated visually throughout, is highly controllable, and has a low error rate for data flow migration.
Claims (8)
1. A data flow migration method from a relational database to an HBase database, characterized by comprising the following steps:
S1, setting the connection properties of the ExecuteSQL processor of an ETL tool and creating a connection service between the ExecuteSQL processor and the relational database, whereupon the ExecuteSQL processor queries the relational database and obtains a data table in Avro format;
S2, converting the data table in Avro format into a data table in JSON format with the ConvertAvroToJSON processor;
S3, converting the data table in JSON format into a data table in standard relational database format with the ConvertJSONToSQL processor;
S4, converting the data table in standard relational database format into a data table in HBase database format with the ReplaceText processor;
S5, storing the data table in HBase database format in the HBase database with the PutSQL processor.
2. The data flow migration method from a relational database to an HBase database according to claim 1, characterized in that: in step S1, setting the connection properties of the ExecuteSQL processor includes setting the values of the properties Database Connection URL, Database Driver Class Name, Database Driver Location(s), Database User, and Password; the value of Database Connection URL is set to the URL of the relational database, the value of Database Driver Class Name is set to the name of the relational database driver class, the value of Database Driver Location(s) is set to the absolute path of the relational database driver file, the value of Database User is set to the user name used to access the relational database, and the value of Password is set to the password corresponding to that user name; after the property values of the ExecuteSQL processor have been set, the ExecuteSQL processor is executed, completing the query and retrieval of the data table in Avro format.
3. The data flow migration method from a relational database to an HBase database according to claim 2, characterized in that: in step S1, creating the connection service between the ExecuteSQL processor and the relational database means setting the parameter SQL select query of the ExecuteSQL processor; when SQL select query is set to select * from X, the ExecuteSQL processor queries and retrieves the data table X named in the query.
4. The data flow migration method from a relational database to an HBase database according to claim 3, characterized in that: step S1 further includes setting the timed task scheduling of the ExecuteSQL processor, that is, setting the value of the parameter Max wait time; when Max wait time is set to t, then after the ExecuteSQL processor has successfully created a connection with the relational database, it waits t seconds before querying and retrieving the data table in the relational database.
5. The data flow migration method from a relational database to an HBase database according to claim 1, characterized in that: in step S2, converting the data table in Avro format into a data table in JSON format includes setting the values of the properties JSON container options and Wrap single Record of the ConvertAvroToJSON processor; the value of JSON container options specifies how the data table in JSON format is represented, and is set to array, meaning the data table is represented as a JSON array; the value of Wrap single Record specifies whether the data flow is processed record by record: if so, it is set to true; if not, it is set to false, in which case the data flow is processed as a single packaged whole; after the property values of the ConvertAvroToJSON processor have been set, the ConvertAvroToJSON processor is executed, completing the conversion of the data table in Avro format into a data table in JSON format.
6. The data flow migration method from a relational database to an HBase database according to claim 1,
characterized in that: in step S3, converting the data table in JSON format into a data table in standard
relational database format includes setting the values of the attributes JDBC Connection Pool, Statement Type,
Table Name, Translate Field Names, Unmatched Field Behavior, Unmatched Column Behavior, Quote Column Identifiers
and Quote Table Identifiers of the ConvertJSONToSQL processor;
the value of the attribute JDBC Connection Pool indicates the type of database to connect to, and is set to Hbase,
i.e. the HBase database is connected;
the value of the attribute Statement Type indicates the mode of operation on the HBase database, and is set to
INSERT, i.e. the HBase database is operated on by insertion;
the value of the attribute Table Name indicates the name of the data table in the HBase database, and is set to
KK_PASS, i.e. the KK_PASS data table of the HBase database is operated on;
the value of the attribute Translate Field Names indicates whether the data in the data table is processed, and is
set to true, i.e. the data in the data table is processed;
the value of the attribute Unmatched Field Behavior indicates how unmatched fields in the data table are handled,
and is set to Ignore Unmatched Fields, i.e. unmatched fields are ignored;
the value of the attribute Unmatched Column Behavior indicates how unmatched columns in the data table are handled,
and is set to Ignore Unmatched Columns, i.e. unmatched columns are ignored;
the value of the attribute Quote Column Identifiers indicates whether the column names in the data table are
modified, and is set to false, i.e. the column names are left unmodified;
the value of the attribute Quote Table Identifiers indicates whether the name of the data table is modified, and is
set to false, i.e. the table name is left unmodified;
after the attributes of the ConvertJSONToSQL processor have been set, the ConvertJSONToSQL processor is executed,
thereby converting the data table in JSON format into a data table in standard relational database format.
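The conversion in claim 6 can be pictured as turning one JSON record into a parameterized INSERT statement for the KK_PASS table. The sketch below is a stand-alone illustration rather than the processor's implementation. The helper name `json_to_insert`, the case-insensitive field matching, and the column names `ID`, `PLATE` and `SPEED` are hypothetical.

```python
def json_to_insert(record, table, columns):
    """Build a parameterized INSERT from the fields that match known columns.

    record  -- one decoded JSON object (dict)
    table   -- target table name, e.g. "KK_PASS"
    columns -- column names of the target table (hypothetical in this sketch)
    """
    by_upper = {c.upper(): c for c in columns}
    names, params = [], []
    for field, value in record.items():
        column = by_upper.get(field.upper())  # assumed case-insensitive match
        if column is None:
            continue  # "Ignore Unmatched Fields": drop fields with no column
        names.append(column)
        params.append(value)
    if not names:
        raise ValueError("no field in the record matches a table column")
    # Columns absent from the record are simply left out
    # ("Ignore Unmatched Columns"); names are not quoted.
    placeholders = ", ".join("?" for _ in names)
    sql = f"INSERT INTO {table} ({', '.join(names)}) VALUES ({placeholders})"
    return sql, params
```

Keeping the values out of the SQL text and passing them as parameters means the later INSERT-to-UPSERT text replacement only ever touches the statement keyword.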
7. The data flow migration method from a relational database to an HBase database according to claim 1,
characterized in that: in step S4, converting the data table in standard relational database format into a data
table in HBase database format includes setting the values of the attributes Search Value, Replacement Value,
Character Set, Maximum Buffer Size, Replacement Strategy and Evaluation Mode of the ReplaceText processor;
the value of the attribute Search Value indicates the text to search for, and is set to INSERT, i.e. the text
INSERT is searched for;
the value of the attribute Replacement Value indicates the text that replaces the searched text, and is set to
UPSERT, i.e. every INSERT is replaced with UPSERT;
the value of the attribute Character Set indicates the encoding used for Chinese text, and is set to UTF-8, i.e.
Chinese text is processed as UTF-8;
the value of the attribute Maximum Buffer Size indicates the maximum buffer space, and is set to 1MB;
the value of the attribute Replacement Strategy indicates the replacement strategy, and is set to Regex Replace,
i.e. matching is performed with a regular expression;
the value of the attribute Evaluation Mode indicates the scope over which replacement is evaluated, and is set to
Entire text, i.e. the entire text is processed as one unit;
after the attributes of the ReplaceText processor have been set, the ReplaceText processor is executed, thereby
converting the data table in standard relational database format into a data table in HBase database format.
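The replacement in claim 7 amounts to a regular-expression substitution over the whole statement text. The following sketch mirrors the listed settings with Python's standard `re` module. The function name `insert_to_upsert` and the choice to raise an error when the text exceeds the buffer limit are assumptions made for illustration.

```python
import re

MAX_BUFFER_BYTES = 1024 * 1024  # mirrors Maximum Buffer Size = 1MB


def insert_to_upsert(text, encoding="utf-8"):
    """Replace every INSERT with UPSERT across the entire text."""
    # Assumption: text larger than the buffer limit is rejected, not truncated.
    if len(text.encode(encoding)) > MAX_BUFFER_BYTES:
        raise ValueError("text exceeds the maximum buffer size")
    # Regex replacement evaluated against the entire text at once.
    return re.sub(r"INSERT", "UPSERT", text)
```

Because the pattern matches every occurrence, it would also rewrite the word INSERT inside a literal value; with parameterized statements the values are not part of the text, so only the keyword is affected.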
8. The data flow migration method from a relational database to an HBase database according to claim 1,
characterized in that: in step S5, storing the data table in HBase database format into the HBase database
includes setting the values of the attributes JDBC Connection Pool, Support Fragmented Transactions, Transaction
Timeout and Batch Size of the PutSQL processor;
the value of the attribute JDBC Connection Pool indicates the type of database to connect to, and is set to Hbase,
i.e. the HBase database is connected;
the value of the attribute Support Fragmented Transactions indicates whether block-wise transmission of the data
flow is supported, and is set to true, i.e. block-wise transmission of the data flow is supported;
the value of the attribute Transaction Timeout indicates the delay time, and is left empty, i.e. the transmission
is not delayed and is performed immediately;
the value of the attribute Batch Size indicates the size of the processing unit, and is set to 100, i.e.
processing is performed in units of 100MB;
after the attributes of the PutSQL processor have been set, the PutSQL processor is executed, thereby storing the
data table in HBase database format into the HBase database.
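The batched storing step in claim 8 can be sketched with a generic Python DB-API connection. This is only an illustration: the helper names `batches` and `store` are hypothetical, and the sketch treats the batch size as a count of 100 statements per commit, which is an assumption (the claim describes the unit as 100MB).

```python
from itertools import islice


def batches(items, batch_size=100):
    """Yield successive lists of at most batch_size items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


def store(conn, statements, batch_size=100):
    """Execute (sql, params) pairs batch by batch, committing once per batch.

    conn is any DB-API 2.0 connection; returns the number of statements run.
    """
    executed = 0
    for chunk in batches(statements, batch_size):
        cursor = conn.cursor()
        for sql, params in chunk:
            cursor.execute(sql, params)
        conn.commit()  # one commit per batch
        executed += len(chunk)
    return executed
```

The sketch can be tried against an in-memory `sqlite3` connection as a stand-in for the target database; committing per batch bounds how much work is lost if one batch fails.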
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811012560.3A (CN109299068A) | 2018-08-31 | 2018-08-31 | Data flow migration method from a relational database to an HBase database |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109299068A | 2019-02-01 |
Family
ID=65165931
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811012560.3A (Pending; CN109299068A) | Data flow migration method from a relational database to an HBase database | 2018-08-31 | 2018-08-31 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109299068A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103440273A (en) * | 2013-08-06 | 2013-12-11 | 北京航空航天大学 | Data cross-platform migration method and device |
CN103631907A (en) * | 2013-11-26 | 2014-03-12 | 中国科学院信息工程研究所 | Method and system for migrating relational data to HBbase |
CN105426506A (en) * | 2015-11-27 | 2016-03-23 | 中国科学院重庆绿色智能技术研究院 | Massive dynamic data management method |
CN106528786A (en) * | 2016-11-08 | 2017-03-22 | 国网山东省电力公司电力科学研究院 | Method and system for rapidly transferring multi-source heterogeneous power grid big data to HBase |
Non-Patent Citations (1)
Title |
---|
SAIKRISHNA TEJA BOBBA: "Ingest Salesforce Data Incrementally Into Hive Using Apache NiFi", 《DZONE HTTPS://DZONE.COM/ARTICLES/ACCESS-DATA-VIA-JDBC-WITH-APACHE-NIFI》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110287172A (en) * | 2019-07-01 | 2019-09-27 | 四川新网银行股份有限公司 | A method of formatting HBase data |
CN110287172B (en) * | 2019-07-01 | 2023-05-02 | 四川新网银行股份有限公司 | Method for formatting HBase data |
CN112559606A (en) * | 2019-09-26 | 2021-03-26 | 北京国双科技有限公司 | Conversion method and conversion device for JSON format data |
CN110704528A (en) * | 2019-10-11 | 2020-01-17 | 苏州易博创云网络科技有限公司 | Data processing method capable of automatic identification and configuration conversion |
CN111177244A (en) * | 2019-12-24 | 2020-05-19 | 四川文轩教育科技有限公司 | Data association analysis method for multiple heterogeneous databases |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190201 |