CN109299068A - Data flow migration method from a relational database to an HBase database - Google Patents


Info

Publication number
CN109299068A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811012560.3A
Other languages
Chinese (zh)
Inventor
邓惠元
范联伟
余保华
徐圣吉
刘春珲
李贤军
胡鸿超
金文林
吴婷婷
徐剑
张国林
张金国
展昭
何宽宽
杨培韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Sun Create Electronic Co Ltd
Original Assignee
Anhui Sun Create Electronic Co Ltd
Application filed by Anhui Sun Create Electronic Co Ltd
Priority to CN201811012560.3A
Publication of CN109299068A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a data flow migration method from a relational database to an HBase database, comprising the following steps: setting the connection attributes of the ExecuteSQL processor of an ETL tool and creating a connection service between the ExecuteSQL processor and the relational database; converting the data table in Avro format into a data table in JSON format with the ConvertAvroToJSON processor; converting the data table in JSON format into a data table in standard relational database format with the ConvertJSONToSQL processor; converting the data table in standard relational database format into a data table in HBase database format with the ReplaceText processor; and storing the data table in HBase database format in the HBase database with the PutSQL processor. The method is operated visually, offers strong process control, and has a lower data flow migration error rate.

Description

Data flow migration method from a relational database to an HBase database
Technical field
The present invention relates to the field of computer application technology, and in particular to a data flow migration method from a relational database to an HBase database.
Background art
Applications commonly need to handle heterogeneous data drawn from multiple databases. The prior art mainly addresses data migration between common relational databases such as MySQL, Oracle, and Microsoft SQL Server. With the emergence and development of non-relational databases, more and more applications use relational and non-relational databases at the same time, so the problem of migrating data between non-relational and relational databases needs to be solved.
The HBase database is a distributed, column-oriented non-relational database. More and more Web applications need to rebuild their data on an HBase database, so how to migrate data from a relational database to an HBase database has become an urgent problem.
Summary of the invention
In view of the problems in the prior art, the present invention provides a data flow migration method from a relational database to an HBase database that is operated visually, offers strong process control, and has a lower data flow migration error rate.
The invention adopts the following technical solution:
A data flow migration method from a relational database to an HBase database, comprising the following steps:
S1. Set the connection attributes of the ExecuteSQL processor of the ETL tool and create a connection service between the ExecuteSQL processor and the relational database; the ExecuteSQL processor then queries the relational database and obtains a data table in Avro format;
S2. Convert the data table in Avro format into a data table in JSON format with the ConvertAvroToJSON processor;
S3. Convert the data table in JSON format into a data table in standard relational database format with the ConvertJSONToSQL processor;
S4. Convert the data table in standard relational database format into a data table in HBase database format with the ReplaceText processor;
S5. Store the data table in HBase database format in the HBase database with the PutSQL processor.
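Steps S1 to S5 form a linear chain in which each processor consumes the format produced by the previous one. The Python sketch below only illustrates that ordering; the names `STAGES`, `check_chain`, and `describe_flow` are hypothetical and are not part of the ETL tool.

```python
# Hypothetical illustration of the S1-S5 processor chain.
# Only the processor names and the data formats come from the text.
STAGES = [
    ("S1", "ExecuteSQL", "relational table", "Avro"),
    ("S2", "ConvertAvroToJSON", "Avro", "JSON"),
    ("S3", "ConvertJSONToSQL", "JSON", "standard SQL"),
    ("S4", "ReplaceText", "standard SQL", "HBase SQL"),
    ("S5", "PutSQL", "HBase SQL", "HBase table"),
]


def check_chain(stages):
    """Return True if every stage's input format equals the previous output."""
    return all(prev[3] == cur[2] for prev, cur in zip(stages, stages[1:]))


def describe_flow(stages):
    """Render the chain as one human-readable line per step."""
    return [f"{step}: {name} ({src} -> {dst})" for step, name, src, dst in stages]


if __name__ == "__main__":
    print("\n".join(describe_flow(STAGES)))
```

A check such as `check_chain` makes the dependency explicit: removing or reordering a stage breaks the format hand-off between neighbours.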
Preferably, in step S1, setting the connection attributes of the ExecuteSQL processor includes setting the values of the attributes Database Connection URL, Database Driver Class Name, Database Driver Location(s), Database User, and Password. The value of Database Connection URL is set to the URL of the relational database; the value of Database Driver Class Name is set to the name of the relational database driver file; the value of Database Driver Location(s) is set to the absolute path of the relational database driver file; the value of Database User is set to the user name used to access the relational database; and the value of Password is set to the password corresponding to that user name. After the attribute values of the ExecuteSQL processor have been set, the ExecuteSQL processor is executed, which completes its query and retrieval of the data table in Avro format.
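Because all five connection attributes must be filled in before the processor can run, a pre-flight check can be sketched as follows. This is a minimal sketch, not the tool's own validation: `missing_attributes` is a hypothetical helper, and the URL, path, and user name in `example` are placeholders.

```python
# The five attribute names are taken from the text; everything else is illustrative.
REQUIRED = (
    "Database Connection URL",
    "Database Driver Class Name",
    "Database Driver Location(s)",
    "Database User",
    "Password",
)


def missing_attributes(config):
    """Return the required attribute names that are absent or empty."""
    return [name for name in REQUIRED if not config.get(name)]


example = {
    "Database Connection URL": "jdbc:mysql://db.example.invalid:3306/test",  # placeholder
    "Database Driver Class Name": "com.mysql.jdbc.Driver",
    "Database Driver Location(s)": "/path/to/mysql-jdbc.jar",  # placeholder
    "Database User": "migration_user",  # placeholder
    "Password": "",  # deliberately left blank to trigger the check
}

if __name__ == "__main__":
    print(missing_attributes(example))
```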
More preferably, in step S1, creating the connection service between the ExecuteSQL processor and the relational database means setting the parameter SQL select query of the ExecuteSQL processor. When SQL select query is set to select * from X, the ExecuteSQL processor queries and retrieves the data table X named in the parameter.
Still more preferably, step S1 further includes configuring timed task scheduling for the ExecuteSQL processor, that is, setting the value of the parameter Max wait time. When Max wait time is set to t, the ExecuteSQL processor waits t seconds after it has successfully created a connection with the relational database and then queries and retrieves the data table in the relational database.
Preferably, in step S2, converting the data table in Avro format into a data table in JSON format includes setting the values of the attributes JSON container options and Wrap single Record of the ConvertAvroToJSON processor. The value of JSON container options specifies how the JSON data table is represented; it is set to array, so the JSON data table is represented as an array. The value of Wrap single Record specifies whether the data flow is processed record by record: if so, it is set to true; if not, it is set to false and the data flow is processed as a single packaged whole. After the attribute values of the ConvertAvroToJSON processor have been set, the ConvertAvroToJSON processor is executed, which completes the conversion of the data table in Avro format into a data table in JSON format.
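The effect of the two attributes can be sketched in Python on records that have already been decoded from Avro into dictionaries. The sketch rests on two assumptions: a container value of none is taken to mean one JSON object per line, and Wrap single Record is read as deciding whether a lone record stays inside the array. The text describes that attribute only loosely, and `records_to_json` is a hypothetical helper.

```python
import json


def records_to_json(records, container="array", wrap_single=True):
    """Serialise decoded records according to the two attributes.

    container="array": emit one JSON array holding every record.
    container="none":  emit one JSON object per line (assumed meaning).
    wrap_single: when exactly one record is present, keep it inside the
    array (True) or emit the bare object (False). Assumed meaning.
    """
    if container not in ("array", "none"):
        raise ValueError(f"unsupported container option: {container!r}")
    if container == "none":
        return "\n".join(json.dumps(record) for record in records)
    if len(records) == 1 and not wrap_single:
        return json.dumps(records[0])
    return json.dumps(records)


if __name__ == "__main__":
    print(records_to_json([{"id": 1}, {"id": 2}]))
```

With the settings used in the text (array, true), downstream steps always receive an array, even for a table with a single row.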
Preferably, in step S3, converting the data table in JSON format into a data table in standard relational database format includes setting the values of the attributes JDBC Connection Pool, Statement Type, Table Name, Translate Field Names, Unmatched Field Behavior, Unmatched Column Behavior, Quote Column Identifiers, and Quote Table Identifiers of the ConvertJSONToSQL processor. The value of JDBC Connection Pool specifies the type of database to connect to; it is set to Hbase, so the HBase database is connected. The value of Statement Type specifies the kind of operation performed on the HBase database; it is set to INSERT, so the operation is an insert. The value of Table Name specifies the name of the data table in the HBase database; it is set to KK_PASS, so the KK_PASS data table of the HBase database is operated on. The value of Translate Field Names specifies whether the data in the data table is processed; it is set to true, so the data in the data table is processed. The value of Unmatched Field Behavior specifies how unmatched fields in the data table are handled; it is set to Ignore Unmatched Fields, so unmatched fields are ignored. The value of Unmatched Column Behavior specifies how unmatched columns in the data table are handled; it is set to Ignore Unmatched Columns, so unmatched columns are ignored. The value of Quote Column Identifiers specifies whether the column names in the data table are modified; it is set to false, so the column names are left unmodified. The value of Quote Table Identifiers specifies whether the name of the data table is modified; it is set to false, so the table name is left unmodified. After the attributes of the ConvertJSONToSQL processor have been set, the ConvertJSONToSQL processor is executed, which completes the conversion of the data table in JSON format into a data table in standard relational database format.
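The conversion in step S3 can be pictured as turning one flat JSON object into a parameterised INSERT statement for the target table. The Python sketch below is an approximation under stated assumptions, not the processor's implementation: `json_to_insert` is a hypothetical helper, the column names ID and PLATE are invented for the example, and case-insensitive name matching is an assumed reading of the name-translation setting.

```python
import json


def json_to_insert(json_text, table, table_columns):
    """Build a parameterised INSERT statement for one flat JSON object.

    Fields that match no table column are skipped, mirroring the
    "Ignore Unmatched Fields" setting. Names are compared
    case-insensitively, which is an assumption.
    """
    record = json.loads(json_text)
    known = {column.upper(): column for column in table_columns}
    columns, params = [], []
    for field, value in record.items():
        column = known.get(field.upper())
        if column is None:
            continue  # unmatched field: ignore it
        columns.append(column)
        params.append(value)
    if not columns:
        raise ValueError("no JSON field matches a table column")
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, params


if __name__ == "__main__":
    # ID and PLATE are hypothetical columns of KK_PASS.
    print(json_to_insert('{"id": 7, "plate": "A1", "extra": true}',
                         "KK_PASS", ["ID", "PLATE"]))
```

Keeping the values as parameters rather than inlining them means the later text replacement only ever sees the statement keywords and identifiers.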
Preferably, in step S4, converting the data table in standard relational database format into a data table in HBase database format includes setting the values of the attributes Search Value, Replacement Value, Character Set, Maximum Buffer Size, Replacement Strategy, and Evaluation Mode of the ReplaceText processor. The value of Search Value specifies the text to search for; it is set to INSERT, so the text INSERT is searched for. The value of Replacement Value specifies the text that replaces the search text; it is set to UPSERT, so every INSERT is replaced with UPSERT. The value of Character Set specifies the character encoding used to process Chinese text; it is set to UTF-8. The value of Maximum Buffer Size specifies the maximum buffer space; it is set to 1MB. The value of Replacement Strategy specifies the replacement strategy; it is set to Regex Replace, so matching is performed with a regular expression. The value of Evaluation Mode specifies the scope of the replacement; it is set to Entire text, so the entire text is processed. After the attributes of the ReplaceText processor have been set, the ReplaceText processor is executed, which completes the conversion of the data table in standard relational database format into a data table in HBase database format.
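The replacement in step S4 amounts to a regular-expression substitution over the whole statement text. A minimal Python sketch with the same search value, replacement value, encoding, and buffer limit follows; `insert_to_upsert` and `MAX_BUFFER_BYTES` are hypothetical names, and rejecting oversized input is an assumed behaviour.

```python
import re

MAX_BUFFER_BYTES = 1024 * 1024  # the 1MB Maximum Buffer Size from the text


def insert_to_upsert(sql_text, search=r"INSERT", replacement="UPSERT"):
    """Regex-replace every match of `search` across the entire text."""
    if len(sql_text.encode("utf-8")) > MAX_BUFFER_BYTES:
        raise ValueError("content exceeds the 1MB buffer")
    return re.sub(search, replacement, sql_text)


if __name__ == "__main__":
    print(insert_to_upsert("INSERT INTO KK_PASS (ID, PLATE) VALUES (?, ?)"))
```

Note that the unanchored pattern INSERT also matches inside identifiers that happen to contain that word, so an anchored pattern such as `^INSERT` may be a safer choice when table or column names could collide.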
Preferably, in step S5, storing the data table in HBase database format in the HBase database includes setting the values of the attributes JDBC Connection Pool, Support Fragmented Transactions, Transaction Timeout, and Batch Size of the PutSQL processor. The value of JDBC Connection Pool specifies the type of database to connect to; it is set to Hbase, so the HBase database is connected. The value of Support Fragmented Transactions specifies whether the data flow may be transferred in fragments; it is set to true, so fragmented transfer of the data flow is supported. The value of Transaction Timeout specifies the delay time; it is left empty, so the transfer is not delayed and is executed immediately. The value of Batch Size specifies the size of the processing unit; it is set to 100, so data is processed in units of 100MB. After the attributes of the PutSQL processor have been set, the PutSQL processor is executed, which completes the storage of the data table in HBase database format in the HBase database.
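Batched writing can be sketched in Python as splitting the stream of statements into fixed-size groups. The sketch treats Batch Size as a count of statements per group, which is an assumption, since the text describes the value as the size of a processing unit; `batches` is a hypothetical helper.

```python
from itertools import islice


def batches(statements, batch_size=100):
    """Yield successive lists of at most `batch_size` statements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(statements)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    statements = [f"UPSERT INTO KK_PASS (ID) VALUES ({i})" for i in range(250)]
    print([len(group) for group in batches(statements)])
```

Each yielded group would then be sent to the database in one round trip, so a larger batch size trades memory for fewer round trips.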
The advantages and beneficial effects of the present invention are as follows:
1) The present invention is an ETL-tool-based method for migrating a data flow from a relational database to an HBase database. The ETL tool includes the ExecuteSQL, ConvertAvroToJSON, ConvertJSONToSQL, ReplaceText, and PutSQL processors. By setting the attribute values of these processors, the data flow is migrated from the relational database to the HBase database and the data table is finally stored in the HBase database. Throughout the migration, the attribute values of every processor are set visually and the operating process is more controllable, which greatly reduces the error rate of the data flow migration.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Fig. 2 is the first settings diagram for the attribute values of the ExecuteSQL processor of the present invention.
Fig. 3 is the second settings diagram for the attribute values of the ExecuteSQL processor of the present invention.
Fig. 4 is the settings diagram for the attribute values of the ConvertAvroToJSON processor of the present invention.
Fig. 5 is the settings diagram for the attribute values of the ConvertJSONToSQL processor of the present invention.
Fig. 6 is the settings diagram for the attribute values of the ReplaceText processor of the present invention.
Fig. 7 is the settings diagram for the attribute values of the PutSQL processor of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
The ETL tool includes the ExecuteSQL, ConvertAvroToJSON, ConvertJSONToSQL, ReplaceText, and PutSQL processors. As shown in Fig. 1, the data flow migration method from a relational database to an HBase database comprises the following steps:
S1. Set the connection attributes of the ExecuteSQL processor of the ETL tool and create a connection service between the ExecuteSQL processor and the relational database; the ExecuteSQL processor then queries the relational database and obtains a data table in Avro format;
Specifically, setting the connection attributes of the ExecuteSQL processor includes setting the values of the attributes Database Connection URL, Database Driver Class Name, Database Driver Location(s), Database User, and Password. The value of Database Connection URL is set to the URL of the relational database; the value of Database Driver Class Name is set to the name of the relational database driver file; the value of Database Driver Location(s) is set to the absolute path of the relational database driver file; the value of Database User is set to the user name used to access the relational database; and the value of Password is set to the password corresponding to that user name. After the attribute values of the ExecuteSQL processor have been set, the ExecuteSQL processor is executed, which completes its query and retrieval of the data table in Avro format.
Creating the connection service between the ExecuteSQL processor and the relational database means setting the parameter SQL select query of the ExecuteSQL processor. When SQL select query is set to select * from X, the ExecuteSQL processor queries and retrieves the data table X named in the parameter.
Specifically, step S1 further includes configuring timed task scheduling for the ExecuteSQL processor, that is, setting the value of the parameter Max wait time. When Max wait time is set to t, the ExecuteSQL processor waits t seconds after it has successfully created a connection with the relational database and then queries and retrieves the data table in the relational database.
S2. Convert the data table in Avro format into a data table in JSON format with the ConvertAvroToJSON processor;
Specifically, converting the data table in Avro format into a data table in JSON format includes setting the values of the attributes JSON container options and Wrap single Record of the ConvertAvroToJSON processor. The value of JSON container options specifies how the JSON data table is represented; it is set to array, so the JSON data table is represented as an array. The value of Wrap single Record specifies whether the data flow is processed record by record: if so, it is set to true; if not, it is set to false and the data flow is processed as a single packaged whole. After the attribute values of the ConvertAvroToJSON processor have been set, the ConvertAvroToJSON processor is executed, which completes the conversion of the data table in Avro format into a data table in JSON format.
S3. Convert the data table in JSON format into a data table in standard relational database format with the ConvertJSONToSQL processor;
Specifically, converting the data table in JSON format into a data table in standard relational database format includes setting the values of the attributes JDBC Connection Pool, Statement Type, Table Name, Translate Field Names, Unmatched Field Behavior, Unmatched Column Behavior, Quote Column Identifiers, and Quote Table Identifiers of the ConvertJSONToSQL processor. The value of JDBC Connection Pool specifies the type of database to connect to; it is set to Hbase, so the HBase database is connected. The value of Statement Type specifies the kind of operation performed on the HBase database; it is set to INSERT, so the operation is an insert. The value of Table Name specifies the name of the data table in the HBase database; it is set to KK_PASS, so the KK_PASS data table of the HBase database is operated on. The value of Translate Field Names specifies whether the data in the data table is processed; it is set to true, so the data in the data table is processed. The value of Unmatched Field Behavior specifies how unmatched fields in the data table are handled; it is set to Ignore Unmatched Fields, so unmatched fields are ignored. The value of Unmatched Column Behavior specifies how unmatched columns in the data table are handled; it is set to Ignore Unmatched Columns, so unmatched columns are ignored. The value of Quote Column Identifiers specifies whether the column names in the data table are modified; it is set to false, so the column names are left unmodified. The value of Quote Table Identifiers specifies whether the name of the data table is modified; it is set to false, so the table name is left unmodified. After the attributes of the ConvertJSONToSQL processor have been set, the ConvertJSONToSQL processor is executed, which completes the conversion of the data table in JSON format into a data table in standard relational database format.
S4. Convert the data table in standard relational database format into a data table in HBase database format with the ReplaceText processor;
Specifically, converting the data table in standard relational database format into a data table in HBase database format includes setting the values of the attributes Search Value, Replacement Value, Character Set, Maximum Buffer Size, Replacement Strategy, and Evaluation Mode of the ReplaceText processor. The value of Search Value specifies the text to search for; it is set to INSERT, so the text INSERT is searched for. The value of Replacement Value specifies the text that replaces the search text; it is set to UPSERT, so every INSERT is replaced with UPSERT. The value of Character Set specifies the character encoding used to process Chinese text; it is set to UTF-8. The value of Maximum Buffer Size specifies the maximum buffer space; it is set to 1MB. The value of Replacement Strategy specifies the replacement strategy; it is set to Regex Replace, so matching is performed with a regular expression. The value of Evaluation Mode specifies the scope of the replacement; it is set to Entire text, so the entire text is processed. After the attributes of the ReplaceText processor have been set, the ReplaceText processor is executed, which completes the conversion of the data table in standard relational database format into a data table in HBase database format.
S5. Store the data table in HBase database format in the HBase database with the PutSQL processor.
Specifically, storing the data table in HBase database format in the HBase database includes setting the values of the attributes JDBC Connection Pool, Support Fragmented Transactions, Transaction Timeout, and Batch Size of the PutSQL processor. The value of JDBC Connection Pool specifies the type of database to connect to; it is set to Hbase, so the HBase database is connected. The value of Support Fragmented Transactions specifies whether the data flow may be transferred in fragments; it is set to true, so fragmented transfer of the data flow is supported. The value of Transaction Timeout specifies the delay time; it is left empty, so the transfer is not delayed and is executed immediately. The value of Batch Size specifies the size of the processing unit; it is set to 100, so data is processed in units of 100MB. After the attributes of the PutSQL processor have been set, the PutSQL processor is executed, which completes the storage of the data table in HBase database format in the HBase database.
The method of the present invention is described in detail below with reference to an embodiment and the accompanying drawings.
Embodiment:
1. Set the connection attributes of the ExecuteSQL processor of the ETL tool.
As shown in Fig. 2, the attribute values of the ExecuteSQL processor are set as follows:
Database Connection URL: jdbc:mysql://192.168.99.140:3306/test;
Database Driver Class Name: com.mysql.jdbc.Driver;
Database Driver Location(s): /nifi/mysql-jdbc.jar;
Database User: the user name used to access the database;
Password: the password corresponding to the user name;
Meanwhile, as shown in Fig. 3, the connection service between the ExecuteSQL processor and the relational database and the timed task scheduling of the ExecuteSQL processor are set as follows:
SQL select query: select * from YW_MYSQL;
Max wait time: seconds.
After the attribute values of the ExecuteSQL processor have been set, the ExecuteSQL processor is executed, which completes its query and retrieval of the data table.
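What the query step hands to the rest of the flow is a set of records, one per row of YW_MYSQL. The Python sketch below reproduces that with an in-memory SQLite database standing in for the MySQL source, so it runs without a server; the columns ID and NAME and the helpers `fetch_records` and `demo` are hypothetical.

```python
import sqlite3


def fetch_records(conn, query):
    """Run the query and return each row as a column-name -> value dict."""
    cursor = conn.execute(query)
    names = [description[0] for description in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


def demo():
    """Query a stand-in YW_MYSQL table with the statement from the embodiment."""
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL source
    try:
        # Hypothetical columns; the text does not list the table's schema.
        conn.execute("CREATE TABLE YW_MYSQL (ID INTEGER PRIMARY KEY, NAME TEXT)")
        conn.executemany("INSERT INTO YW_MYSQL VALUES (?, ?)", [(1, "a"), (2, "b")])
        return fetch_records(conn, "select * from YW_MYSQL")
    finally:
        conn.close()


if __name__ == "__main__":
    print(demo())
```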
2. Convert the data table in Avro format into a data table in JSON format. As shown in Fig. 4, the ConvertAvroToJSON processor is set as follows:
JSON container options: array;
Wrap single Record: true;
After the attribute values of the ConvertAvroToJSON processor have been set, the ConvertAvroToJSON processor is executed, which completes the conversion of the data table in Avro format into a data table in JSON format.
3. Convert the data table in JSON format into a data table in standard relational database format. As shown in Fig. 5, the ConvertJSONToSQL processor is set as follows:
JDBC Connection Pool: Hbase;
Statement Type: INSERT;
Table Name: KK_PASS;
Translate Field Names: true;
Unmatched Field Behavior: Ignore Unmatched Fields;
Unmatched Column Behavior: Ignore Unmatched Columns;
Quote Column Identifiers: false;
Quote Table Identifiers: false;
After the attributes of the ConvertJSONToSQL processor have been set, the ConvertJSONToSQL processor is executed, which completes the conversion of the data table in JSON format into a data table in standard relational database format.
4. Convert the data table in standard relational database format into a data table in HBase database format. As shown in Fig. 6, the ReplaceText processor is set as follows:
Search Value: INSERT;
Replacement Value: UPSERT;
Character Set: UTF-8;
Maximum Buffer Size: 1MB;
Replacement Strategy: Regex Replace;
Evaluation Mode: Entire text;
After the attributes of the ReplaceText processor have been set, the ReplaceText processor is executed, which completes the conversion of the data table in standard relational database format into a data table in HBase database format.
5. Store the data table in HBase database format in the HBase database. As shown in Fig. 7, the PutSQL processor is set as follows:
JDBC Connection Pool: Hbase;
Support Fragmented Transactions: true;
Transaction Timeout: empty;
Batch Size: 100;
After the attributes of the PutSQL processor have been set, the PutSQL processor is executed, which completes the storage of the data table in HBase database format in the HBase database.
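The five embodiment steps can be simulated end to end in Python with two in-memory SQLite databases standing in for the MySQL source and the HBase target. This is an illustrative sketch only: the columns ID and NAME and the helpers `migrate` and `demo` are hypothetical, and because SQLite has no UPSERT keyword the sketch maps UPSERT INTO to INSERT OR REPLACE INTO, on the assumption that UPSERT means insert-or-overwrite.

```python
import json
import re
import sqlite3


def migrate(source, target, src_table, dst_table, dst_columns):
    """Copy rows from src_table to dst_table through stages S1 to S5."""
    cursor = source.execute(f"select * from {src_table}")           # S1: query
    names = [description[0] for description in cursor.description]
    json_text = json.dumps([dict(zip(names, row)) for row in cursor])  # S2: JSON array
    count = 0
    for record in json.loads(json_text):
        columns = [c for c in dst_columns if c in record]            # S3: build INSERT
        marks = ", ".join("?" for _ in columns)
        sql = f"INSERT INTO {dst_table} ({', '.join(columns)}) VALUES ({marks})"
        sql = re.sub(r"INSERT", "UPSERT", sql)                       # S4: INSERT -> UPSERT
        # Stand-in only: SQLite spells insert-or-overwrite differently.
        sql = sql.replace("UPSERT INTO", "INSERT OR REPLACE INTO")
        target.execute(sql, [record[c] for c in columns])            # S5: store
        count += 1
    target.commit()
    return count


def demo():
    """Run the migration twice and report (first run, second run, rows stored)."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    try:
        source.execute("CREATE TABLE YW_MYSQL (ID INTEGER PRIMARY KEY, NAME TEXT)")
        source.executemany("INSERT INTO YW_MYSQL VALUES (?, ?)", [(1, "a"), (2, "b")])
        target.execute("CREATE TABLE KK_PASS (ID INTEGER PRIMARY KEY, NAME TEXT)")
        first = migrate(source, target, "YW_MYSQL", "KK_PASS", ["ID", "NAME"])
        second = migrate(source, target, "YW_MYSQL", "KK_PASS", ["ID", "NAME"])
        total = target.execute("SELECT COUNT(*) FROM KK_PASS").fetchone()[0]
        return first, second, total
    finally:
        source.close()
        target.close()


if __name__ == "__main__":
    print(demo())
```

Running the migration a second time overwrites the existing rows instead of failing on duplicate keys, which is the practical reason for rewriting INSERT before the statements reach the target.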
In conclusion, the present invention provides a data flow migration method from a relational database to an HBase database that is operated visually throughout, offers strong process control, and has a lower data flow migration error rate.

Claims (8)

1. A data flow migration method from a relational database to an HBase database, characterized by comprising the following steps:
S1. Set the connection attributes of the ExecuteSQL processor of the ETL tool and create a connection service between the ExecuteSQL processor and the relational database; the ExecuteSQL processor then queries the relational database and obtains a data table in Avro format;
S2. Convert the data table in Avro format into a data table in JSON format with the ConvertAvroToJSON processor;
S3. Convert the data table in JSON format into a data table in standard relational database format with the ConvertJSONToSQL processor;
S4. Convert the data table in standard relational database format into a data table in HBase database format with the ReplaceText processor;
S5. Store the data table in HBase database format in the HBase database with the PutSQL processor.
2. The data flow migration method from a relational database to an HBase database according to claim 1, characterized in that: in step S1, setting the connection attributes of the ExecuteSQL processor includes setting the values of the attributes Database Connection URL, Database Driver Class Name, Database Driver Location(s), Database User, and Password; the value of Database Connection URL is set to the URL of the relational database, the value of Database Driver Class Name is set to the name of the relational database driver file, the value of Database Driver Location(s) is set to the absolute path of the relational database driver file, the value of Database User is set to the user name used to access the relational database, and the value of Password is set to the password corresponding to that user name; after the attribute values of the ExecuteSQL processor have been set, the ExecuteSQL processor is executed, which completes its query and retrieval of the data table in Avro format.
3. The data flow migration method from a relational database to an HBase database according to claim 2, characterized in that: in step S1, creating the connection service between the ExecuteSQL processor and the relational database means setting the parameter SQL select query of the ExecuteSQL processor; when SQL select query is set to select * from X, the ExecuteSQL processor queries and retrieves the data table X named in the parameter.
4. The data flow migration method from a relational database to an HBase database according to claim 3, characterized in that: step S1 further includes configuring timed task scheduling for the ExecuteSQL processor, that is, setting the value of the parameter Max wait time; when Max wait time is set to t, the ExecuteSQL processor waits t seconds after it has successfully created a connection with the relational database and then queries and retrieves the data table in the relational database.
5. The data flow migration method from a relational database to an HBase database according to claim 1, characterized in that: in step S2, converting the data table in Avro format into a data table in JSON format includes setting the values of the attributes JSON container options and Wrap single Record of the ConvertAvroToJSON processor; the value of JSON container options specifies how the JSON data table is represented, and it is set to array, so the JSON data table is represented as an array; the value of Wrap single Record specifies whether the data flow is processed record by record: if so, it is set to true; if not, it is set to false and the data flow is processed as a single packaged whole; after the attribute values of the ConvertAvroToJSON processor have been set, the ConvertAvroToJSON processor is executed, which completes the conversion of the data table in Avro format into a data table in JSON format.
6. The data flow migration method from a relational database to an HBase database according to claim 1, characterized in that: in step S3, converting the data table in JSON format into a data table in standard relational database format includes setting the values of the attributes JDBC Connection Pool, Statement Type, Table Name, Translate Field Names, Unmatched Field Behavior, Unmatched Column Behavior, Quote Column Identifiers and Quote Table Identifiers of the ConvertJSONToSQL processor; the value of the attribute JDBC Connection Pool indicates the type of database to connect to, and it is set to Hbase, that is, the HBase database is connected; the value of the attribute Statement Type indicates the kind of operation performed on the HBase database, and it is set to INSERT, that is, the operation on the HBase database is an insert; the value of the attribute Table Name indicates the name of the data table in the HBase database, and it is set to KK_PASS, that is, the KK_PASS data table of the HBase database is operated on; the value of the attribute Translate Field Names indicates whether the data in the data table are processed, and it is set to true, that is, the data in the data table are processed; the value of the attribute Unmatched Field Behavior indicates whether unmatched fields in the data table are processed, and it is set to Ignore Unmatched Fields, that is, unmatched fields are ignored; the value of the attribute Unmatched Column Behavior indicates whether unmatched columns in the data table are processed, and it is set to Ignore Unmatched Columns, that is, unmatched columns are ignored; the value of the attribute Quote Column Identifiers indicates whether the column names in the data table are modified, and it is set to false, that is, the column names are left unmodified; the value of the attribute Quote Table Identifiers indicates whether the name of the data table is modified, and it is set to false, that is, the table name is left unmodified; after the attributes of the ConvertJSONToSQL processor have been set, the ConvertJSONToSQL processor is executed, completing the conversion of the JSON-format data table into the data table in standard relational database format.
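The conversion in claim 6 can be sketched as building a parameterized INSERT statement from a flat JSON record. This is a minimal illustration under stated assumptions, not the ConvertJSONToSQL implementation: the function name json_to_insert and the column names ID, PLATE and PASS_TIME are hypothetical, and only the table name KK_PASS comes from the claim.

```python
import json


def json_to_insert(flat_json, table_name, table_columns):
    """Build (sql, args) for one flat JSON record.

    JSON fields with no matching column are ignored, and columns with
    no matching JSON field are left out of the statement.
    """
    record = json.loads(flat_json)
    matched = [column for column in table_columns if column in record]
    if not matched:
        raise ValueError("no JSON field matches a table column")
    placeholders = ", ".join("?" for _ in matched)
    # Identifiers are emitted unquoted, as when both quoting attributes are false.
    sql = f"INSERT INTO {table_name} ({', '.join(matched)}) VALUES ({placeholders})"
    return sql, [record[column] for column in matched]


sql, args = json_to_insert(
    '{"ID": 1, "PLATE": "A123", "EXTRA": "x"}',
    "KK_PASS",
    ["ID", "PLATE", "PASS_TIME"],
)
```

Keeping the values out of the statement text and passing them as arguments means a later text replacement on the statement cannot alter the data.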
7. The data flow migration method from a relational database to an HBase database according to claim 1, characterized in that: in step S4, converting the data table in standard relational database format into a data table in HBase database format includes setting the values of the attributes Search Value, Replacement Value, Character Set, Maximum Buffer Size, Replacement Strategy and Evaluation Mode of the ReplaceText processor; the value of the attribute Search Value indicates the field to search for, and it is set to INSERT, that is, the field INSERT is searched for; the value of the attribute Replacement Value indicates the field that replaces the search field, and it is set to UPSERT, that is, every INSERT is replaced with UPSERT; the value of the attribute Character Set indicates the encoding used to process Chinese text, and it is set to UTF-8, that is, Chinese text is processed as UTF-8; the value of the attribute Maximum Buffer Size indicates the maximum buffer space, and it is set to 1MB, that is, the maximum buffer space is 1 MB; the value of the attribute Replacement Strategy indicates the replacement strategy for the field, and it is set to Regex Replace, that is, fields are matched by regular expression; the value of the attribute Evaluation Mode indicates the scope over which the field is replaced, and it is set to Entire text, that is, the whole text is processed; after the attributes of the ReplaceText processor have been set, the ReplaceText processor is executed, completing the conversion of the data table in standard relational database format into the data table in HBase database format.
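The replacement in claim 7 amounts to a regular-expression substitution over the whole statement text. The sketch below is an illustration, not the ReplaceText implementation: the function name replace_text is hypothetical, and raising an error for content over the buffer limit is an assumed stand-in for however the processor handles oversized content. UPSERT is the write statement of SQL layers over HBase such as Apache Phoenix; the claim itself does not name the layer.

```python
import re

MAX_BUFFER_BYTES = 1024 * 1024  # 1 MB, the buffer size named in the claim


def replace_text(content, search=r"INSERT", replacement="UPSERT", charset="utf-8"):
    """Regex-replace over the entire content, decoded with the given charset."""
    if len(content) > MAX_BUFFER_BYTES:
        # Assumption: oversized content is rejected rather than processed.
        raise ValueError("content exceeds the maximum buffer size")
    text = content.decode(charset)
    return re.sub(search, replacement, text).encode(charset)


statement = "INSERT INTO KK_PASS (NAME) VALUES ('卡口')".encode("utf-8")
converted = replace_text(statement).decode("utf-8")
```

Decoding as UTF-8 before matching keeps multi-byte Chinese characters intact in the rewritten statement.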
8. The data flow migration method from a relational database to an HBase database according to claim 1, characterized in that: in step S5, storing the data table in HBase database format into the HBase database includes setting the values of the attributes JDBC Connection Pool, Support Fragmented Transactions, Transaction Timeout and Batch Size of the PutSQL processor; the value of the attribute JDBC Connection Pool indicates the type of database to connect to, and it is set to Hbase, that is, the HBase database is connected; the value of the attribute Support Fragmented Transactions indicates whether the data flow may be transmitted in blocks, and it is set to true, that is, block-wise transmission of the data flow is supported; the value of the attribute Transaction Timeout indicates the delay time, and it is left empty, that is, the transmission is not delayed and is executed immediately; the value of the attribute Batch Size indicates the size of one processing unit, and it is set to 100, that is, processing is carried out in units of 100 MB; after the attributes of the PutSQL processor have been set, the PutSQL processor is executed, completing the storage of the data table in HBase database format into the HBase database.
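The batched write in claim 8 can be sketched in Python as follows. This is an illustration under stated assumptions, not the PutSQL implementation: an in-memory sqlite3 database stands in for HBase (so the stand-in statements use INSERT rather than UPSERT), the function name put_sql and the column ID are hypothetical, and the batch size is treated as a count of statements, whereas the claim describes it as a unit of 100 MB.

```python
import sqlite3


def put_sql(conn, statements, batch_size=100):
    """Execute (sql, args) pairs in batches, one transaction per batch.

    Returns the number of batches committed.
    """
    committed = 0
    for start in range(0, len(statements), batch_size):
        batch = statements[start:start + batch_size]
        with conn:  # commits on success, rolls back if a statement fails
            for sql, args in batch:
                conn.execute(sql, args)
        committed += 1
    return committed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE KK_PASS (ID INTEGER)")
statements = [("INSERT INTO KK_PASS (ID) VALUES (?)", (i,)) for i in range(250)]
batches = put_sql(conn, statements)
stored = conn.execute("SELECT COUNT(*) FROM KK_PASS").fetchone()[0]
```

Committing per batch bounds how much work is lost if one statement fails, at the cost of one commit per batch rather than one per run.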
CN201811012560.3A 2018-08-31 2018-08-31 Data flow migration method from relational database to HBase database Pending CN109299068A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811012560.3A CN109299068A (en) 2018-08-31 2018-08-31 Data flow migration method from relational database to HBase database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811012560.3A CN109299068A (en) 2018-08-31 2018-08-31 Data flow migration method from relational database to HBase database

Publications (1)

Publication Number Publication Date
CN109299068A true CN109299068A (en) 2019-02-01

Family

ID=65165931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811012560.3A Pending CN109299068A (en) 2018-08-31 2018-08-31 Data flow migration method from relational database to HBase database

Country Status (1)

Country Link
CN (1) CN109299068A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103440273A (en) * 2013-08-06 2013-12-11 北京航空航天大学 Data cross-platform migration method and device
CN103631907A (en) * 2013-11-26 2014-03-12 中国科学院信息工程研究所 Method and system for migrating relational data to HBbase
CN105426506A (en) * 2015-11-27 2016-03-23 中国科学院重庆绿色智能技术研究院 Massive dynamic data management method
CN106528786A (en) * 2016-11-08 2017-03-22 国网山东省电力公司电力科学研究院 Method and system for rapidly transferring multi-source heterogeneous power grid big data to HBase

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SAIKRISHNA TEJA BOBBA: "Ingest Salesforce Data Incrementally Into Hive Using Apache NiFi", 《DZONE HTTPS://DZONE.COM/ARTICLES/ACCESS-DATA-VIA-JDBC-WITH-APACHE-NIFI》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287172A (en) * 2019-07-01 2019-09-27 四川新网银行股份有限公司 A method of formatting HBase data
CN110287172B (en) * 2019-07-01 2023-05-02 四川新网银行股份有限公司 Method for formatting HBase data
CN112559606A (en) * 2019-09-26 2021-03-26 北京国双科技有限公司 Conversion method and conversion device for JSON format data
CN110704528A (en) * 2019-10-11 2020-01-17 苏州易博创云网络科技有限公司 Data processing method capable of automatic identification and configuration conversion
CN111177244A (en) * 2019-12-24 2020-05-19 四川文轩教育科技有限公司 Data association analysis method for multiple heterogeneous databases

Similar Documents

Publication Publication Date Title
CN109299068A (en) Data flow migration method from relational database to HBase database
US20220035815A1 (en) Processing database queries using format conversion
CN105531698B (en) Equipment, system and method for batch and real time data processing
US9002813B2 (en) Execution plan preparation in application server
KR20200106950A (en) Dimensional context propagation techniques for optimizing SQL query plans
US9146979B2 (en) Optimization of business warehouse queries by calculation engines
US10102269B2 (en) Object query model for analytics data access
US9218373B2 (en) In-memory data profiling
EP1368745A2 (en) Item name normalization
US9846714B2 (en) Database device
US10776353B2 (en) Application programming interface for database access
US10838959B2 (en) Harmonized structured query language and non-structured query language query processing
US11409722B2 (en) Database live reindex
CN104133870A (en) Web page similarity calculation method and web page similarity calculation device
US10497039B1 (en) Techniques for dynamic variations of a search query
CN109284469B (en) Webpage development framework
US20130060795A1 (en) Prepared statements to improve performance in database interfaces
CN105574027A (en) On-line transaction processing/on-line analytical processing (OLTP/OLAP) hybrid application based multi-dimensional performance data storage method, device and system
US10789249B2 (en) Optimal offset pushdown for multipart sorting
US10599728B1 (en) Metadata agent for query management
US20140344245A1 (en) Calculation Engine with Optimized Multi-Part Querying
US9852162B2 (en) Defining a set of data across multiple databases using variables and functions
US20240176803A1 (en) Simplified schema generation for data ingestion
CN107908785A (en) Incorporeity class based on SSM frames realizes data page
Jánki et al. Full-stack FHIR-based MBaaS with server-and client-side caching capable WebDAO

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190201