CN112288586A - Insurance industry data integration method based on HBase and related equipment - Google Patents

Insurance industry data integration method based on HBase and related equipment

Info

Publication number
CN112288586A
Authority
CN
China
Prior art keywords
data
hbase
data integration
row
data source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011312448.9A
Other languages
Chinese (zh)
Inventor
范铮
陈学亮
赵星光
高擎阳
袁利鸥
曲明钰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Life Insurance Co Ltd China
Original Assignee
China Life Insurance Co Ltd China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Life Insurance Co Ltd China
Priority to CN202011312448.9A
Publication of CN112288586A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

One or more embodiments of the present specification provide an insurance industry data integration method based on HBase and related devices. The method comprises the following steps: performing reverse-order or hash processing on the policy number, and using the processed policy number as the row key (rowkey) of an HBase data integration model table in the open source database HBase; setting the column name of each column in the HBase data integration model table to the table name of a data source table to be integrated plus the primary key value of each row of that data source table; and splicing each row of data of the data source table into a JSON character string and storing the JSON character strings in the corresponding fields of the HBase data integration model table. The method and related equipment provided by the specification exploit the technical advantages of HBase together with the characteristics of insurance industry data, and address the problems of difficult data updating, high performance consumption during data integration, and high field expansion cost in existing data integration schemes.

Description

Insurance industry data integration method based on HBase and related equipment
Technical Field
One or more embodiments of the present disclosure relate to the field of big data technologies, and in particular, to an insurance industry data integration method based on HBase and a related device.
Background
In the course of building its information systems, an insurance company purchases or develops many systems, such as a core business transaction system and a client resource management system. As the business develops, these systems keep evolving, and several systems may generate business data at the same time. This data is a valuable asset for insurance enterprises, but it is scattered across information islands and cannot deliver its value. How to effectively integrate data that is large in volume, varied in type, fast-growing and low in value density is a difficult problem for every insurance company.
In the prior art, an insurance industry data warehouse provides a data integration function. It usually establishes a number of theme tables according to business processes (such as policy, insurance, claim settlement, etc.) and integrates the data of multiple systems into these theme tables. However, a characteristic of the insurance industry is that business data is not only newly added (insert) but also frequently updated (update). During data integration, handling an update in a traditional relational database is much more complicated than a simple insert: the original data often has to be deleted and then rewritten, so the system performance overhead is high.
Based on this, a data integration method that can realize simple update and small system performance overhead is needed.
Disclosure of Invention
In view of the above, one or more embodiments of the present disclosure are directed to an insurance industry data integration method based on HBase and related equipment, so as to overcome all or part of the deficiencies in the prior art.
In view of the above, one or more embodiments of the present specification provide an insurance industry data integration method based on HBase, including:
determining a row key of an open source database HBase data integration model table according to a policy number in at least one data source table to be integrated;
setting the column name of each column of the HBase data integration model table according to the table name of the at least one data source table;
splicing fields of each row of data of the at least one data source table into a character string respectively; and
storing the character string into a corresponding field in the HBase data integration model table; the row key of the corresponding field is a row key determined by the policy number corresponding to the character string; and the column name of the corresponding field is the column name set by the table name of the data source table where the character string is located.
Based on the same inventive concept, one or more embodiments of the present specification further provide an insurance industry data integration apparatus based on HBase, including:
the determining module is configured to determine a row key of an open source database HBase data integration model table according to the policy number in at least one data source table to be integrated;
the setting module is configured to set a column name of each column of the HBase data integration model table according to the table name of the at least one data source table;
the splicing and storing module is used for splicing the fields of each row of data of the at least one data source table into a character string; and
storing the character string into a corresponding field in the HBase data integration model table; the row key of the corresponding field is a row key determined by the policy number corresponding to the character string; and the column name of the corresponding field is the column name set by the table name of the data source table where the character string is located.
Based on the same inventive concept, one or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the method as described in any one of the above items when executing the program.
Based on the same inventive concept, one or more embodiments of the present specification also provide a non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores computer instructions for causing the computer to perform the method as described in any one of the above.
As can be seen from the above, the insurance industry data integration method based on the HBase and the related device provided in one or more embodiments of the present disclosure form a data integration scheme by using the technical advantages of the HBase and combining the characteristics of insurance industry data, and solve the problems of difficulty in data updating, high performance consumption in the data integration process, and high field expansion cost of the data integration scheme.
Drawings
In order to more clearly illustrate one or more embodiments or prior art solutions of the present specification, the drawings that are needed in the description of the embodiments or prior art will be briefly described below, and it is obvious that the drawings in the following description are only one or more embodiments of the present specification, and that other drawings may be obtained by those skilled in the art without inventive effort from these drawings.
FIG. 1 is a flow diagram of a HBase-based insurance industry data integration method according to one or more embodiments of the present disclosure;
FIG. 2 is a schematic diagram of an HBase-based charging table data integration method in one or more embodiments of the present disclosure;
FIG. 3 is a schematic diagram of an HBase-based insurance industry data integration method in accordance with one or more embodiments of the present disclosure;
FIG. 4 is a schematic structural diagram of an HBase-based insurance industry data integration apparatus according to one or more embodiments of the present disclosure;
FIG. 5 is a schematic structural diagram of an electronic device according to one or more embodiments of the present disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It is to be noted that, unless otherwise defined, technical or scientific terms used in one or more embodiments of the present specification should have the ordinary meaning understood by those of ordinary skill in the art to which this disclosure belongs. The terms "comprising", "including" and the like, as used in one or more embodiments of the present specification, mean that the element or item preceding the term covers the elements or items listed after the term and their equivalents, without excluding other elements or items.
As described in the background section, the existing insurance industry data integration method has the problems of complex updating, high system performance overhead and the like. In the process of implementing the present disclosure, the applicant finds that the existing insurance industry data integration method has the following defects:
(1) Joining multiple large data tables is costly. In an integrated model, different fields often come from different systems or different tables. When a row of the model needs to be updated, the changed data must first be used as the condition to join the other tables, and the whole row is spliced together before it is updated, so the join cost is high.
(2) Detail data is easily lost. A model created in a traditional relational database is stored as a two-dimensional table with fixed fields, so it is difficult to keep all details. When new fields need to be added, the model must be modified, which in turn requires changes to the processing programs and a complete reprocessing of the data, at very high cost.
In recent years, more and more enterprises have introduced big data technologies to handle problems such as data integration and real-time analysis that are difficult for traditional software, and new components and technologies keep emerging. Among them, the open source framework Hadoop is one of the mainstream, continuously developing big data solutions, and a large number of components have been derived from it to meet the varied business requirements of enterprises.
The distributed key-value open source database HBase is an important member of the Hadoop ecosystem. It runs on the distributed file system HDFS and inherits the high availability and scalability of HDFS, while also offering many characteristics common to key-value databases, such as: an insert is also an update; only some of the columns of a row may be updated; a table can accommodate billions of rows and millions of columns; and highly concurrent queries return within milliseconds. By making full use of these characteristics, a complete solution for loading, storing, processing and querying can be formed, providing a new design choice for building large enterprise data platforms in the insurance industry.
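The "insert is also an update" and partial-column-update behaviour mentioned above can be illustrated with a minimal Python sketch. The sketch uses an in-memory dictionary as a stand-in for a table and is not an HBase client; the names `table` and `put` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# In-memory stand-in for a table: rowkey -> {column name -> value}.
# This only mimics the write semantics described above; it is not an HBase client.
table = defaultdict(dict)

def put(rowkey, columns):
    """Write the given columns; existing columns are overwritten, others are untouched."""
    table[rowkey].update(columns)

put("row1", {"a": "1", "b": "2"})   # the first write behaves like an insert
put("row1", {"b": "20"})            # the second write updates only column b

print(dict(table["row1"]))          # {'a': '1', 'b': '20'}
```

Because a write to an existing rowkey simply overwrites the named columns, no separate delete-then-rewrite step is needed in this model.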
In view of this, one or more embodiments of the present disclosure provide an insurance industry data integration method based on HBase, and specifically, a policy number is first subjected to reverse order or hash processing, and the policy number subjected to the reverse order or hash processing is used as a row key rowkey of an HBase data integration model table of an open source database, so that a rowkey for uniformly hashing data can be obtained. And then setting the column name of each column in the HBase data integration model table as the table name of a data source table to be integrated and the primary key value of each row of the data source table. And finally, splicing each row of data of the data source table into JSON character strings respectively, and storing the JSON character strings into corresponding fields of the HBase data integration model table to finish data integration.
Therefore, the insurance industry data integration method based on the HBase and the related equipment in one or more embodiments of the specification form a data integration scheme by using the technical advantages of the HBase and combining the characteristics of insurance industry data, and solve the problems of difficult data updating, high performance consumption in the data integration process and high field expansion cost of the data integration scheme.
The technical solutions of one or more embodiments of the present specification are described in detail below with reference to specific embodiments.
Referring to fig. 1, an embodiment of the present disclosure of an HBase-based insurance industry data integration method includes the following steps:
Step S101, determining a row key of an open source database HBase data integration model table according to a policy number in at least one data source table to be integrated.
In this step, the HBase data integration model is a model in which all data relating to a business entity is recorded in one row. When data from each data source table is imported into the HBase data integration model, a rowkey field must be specified. For an insurance company, most applications and data are centered on the policy, so the policy number is used as the rowkey of the HBase data integration model table. Ideally, the data of the HBase data integration model table is uniformly distributed across multiple regions. Each region has boundaries, a startrow and an endrow (when the table has only one region, that region has no startrow or endrow). Because data is stored in different regions according to rowkey, and rowkeys are ordered by ASCII code, a rowkey that spreads the data evenly is needed if the data is to be uniformly distributed. The leading digits of a policy number typically encode the organization and the year, while the last n digits are usually an auto-increment sequence (serial number). The policy number therefore needs to be processed in reverse order so that the serial number comes first, and the reversed policy number is used as the rowkey of the HBase data integration model table.
The reverse-order processing of the policy number is specifically as follows: the character string corresponding to the policy number is written into the open source database HBase data integration model table in reverse order, so that the serial number in the policy number comes first.
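As a rough illustration of why reversing helps, the following Python sketch compares how sequential policy numbers fall into regions before and after reversal. The policy number layout (4-digit organization, 4-digit year, 6-digit serial), the split points and the helper `region_of` are assumptions made for this example only; they are not taken from the specification or from HBase.

```python
from bisect import bisect_right
from collections import Counter

def reverse_rowkey(policy_no: str) -> str:
    """Return the policy number with its characters in reverse order."""
    return policy_no[::-1]

def region_of(rowkey: str, split_points: list) -> int:
    """Index of the region whose key range contains rowkey (keys compared as strings)."""
    return bisect_right(split_points, rowkey)

# Hypothetical layout: 4-digit organization + 4-digit year + 6-digit serial.
policy_numbers = [f"01012020{serial:06d}" for serial in range(1, 10)]
split_points = ["4", "7"]  # assumed boundaries giving three regions

plain = Counter(region_of(p, split_points) for p in policy_numbers)
flipped = Counter(region_of(reverse_rowkey(p), split_points) for p in policy_numbers)

print("original keys:", dict(plain))    # {0: 9}
print("reversed keys:", dict(flipped))  # {0: 3, 1: 3, 2: 3}
```

With the original keys, every policy number shares the same organization and year prefix and lands in one region; after reversal the changing serial digits lead the key, so the same nine policies spread across all three regions.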
In this embodiment, in addition to performing reverse order processing on the policy number to obtain a rowkey capable of uniformly hashing the data, hash processing may be performed on the policy number, and the policy number after the hash processing is used as the rowkey of the HBase data integration model table.
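The hash variant can be sketched in the same way. The specification does not name a hash function, so the use of MD5 and of the hexadecimal digest as the whole rowkey are assumptions of this sketch, as is the function name `hashed_rowkey`.

```python
import hashlib

def hashed_rowkey(policy_no: str) -> str:
    """Return the MD5 hex digest of the policy number (MD5 is an assumed choice)."""
    return hashlib.md5(policy_no.encode("utf-8")).hexdigest()

# Hypothetical, consecutive policy numbers.
for policy_no in ("01012020000001", "01012020000002"):
    print(policy_no, "->", hashed_rowkey(policy_no))
```

One difference from the reverse-order approach is worth noting: a reversed key can be turned back into the policy number, whereas a digest cannot, so with a hash-only rowkey the policy number has to be recovered from the stored data itself.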
Step S102, setting the column name of each column of the HBase data integration model table according to the table name of the at least one data source table.
In this step, specifically, the column name of each column in the HBase data integration model table is set to the table name of the at least one data source table plus the primary key value of each row of that data source table, so that every column in the HBase data integration model table can be distinguished. For example, in the insurance industry, the multiple charges of a policy are recorded as multiple rows of one table in a normalized source system, whereas in the HBase data integration model each of those rows is stored in a separate column named "table name of the data source table + primary key value of the row". Specifically, fig. 2 is a schematic diagram of the HBase-based charging table data integration method in an embodiment of this specification. In the figure, the table name of the data source table is "charging" and the table has three rows of data whose primary key values are 01, 02 and 03. Storing these rows in the HBase data integration model requires three fields, so three column names are set in the HBase data integration model table: charging01, charging02 and charging03.
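The column naming rule of this step can be sketched in Python as follows. The function name `column_name` and the choice to concatenate without a separator are assumptions of this sketch; the table name and primary key values follow the example of fig. 2.

```python
def column_name(table_name: str, primary_key: str) -> str:
    """Column name = source table name followed by the row's primary key value."""
    return f"{table_name}{primary_key}"

# The three rows of the "charging" table in the example of fig. 2.
primary_keys = ["01", "02", "03"]
columns = [column_name("charging", pk) for pk in primary_keys]

print(columns)  # ['charging01', 'charging02', 'charging03']
```

Since primary key values are unique within a source table and the table name is part of the name, rows from different source tables cannot collide on the same column of a policy's row.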
Step S103, splicing fields of each line of data of the at least one data source table into a character string respectively and storing the character string into a corresponding field in the HBase data integration model table; the row key of the corresponding field is a row key determined by the policy number corresponding to the character string; and the column name of the corresponding field is the column name set by the table name of the data source table where the character string is located.
In this step, the character string is in JSON format, and the JSON character string is organized as key-value pairs. JSON is a widely used format that can preserve every field of a data source table, and many high-level languages provide libraries for processing it.
For example, suppose data source table A has two fields a and b, and one row of table A has the values v_a and v_b respectively. The JSON character string spliced in key-value format is then {a: v_a, b: v_b}, that is, the data is stored in the corresponding field of the HBase data integration model table in the form "column name: column value". There are two reasons for this design. First, if the fields of the data source table were mapped one-to-one to columns, that is, if all the fields were scattered into separate columns, the number of columns would grow dozens of times faster, causing performance problems. Second, storing the data as JSON character strings preserves the row-column relationship of the data source table (multiple columns belonging to the same row), which is difficult to handle if the data is stored in scattered form. Specifically, referring to fig. 2, the data source table has three rows of data. To integrate the data source table into the HBase data integration model table, each row of data is spliced into a corresponding JSON character string, namely {primary key: 01, policy number: 001, type: a, amount: 100}, {primary key: 02, policy number: 001, type: b, amount: 100}, and {primary key: 03, policy number: 001, type: c, amount: 100}, and each string is stored in the corresponding column of the HBase data integration model table, so that the three rows of the data source table become three columns.
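The splicing of one source row into a JSON character string can be sketched with Python's standard `json` module. The field names `primary_key`, `policy_number`, `type` and `amount` are English stand-ins for the fields of fig. 2 and are assumptions of this sketch, as is the function name `row_to_json`.

```python
import json

def row_to_json(row: dict) -> str:
    """Serialize one source-table row as a JSON object of "column name: column value" pairs."""
    return json.dumps(row, ensure_ascii=False)

# First row of the charging table in the example of fig. 2 (field names assumed).
row = {"primary_key": "01", "policy_number": "001", "type": "a", "amount": 100}
spliced = row_to_json(row)

print(spliced)  # {"primary_key": "01", "policy_number": "001", "type": "a", "amount": 100}
```

Note that valid JSON quotes the keys and string values, unlike the shorthand notation used in the text. Parsing the string with `json.loads` returns the original fields, which is how the row-column relationship of the source table is retained.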
Fig. 3 is a schematic diagram of an insurance industry data integration method based on HBase according to an embodiment of the present disclosure. In the insurance industry there are data source tables for multiple topics, for example the policy table, the applicant table, the claim settlement table, the security table, the charging table and the like. These data source tables are all built around the policy number, and each of them stores the policy number. Each row of each data source table is spliced into a JSON character string, and the JSON character strings are then stored in the respective fields of the HBase data integration model table of the open source database, so that all data related to a given policy number is placed in a single row of the HBase data integration model table.
After the data source tables are integrated according to the HBase data integration model, a policy detail table is formed: all data of a policy is stored in one wide table, and the data is aggregated by source table and by row.
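Steps S101 to S103 can be combined into one minimal end-to-end Python sketch. The wide table is simulated with an in-memory dictionary rather than a real HBase table, and the function `integrate`, the field names and the sample `claim` table are hypothetical names introduced only for this illustration.

```python
import json
from collections import defaultdict

def integrate(wide_table, table_name, rows,
              pk_field="primary_key", policy_field="policy_number"):
    """Write each source row as one JSON-valued column of its policy's wide row."""
    for row in rows:
        rowkey = str(row[policy_field])[::-1]      # reverse-order rowkey (step S101)
        column = f"{table_name}{row[pk_field]}"    # table name + primary key (step S102)
        wide_table[rowkey][column] = json.dumps(row, ensure_ascii=False)  # step S103

wide_table = defaultdict(dict)  # rowkey -> {column name -> JSON string}

charging = [
    {"primary_key": "01", "policy_number": "001", "type": "a", "amount": 100},
    {"primary_key": "02", "policy_number": "001", "type": "b", "amount": 100},
    {"primary_key": "03", "policy_number": "001", "type": "c", "amount": 100},
]
claim = [{"primary_key": "01", "policy_number": "001", "status": "open"}]

integrate(wide_table, "charging", charging)
integrate(wide_table, "claim", claim)

# Updating one source row rewrites only its own column; no join is needed.
charging[1]["amount"] = 150
integrate(wide_table, "charging", [charging[1]])

for column, value in sorted(wide_table["100"].items()):
    print(column, value)
```

In this sketch all four source rows of policy 001 end up in the single row keyed "100", and the update touches only the column charging02, which mirrors the update behaviour the method relies on.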
As can be seen, in the insurance industry data integration method based on HBase provided in the embodiments of the present specification, the policy number after reverse order or hash processing is used as the row key rowkey of the table of the HBase data integration model, so that data can be uniformly distributed; the characteristics of insurance industry data are combined, the technical advantages of HBase are utilized, and the problems that data updating is difficult, performance consumption is high in the data integration process, and the field expansion cost of a data integration scheme is high are solved; the policy number after the reverse order or hash processing is used as a row key, and all the data related to the policy number is stored in a row, so that the integrated data is very suitable for being used as a detail layer integrated in a data warehouse, the updating is convenient, the data detail can be kept, and the loss of the detail data is avoided.
It should be noted that the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene and completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the devices may perform only one or more steps of the method of one or more embodiments of the present disclosure, and the devices may interact with each other to complete the method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Based on the same inventive concept, corresponding to any embodiment method, one or more embodiments of the present specification further provide an insurance industry data integration device based on HBase. Referring to fig. 4, the HBase-based insurance industry data integration apparatus includes:
a determining module 401 configured to determine a row key of an open source database HBase data integration model table according to a policy number in at least one data source table to be integrated;
a setting module 402 configured to set a column name of each column of the HBase data integration model table according to a table name of the at least one data source table;
a splicing and storing module 403 configured to splice fields of each row of data of the at least one data source table into a character string; and
storing the character string into a corresponding field in the HBase data integration model table; the row key of the corresponding field is a row key determined by the policy number corresponding to the character string; and the column name of the corresponding field is the column name set by the table name of the data source table where the character string is located.
As an optional embodiment, the determining module 401 is specifically configured to perform reverse order processing on the policy number; and taking the policy number processed in the reverse order as a row key of the HBase data integration model table.
As an optional embodiment, the reverse order processing of the policy number specifically comprises writing the character string corresponding to the policy number into the HBase data integration model table in reverse order, so that the serial number in the policy number comes first.
As an optional embodiment, the determining module 401 is specifically configured to perform hash processing on the policy number; and taking the policy number after the hash processing as a row key of the HBase data integration model table.
As an alternative embodiment, the setting module 402 is specifically configured to set the column name of each column in the HBase data integration model table as the table name of the at least one data source table plus the primary key value of each row of the data source table.
As an optional embodiment, the splicing the fields of each row of data of the at least one data source table into a string is specifically configured to splice each row of data of the data source table into a JSON string.
As an alternative embodiment, the HBase data integration model table is a wide table.
For convenience of description, the above apparatus is described as being divided into various modules by function, and the modules are described separately. Of course, when implementing one or more embodiments of the present description, the functionality of the modules may be implemented in the same one or more pieces of software and/or hardware.
The apparatus in the foregoing embodiment is used to implement the corresponding method for integrating insurance industry data based on HBase in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Based on the same inventive concept, corresponding to any of the above embodiments, one or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the HBase-based insurance industry data integration method according to any of the above embodiments is implemented.
Fig. 5 is a schematic diagram illustrating a more specific hardware structure of an electronic device according to this embodiment, where the electronic device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The input/output module may be configured as a component within the device (not shown) or may be external to the device to provide the corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The electronic device in the above embodiment is used to implement the corresponding HBase-based insurance industry data integration method in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Based on the same inventive concept, corresponding to any of the above-described embodiment methods, one or more embodiments of the present specification further provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the HBase-based insurance industry data integration method according to any of the above-described embodiments.
Computer-readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
The computer instructions stored in the storage medium of the above embodiment are used to enable the computer to execute the HBase-based insurance industry data integration method according to any of the above embodiments, and have the beneficial effects of corresponding method embodiments, which are not described herein again.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is meant to be exemplary only, and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples; within the spirit of the present disclosure, features from the above embodiments or from different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of different aspects of one or more embodiments of the present description as described above, which are not provided in detail for the sake of brevity.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the discussed embodiments.
It is intended that the one or more embodiments of the present specification embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of one or more embodiments of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (10)

1. An insurance industry data integration method based on HBase comprises the following steps:
determining a row key of an open source database HBase data integration model table according to a policy number in at least one data source table to be integrated;
setting the column name of each column of the HBase data integration model table according to the table name of the at least one data source table;
splicing fields of each row of data of the at least one data source table into a character string respectively; and
storing the character string into a corresponding field in the HBase data integration model table; the row key of the corresponding field is a row key determined by the policy number corresponding to the character string; and the column name of the corresponding field is the column name set by the table name of the data source table where the character string is located.
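As an illustration only, the four steps of claim 1 can be sketched in Python with a plain dictionary standing in for the HBase data integration model table. The function name `integrate`, the field name `policy_no`, and the sample table names are hypothetical and do not come from the specification; a real deployment would write to HBase through a client library instead of a dictionary.

```python
import json
from collections import defaultdict


def integrate(source_tables, policy_field="policy_no"):
    """Fold rows from several source tables into one wide table.

    The returned dict maps row key -> {column name: string}, which
    mimics (but is not) an HBase table. All names are assumptions.
    """
    model_table = defaultdict(dict)
    for table_name, rows in source_tables.items():
        for row in rows:
            # Step 1: the row key is derived from the policy number.
            row_key = str(row[policy_field])
            # Step 2: the column name is derived from the source table name.
            column = table_name
            # Step 3: all fields of the row are spliced into one string.
            value = json.dumps(row, ensure_ascii=False, sort_keys=True)
            # Step 4: the string is stored at (row key, column name).
            model_table[row_key][column] = value
    return dict(model_table)


sources = {
    "policy": [{"policy_no": "P001", "holder": "Zhang"}],
    "claim": [{"policy_no": "P001", "amount": 1200}],
}
wide = integrate(sources)
```

In this sketch every record that shares a policy number ends up in the same row, one column per source table. A table holding several rows for the same policy would overwrite its own column here; claim 5 addresses that case by adding the primary key value to the column name.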
2. The method according to claim 1, wherein the determining the row key of the HBase data integration model table according to the policy number in the at least one data source table to be integrated comprises:
carrying out reverse order processing on the policy number; and
taking the reversed policy number as the row key of the HBase data integration model table.
3. The method of claim 2, wherein the carrying out reverse order processing on the policy number comprises: writing the character string corresponding to the policy number into the HBase data integration model table in reverse order, so that the serial number in the policy number comes first.
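The reversal in claims 2 and 3 can be sketched as follows. HBase stores rows sorted lexicographically by row key, so policy numbers that share a long common prefix tend to cluster together; writing the string backwards puts the fast-changing serial digits first. The function name and the sample policy number are hypothetical.

```python
def reversed_row_key(policy_number: str) -> str:
    """Return the policy number written back to front.

    Assumes the serial part sits at the end of the policy number,
    so that it leads the row key after reversal.
    """
    return policy_number[::-1]


# Hypothetical policy number: a date-like prefix followed by a serial.
key = reversed_row_key("2020110000123")  # "3210000110202"
```

Reversing is lossless: applying the same function to the row key recovers the original policy number, which keeps point lookups by policy number simple.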
4. The method according to claim 1, wherein the determining the row key of the HBase data integration model table according to the policy number in the at least one data source table to be integrated comprises:
performing hash processing on the policy number; and
taking the hashed policy number as the row key of the HBase data integration model table.
5. The method according to claim 1, wherein the setting of the column name of each column of the HBase data integration model table according to the table name of the at least one data source table comprises: setting the column name of each column in the HBase data integration model table as the table name of the at least one data source table plus the primary key value of each row of the data source table.
6. The method of claim 1, wherein the splicing fields of each row of data of the at least one data source table into a character string comprises: splicing each row of data of the data source table into a JSON character string.
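A minimal sketch of the hash variant in claim 4 is given below. The claim does not name a hash function; MD5 from the Python standard library is used here purely as an assumption, and the function name is hypothetical.

```python
import hashlib


def hashed_row_key(policy_number: str) -> str:
    """Return a hex digest of the policy number for use as the row key.

    MD5 is an illustrative choice, not one named in the claims; it is
    used for key distribution only, not for security.
    """
    return hashlib.md5(policy_number.encode("utf-8")).hexdigest()


row_key = hashed_row_key("2020110000123")
```

Unlike reversal, a hash cannot be inverted, so a reader must recompute the digest from the policy number to locate a row. In exchange, the keys are spread evenly regardless of how the policy numbers are structured.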
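The column naming in claim 5 can be illustrated with a small helper. The underscore separator, the function name, and the sample values are assumptions; the claim only states that the column name is the table name plus the primary key value of the row.

```python
def column_name(table_name: str, primary_key_value, sep: str = "_") -> str:
    """Build a column name from the source table name and the row's primary key.

    The separator is a hypothetical choice made for readability.
    """
    return f"{table_name}{sep}{primary_key_value}"


# Two claim records of the same policy get two distinct columns.
first = column_name("claim", 1001)   # "claim_1001"
second = column_name("claim", 1002)  # "claim_1002"
```

Because the primary key is part of the column name, several rows of one source table that belong to the same policy can coexist in a single HBase row without overwriting each other.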
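One possible way to splice a source row into a JSON character string, as in claim 6, is shown below using only the Python standard library. The function name, the field names, and the decision to render dates with `str` are assumptions made for the example.

```python
import json
from datetime import date


def row_to_json(row: dict) -> str:
    """Serialize one source-table row into a JSON string.

    ensure_ascii=False keeps Chinese text readable in the stored value;
    default=str converts values such as dates that JSON cannot encode.
    """
    return json.dumps(row, ensure_ascii=False, sort_keys=True, default=str)


cell = row_to_json({"policy_no": "P001", "start": date(2020, 11, 20)})
```

Storing the whole row as one JSON string keeps the field names next to the values, so a reader can parse a cell back into its fields without consulting the schema of the source table.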
7. The method according to claim 1, wherein the HBase data integration model table is a wide table.
8. An insurance industry data integration device based on HBase, comprising:
a determining module configured to determine a row key of an open source database HBase data integration model table according to the policy number in at least one data source table to be integrated;
a setting module configured to set a column name of each column of the HBase data integration model table according to the table name of the at least one data source table; and
a splicing and storing module configured to splice the fields of each row of data of the at least one data source table into a character string, and
to store the character string into a corresponding field in the HBase data integration model table; the row key of the corresponding field is a row key determined by the policy number corresponding to the character string; and the column name of the corresponding field is the column name set by the table name of the data source table where the character string is located.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any one of claims 1 to 7 when executing the program.
10. A non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores computer instructions for causing a computer to perform the method of any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011312448.9A CN112288586A (en) 2020-11-20 2020-11-20 Insurance industry data integration method based on HBase and related equipment

Publications (1)

Publication Number Publication Date
CN112288586A true CN112288586A (en) 2021-01-29

Family

ID=74399549

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011312448.9A Pending CN112288586A (en) 2020-11-20 2020-11-20 Insurance industry data integration method based on HBase and related equipment

Country Status (1)

Country Link
CN (1) CN112288586A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109840852A (en) * 2019-01-09 2019-06-04 中国平安人寿保险股份有限公司 Method for automatic processing of insurance policies and related product
CN110427473A (en) * 2019-08-02 2019-11-08 泰康保险集团股份有限公司 Data processing method, device, equipment and storage medium
CN110442642A (en) * 2019-06-19 2019-11-12 北京航天智造科技发展有限公司 Data processing method, device and storage medium for a distributed database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chen Liqiang (陈利强): "Big Data Modeling Methods and Practice" (大数据建模方法与实践), 金融电子化, pages 56-58 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116842031A (en) * 2023-09-01 2023-10-03 北京车与车科技有限公司 Data updating method, device and storage medium
CN116842031B (en) * 2023-09-01 2023-12-05 北京车与车科技有限公司 Data updating method, device and storage medium

Similar Documents

Publication Publication Date Title
US10255108B2 (en) Parallel execution of blockchain transactions
US10169433B2 (en) Systems and methods for an SQL-driven distributed operating system
US20190065542A1 (en) Parallel processing of disjoint change streams into a single stream
US11681651B1 (en) Lineage data for data records
CN102999537A (en) System and method for data migration
US10977011B2 (en) Structured development for web application frameworks
CN110008018A (en) A kind of batch tasks processing method, device and equipment
Verma et al. Big Data representation for grade analysis through Hadoop framework
Qureshi et al. Towards efficient big data and data analytics: a review
CN111966760B (en) Test data generation method and device based on Hive data warehouse
CN108062384A (en) The method and apparatus of data retrieval
CN110019111A (en) Data processing method, device, storage medium and processor
CN111382155A (en) Data processing method of data warehouse, electronic equipment and medium
CN103150145A (en) Parallel processing of semantically grouped data in data warehouse environments
CN112860412B (en) Service data processing method and device, electronic equipment and storage medium
CN112288586A (en) Insurance industry data integration method based on HBase and related equipment
CN115329011A (en) Data model construction method, data query method, data model construction device and data query device, and storage medium
CN109582476B (en) Data processing method, device and system
CN112559603B (en) Feature extraction method, device, equipment and computer-readable storage medium
CN112463785A (en) Data quality monitoring method and device, electronic equipment and storage medium
CN112445810A (en) Data updating method and device for data warehouse, electronic device and storage medium
CN112445759B (en) Method and device for copying data across clusters of distributed database and electronic equipment
US11609834B2 (en) Event based aggregation for distributed scale-out storage systems
CN109299125A (en) Database update method and device
US11954531B2 (en) Use of relational databases in ephemeral computing nodes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination