CN110647577A

CN110647577A - Data cube partitioning method and device, computer equipment and storage medium

Info

Publication number: CN110647577A
Application number: CN201910711717.XA
Authority: CN
Inventors: 颜许晓; 鲜丹
Original assignee: Weifutong Technology Co Ltd
Current assignee: Weifutong Technology Co Ltd
Priority date: 2019-08-02
Filing date: 2019-08-02
Publication date: 2020-01-03

Abstract

The embodiments of the application belong to the technical field of information security and relate to a partitioning method for a data cube, comprising the following steps: receiving a partition request carrying database identification information; acquiring, in a service database, original service data corresponding to the database identification information, wherein the original service data comprises time information and channel information; preprocessing the time information and the channel information; constructing a data cube corresponding to the original service data; partitioning the data cube according to the channel information to obtain a plurality of sub data cubes; and storing the plurality of sub data cubes in different storage media respectively. The application also provides a partitioning device for a data cube, computer equipment and a storage medium. Because the data cube is stored in partitions, the sub data cubes relevant to a query can be processed in parallel, which improves data query efficiency.

Description

Data cube partitioning method and device, computer equipment and storage medium

Technical Field

The present application relates to the field of information security technologies, and in particular, to a data cube partitioning method and apparatus, a computer device, and a storage medium.

Background

With the development and application of database technology, the amount of data stored in databases has grown from megabytes (MB) and gigabytes (GB) in the 1980s to terabytes (TB) and petabytes (PB) today. At the same time, user query requirements have become increasingly complex: a query no longer just reads or manipulates one or a few records in a relational table, but may involve data analysis and information synthesis over tens of millions of records across multiple tables.

In the existing data management method, a fact table and dimension tables are constructed, and the dimension tables are linked to the fact table through primary keys and foreign keys to form a relational database organized as a star schema, thereby realizing data management.
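For readers unfamiliar with the star schema described above, the following is a minimal sketch using Python's built-in `sqlite3` module. The table names (`dim_channel`, `fact_transaction`), columns and sample rows are hypothetical and are not taken from the application; they only illustrate a fact table linked to a dimension table through a primary key and a foreign key.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing one dimension table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_channel (
        channel_id   INTEGER PRIMARY KEY,
        channel_name TEXT NOT NULL
    );
    CREATE TABLE fact_transaction (
        txn_id     INTEGER PRIMARY KEY,
        channel_id INTEGER NOT NULL REFERENCES dim_channel(channel_id),
        txn_date   TEXT NOT NULL,
        amount     REAL NOT NULL
    );
    INSERT INTO dim_channel VALUES (1, 'Channel A'), (2, 'Channel B');
    INSERT INTO fact_transaction VALUES
        (1, 1, '2019-08-01', 10.0),
        (2, 1, '2019-08-02', 5.0),
        (3, 2, '2019-08-01', 7.5);
""")

# Join the fact table to its dimension through the foreign key and aggregate.
totals = conn.execute("""
    SELECT d.channel_name, SUM(f.amount)
    FROM fact_transaction AS f
    JOIN dim_channel AS d ON d.channel_id = f.channel_id
    GROUP BY d.channel_name
    ORDER BY d.channel_name
""").fetchall()
```

Every analytical query in this layout has to join the fact table against its dimensions, which is the cost the background section attributes to the conventional approach as tables multiply.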

However, the conventional data management method is generally not intelligent. As the amount of data stored in the database grows, the number of data tables generated by the method also grows, so that the storage space it occupies increases and the processing efficiency of the database is greatly reduced.

Disclosure of Invention

An embodiment of the application aims to provide a data cube partitioning method, a data cube partitioning device, computer equipment and a storage medium, and aims to solve the problems that an existing data management method occupies a large amount of storage space and is low in processing efficiency.

In order to solve the above technical problem, an embodiment of the present invention provides a partitioning method for a data cube, which adopts the following technical solutions:

receiving a partition request carrying database identification information;

acquiring original service data corresponding to the database identification information in a service database, wherein the original service data comprises time information and channel information;

preprocessing the time information and the channel information;

constructing a data cube corresponding to the original service data;

partitioning the data cube according to the channel information to obtain a plurality of sub data cubes;

and respectively storing the plurality of sub data cubes into different storage media.

In order to solve the above technical problem, an embodiment of the present invention further provides a partitioning device for a data cube, which adopts the following technical solutions:

the request receiving module is used for receiving a partition request carrying database identification information;

the data acquisition module is used for acquiring original service data corresponding to the database identification information in a service database, wherein the original service data comprises time information and channel information;

the preprocessing module is used for preprocessing the time information and the channel information;

the cube construction module is used for constructing a data cube corresponding to the original service data;

the partition processing module is used for partitioning the data cube according to the channel information to obtain a plurality of sub data cubes; and

and the partition storage module is used for respectively storing the plurality of sub data cubes into different storage media.

In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which adopts the following technical solutions:

comprising a memory and a processor;

the memory has stored therein a computer program which when executed by the processor implements the steps of the method for partitioning a data cube as described above.

In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, which adopts the following technical solutions:

the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the partitioning method for a data cube as described above.

Compared with the prior art, the embodiment of the application mainly has the following beneficial effects:

the invention provides a partitioning method for a data cube, which comprises: receiving a partition request carrying database identification information; acquiring, in a service database, original service data corresponding to the database identification information, wherein the original service data comprises time information and channel information; preprocessing the time information and the channel information; constructing a data cube corresponding to the original service data; partitioning the data cube according to the channel information to obtain a plurality of sub data cubes; and storing the plurality of sub data cubes in different storage media respectively. Because the data in the original database is partitioned and diced according to the channel information and stored in different storage media, the sub data cubes relevant to a user's query can be processed in parallel, which greatly improves the efficiency of querying data.

Drawings

In order to more clearly illustrate the solution of the present application, the drawings needed for describing the embodiments of the present application will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.

FIG. 1 is a flowchart illustrating an implementation of a partitioning method for a data cube according to an embodiment of the present invention;

FIG. 2 is a diagram illustrating an example of a data cube provided by an embodiment of the invention;

FIG. 3 is an exemplary diagram of a partitioned data cube according to an embodiment of the present invention;

FIG. 4 is a flowchart of the implementation of step S104 in FIG. 1;

FIG. 5 is a schematic structural diagram of a partitioning apparatus for a data cube according to a second embodiment of the present invention;

FIG. 6 is a schematic block diagram of the cube construction module 104 of FIG. 5;

FIG. 7 is a schematic block diagram of one embodiment of a computer device according to the present application.

Detailed Description

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.

According to the partitioning method for the data cube provided by the embodiment of the invention, the data in the original database is partitioned and diced according to time and a service body and stored in different storage media, so that a plurality of diced sub data cubes to be queried can work in parallel when a user queries data information, and the efficiency of querying data is greatly improved.

Example one

Fig. 1 is a flowchart illustrating an implementation of a partitioning method for a data cube according to an embodiment of the present invention, and for convenience of description, only the portions related to the present invention are shown.

In step S101, a partition request carrying database identification information is received.

In the embodiment of the present invention, the database identification information is used to uniquely identify a database storing business data. It may be an organization name, such as the Industrial and Commercial Bank database (gsyhsjk) or the Agricultural Bank database (nyyhsjk); it may be number information, such as 0001 or 0002; or it may combine the organization name and the number, such as Industrial and Commercial Bank database 0001 or Agricultural Bank database 0002. It should be understood that these examples of database identification information are for ease of understanding only and are not intended to limit the present invention.
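As an illustration of step S101, a partition request could be modelled as a small value object whose identifier is resolved against a registry of known service databases. This is only a sketch: the `PartitionRequest` class, the `DATABASE_REGISTRY` mapping and the target names in it are hypothetical and do not appear in the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionRequest:
    """Hypothetical request object carrying the database identification."""
    database_id: str


# Hypothetical registry: identifier -> service database target.
# Identifiers may be names, numbers, or a combination of both.
DATABASE_REGISTRY = {
    "gsyhsjk": "service-db-1",
    "0002": "service-db-2",
    "gsyhsjk-0001": "service-db-1",
}


def resolve_database(request: PartitionRequest) -> str:
    """Return the service database that the request identifies."""
    try:
        return DATABASE_REGISTRY[request.database_id]
    except KeyError:
        raise LookupError(f"unknown database id: {request.database_id!r}") from None
```

Rejecting unknown identifiers early keeps the later steps from running against the wrong database.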

In the embodiment of the present invention, the database for storing the business data may be a relational database, or may be a Hadoop-based data warehouse.

In step S102, original service data corresponding to the database identification information is obtained in a service database, where the original service data includes time information and channel information.

In the embodiment of the present invention, the original service data refers to service data of a large data table stored in the database.

In the embodiment of the present invention, the time information is used to record the execution time of the service data.

In the embodiment of the present invention, the channel information is used for recording the entity providing the business service. The channel information may be organization name information, such as a general issuing bank, a contracting bank, a Fujian agriculture letter, and the like.

In the embodiment of the present invention, the original service data may further include merchant information, region information, industry information, transaction amount information, and the like.

In step S103, the time information and the channel information are preprocessed.

In the embodiment of the present invention, the preprocessing operation performs a preliminary screening of the time information and the channel information. It includes decryption, decoding and/or data deduplication of the time information and the channel information. In deduplication, for example, texts with similar address contents that actually point to the same address are merged, as are texts with similar business names that point to the same business.
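The decoding and deduplication part of step S103 might look like the sketch below. It assumes, purely for illustration, that channel names arrive Base64-encoded and that two names are "the same" when they differ only in letter case or whitespace; the application does not specify the encoding or the similarity rule, and the function names are hypothetical.

```python
import base64


def normalize_channel(name: str) -> str:
    """Collapse whitespace and case so near-identical names compare equal."""
    return " ".join(name.split()).casefold()


def preprocess(records):
    """Decode each (time, encoded channel) pair and drop duplicates.

    `records` is an iterable of (time_text, channel_b64) tuples; the
    Base64 encoding is an assumption made for this example.
    """
    seen = set()
    cleaned = []
    for time_text, channel_b64 in records:
        channel = base64.b64decode(channel_b64).decode("utf-8")
        key = (time_text.strip(), normalize_channel(channel))
        if key not in seen:  # keep only the first occurrence
            seen.add(key)
            cleaned.append(key)
    return cleaned
```

A production system would likely need a stronger similarity rule (for example a lookup table of known aliases) than simple case and whitespace folding.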

In step S104, a data cube corresponding to the original service data is constructed.

In the embodiment of the present invention, the data cube is used for performing dynamic multidimensional analysis on data stored in the relational database, and may be implemented as relational online analytical processing (Relational OLAP) on top of the relational database. As shown in fig. 2, the data cube is constructed with the time information and the channel information as its main axes.
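A very small in-memory stand-in for such a cube is a mapping from (time, channel) coordinates to an aggregated measure. The sketch below assumes the measure is a transaction amount, which the application lists as one possible field; the function name and row layout are hypothetical.

```python
from collections import defaultdict


def build_cube(rows):
    """Aggregate (time, channel, amount) rows into cells keyed by both axes."""
    cube = defaultdict(float)
    for time_key, channel, amount in rows:
        cube[(time_key, channel)] += amount
    return dict(cube)


sample_rows = [
    ("2019-08-01", "channel_a", 10.0),
    ("2019-08-01", "channel_a", 2.5),
    ("2019-08-01", "channel_b", 7.5),
    ("2019-08-02", "channel_a", 5.0),
]
sample_cube = build_cube(sample_rows)
```

Each key is one cell of the two-axis cube; further dimensions such as region or industry would extend the key tuple.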

In step S105, the data cube is partitioned according to the channel information, and a plurality of sub data cubes are obtained.

In the embodiment of the present invention, the partition processing segments the file-system files corresponding to the original service data along the channel information dimension of the data cube, so as to obtain cube files with smaller data capacity. As shown in fig. 3, each cube is a sub data cube obtained by segmenting the original service data according to the channel information.
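Slicing along the channel dimension can be sketched as grouping cube cells by their channel coordinate. The example assumes the cube is a dictionary keyed by (time, channel), which is an illustrative representation rather than the file format used by the application.

```python
def partition_by_channel(cube):
    """Split a {(time, channel): value} cube into one sub cube per channel."""
    sub_cubes = {}
    for (time_key, channel), value in cube.items():
        sub_cubes.setdefault(channel, {})[time_key] = value
    return sub_cubes


cube = {
    ("2019-08-01", "channel_a"): 12.5,
    ("2019-08-01", "channel_b"): 7.5,
    ("2019-08-02", "channel_a"): 5.0,
}
sub_cubes = partition_by_channel(cube)
```

Every cell lands in exactly one sub cube, so the sub cubes together hold the same data as the original cube.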

In step S106, the plurality of sub data cubes are stored in different storage media, respectively.

In the embodiment of the invention, a partitioning method for a data cube is provided, which comprises: receiving a partition request carrying database identification information; acquiring, in a service database, original service data corresponding to the database identification information, wherein the original service data comprises time information and channel information; preprocessing the time information and the channel information; constructing a data cube corresponding to the original service data; partitioning the data cube according to the channel information to obtain a plurality of sub data cubes; and storing the plurality of sub data cubes in different storage media respectively. Because the data in the original database is partitioned and diced according to the channel information and stored in different storage media, the sub data cubes relevant to a user's query can be processed in parallel, which greatly improves the efficiency of querying data.
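The storage and parallel-query idea can be sketched as follows. Temporary directories stand in for the separate storage media, JSON files stand in for cube files, and a thread pool reads the sub cubes concurrently; all of these choices, and the function names, are assumptions made for the example rather than details from the application.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def store_sub_cubes(sub_cubes, media_roots):
    """Write each sub cube under a different root, assigned round-robin."""
    locations = {}
    for index, (channel, cells) in enumerate(sorted(sub_cubes.items())):
        root = Path(media_roots[index % len(media_roots)])
        path = root / f"{channel}.json"
        path.write_text(json.dumps(cells), encoding="utf-8")
        locations[channel] = path
    return locations


def total_for(path):
    """Query one stored sub cube: sum all of its cell values."""
    return sum(json.loads(Path(path).read_text(encoding="utf-8")).values())


def query_totals(locations):
    """Run the same query over every sub cube in parallel."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(total_for, locations.values())
        return dict(zip(locations.keys(), results))


# Two temporary directories play the role of two storage media.
with tempfile.TemporaryDirectory() as disk_a, tempfile.TemporaryDirectory() as disk_b:
    demo = {
        "channel_a": {"2019-08-01": 10.0, "2019-08-02": 5.0},
        "channel_b": {"2019-08-01": 7.5},
    }
    locations = store_sub_cubes(demo, [disk_a, disk_b])
    totals = query_totals(locations)
```

Whether parallel reads actually speed up a query depends on the media being independent devices; two directories on the same disk would share one I/O path.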

Fig. 4 shows a flowchart for implementing step S104 in fig. 1, and for convenience of explanation, only the part relevant to the present invention is shown.

In some optional implementation manners of the first embodiment of the present invention, the step S104 specifically includes: step S201, step S202, and step S203.

In step S201, the time information and the channel information are respectively used as the main axes of the initial cube.

In step S202, extended data corresponding to the time information and the channel information is acquired in the original service data.

In the embodiment of the present invention, since the original business data is stored in the relational database or in a Hadoop-based data warehouse, other fact tables corresponding to the time information and the channel information may be obtained according to the dimension tables in the relational database or the data warehouse.

In the embodiment of the present invention, the extension data refers to data of other fact tables corresponding to the time information and the channel information in the service database.

In the embodiment of the invention, as the data of the business database is continuously updated, the data cube can also dynamically update the filled data.

In step S203, the extended data is stored to the initial cube based on the correspondence between the time information or the channel information and the extended data, so as to obtain the data cube.

In the embodiment of the invention, the data content of the data cube is completed by storing the extended data of the time information or the channel information into the initial cube, so that the scattered associations among the various ER relationships or MAP data objects are converted into a unified multidimensional analysis data cube, and the association corresponding to the time information and the channel information is established.
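Steps S201 to S203 can be sketched as building empty cells on the two main axes and then attaching each piece of extended data to every cell whose time or channel coordinate matches it. The record layout (`axis`, `value`, `fields`) and the function name are hypothetical and chosen only to make the correspondence explicit.

```python
AXES = ("time", "channel")  # main axes of the initial cube (S201)


def build_data_cube(base_rows, extension_rows):
    """Create an initial cube and fill it with extended data.

    base_rows:      dicts with "time" and "channel" keys.
    extension_rows: dicts with "axis" ("time" or "channel"), the matching
                    "value" on that axis, and the "fields" to attach.
    """
    # S201: one empty cell per (time, channel) coordinate.
    cube = {(row["time"], row["channel"]): {} for row in base_rows}
    # S202/S203: store extended data in every cell it corresponds to.
    for ext in extension_rows:
        axis_index = AXES.index(ext["axis"])
        for coords, cell in cube.items():
            if coords[axis_index] == ext["value"]:
                cell.update(ext["fields"])
    return cube
```

For instance, an extension row keyed on the channel axis (such as a region attribute) is copied into all cells of that channel, while one keyed on the time axis is copied into all cells of that date.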

In some optional implementations of the first embodiment of the present invention, the original service data further includes fact data corresponding to the channel information, and step S105 specifically includes: segmenting the data cube into a plurality of sub data cubes based on the correspondence between the channel information and the fact data.

In the embodiment of the present invention, the fact data refers to fact table data corresponding to the channel information dimension, and includes numeric data (facts), for example, data generated by cash register transactions. It should be understood that this example of fact data is only for ease of understanding and is not intended to limit the present invention.

In some optional implementations of the first embodiment of the present invention, step S202 further includes: performing a data conversion operation on the extended data.

In the embodiment of the present invention, the data conversion operation is used to further standardize the format of the data information, and may be whitespace processing (deleting blank data), encoding processing (deleting garbled characters), character processing (whitespace character processing, special character processing, and conversion of string representation), representation conversion, and the like.

In the embodiment of the invention, the data format of the original service data is processed in advance, which prevents the subsequently generated data cube from containing erroneous or garbled data.
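A possible form of the data conversion operation is sketched below. It treats the Unicode replacement character (U+FFFD) and non-whitespace control characters as "garbled" and collapses runs of whitespace; this definition of garbled data, and the function names, are assumptions for illustration only.

```python
import unicodedata


def convert_value(text: str) -> str:
    """Remove garbled characters and normalise whitespace in one value."""
    kept = [
        ch for ch in text
        # Drop U+FFFD and control/format characters, but keep whitespace
        # so that it can be collapsed below.
        if ch != "\ufffd" and (ch.isspace() or unicodedata.category(ch)[0] != "C")
    ]
    return " ".join("".join(kept).split())


def convert_rows(values):
    """Convert every value and delete those that end up blank."""
    converted = (convert_value(value) for value in values)
    return [value for value in converted if value]
```

Running the conversion before the cube is filled means every cell sees the same cleaned representation of a value.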

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a Read-Only Memory (ROM), or may be a Random Access Memory (RAM).

It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least a portion of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.

Example two

With further reference to fig. 5, as an implementation of the method shown in fig. 1, the present application provides a partitioning apparatus for a data cube, where an embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 1, and the apparatus may be applied to various electronic devices.

As shown in fig. 5, a partitioning apparatus 100 for a data cube according to a second embodiment of the present invention includes: a request receiving module 101, a data acquisition module 102, a preprocessing module 103, a cube construction module 104, a partition processing module 105, and a partition storage module 106. Wherein:

a request receiving module 101, configured to receive a partition request carrying database identification information;

a data acquisition module 102, configured to obtain, in a service database, original service data corresponding to the database identification information, where the original service data includes time information and channel information;

the preprocessing module 103 is used for preprocessing the time information and the channel information;

a cube construction module 104, configured to construct a data cube corresponding to the original business data;

the partition processing module 105 is configured to perform partition processing on the data cube according to the channel information to obtain a plurality of sub data cubes; and

and the partition storage module 106 is configured to store the plurality of sub data cubes into different storage media respectively.
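One way to picture how modules 101 to 106 cooperate is as a pipeline object that holds one callable per module. The `PartitionDevice` class and the trivial lambdas below are hypothetical stand-ins; they only show the order in which the modules hand data to each other.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PartitionDevice:
    """Hypothetical wiring of the modules described above."""
    acquire: Callable      # data acquisition module 102
    preprocess: Callable   # preprocessing module 103
    construct: Callable    # cube construction module 104
    partition: Callable    # partition processing module 105
    store: Callable        # partition storage module 106

    def handle(self, database_id):
        """Request receiving module 101: run the pipeline for one request."""
        raw = self.acquire(database_id)
        cube = self.construct(self.preprocess(raw))
        return self.store(self.partition(cube))


# Trivial stand-in implementations, for demonstration only.
device = PartitionDevice(
    acquire=lambda db_id: [("2019-08-01", " A ", 2.0), ("2019-08-01", "B", 3.0)],
    preprocess=lambda rows: [(t, c.strip(), v) for t, c, v in rows],
    construct=lambda rows: {(t, c): v for t, c, v in rows},
    partition=lambda cube: {c: {t: v} for (t, c), v in cube.items()},
    store=lambda subs: {c: f"medium-{i}" for i, c in enumerate(sorted(subs))},
)
placement = device.handle("0001")
```

Keeping each module behind its own callable lets any one of them be replaced, for example swapping the storage step, without touching the others.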

In the embodiment of the invention, a partitioning device for a data cube is provided, wherein the request receiving module is used for receiving a partition request carrying database identification information; the data acquisition module is used for acquiring, in a service database, original service data corresponding to the database identification information, wherein the original service data comprises time information and channel information; the preprocessing module is used for preprocessing the time information and the channel information; the cube construction module is used for constructing a data cube corresponding to the original service data; the partition processing module is used for partitioning the data cube according to the channel information to obtain a plurality of sub data cubes; and the partition storage module is used for storing the plurality of sub data cubes in different storage media respectively. Because the data in the original database is partitioned and diced according to the channel information and stored in different storage media, the sub data cubes relevant to a user's query can be processed in parallel, which greatly improves the efficiency of querying data.

With continued reference to fig. 6, a schematic structural diagram of a cube construction module provided in the second embodiment of the present invention is shown, and for convenience of explanation, only the parts related to the present invention are shown.

In some optional implementations of the second embodiment of the present invention, the cube construction module 104 includes: a main axis determination sub-module 1041, an extended data acquisition sub-module 1042, and a cube completion sub-module 1043. Wherein:

the main axis determination sub-module 1041 is used for respectively taking the time information and the channel information as main axes of an initial cube;

the extended data acquisition sub-module 1042 is configured to obtain, in the original service data, extended data corresponding to the time information and the channel information;

and the cube completion sub-module 1043 is used for storing the extended data into the initial cube based on the correspondence between the time information or the channel information and the extended data, to obtain the data cube.

In some optional implementations of the second embodiment of the present invention, the partition processing module 105 includes a partition processing submodule. Wherein:

the partition processing submodule is used for segmenting the data cube into a plurality of sub data cubes based on the correspondence between the channel information and the fact data.

In some optional implementations of the second embodiment of the present invention, the extended data acquisition sub-module 1042 further includes a data conversion submodule. Wherein:

the data conversion submodule is used for performing a data conversion operation on the extended data.

In summary, the embodiment of the present invention provides a partitioning method for a data cube, which comprises: receiving a partition request sent by a user, where the partition request at least carries database identification information; in response to the partition request, acquiring original service data in a service database corresponding to the database identification information, wherein the original service data at least carries time information and channel information; constructing a data cube corresponding to the original service data by taking the time information and the channel information as main axes; performing a partition storage operation on the data cube based on a preset partition rule; and outputting a partition complete signal to the user. Because the data in the original database is partitioned and diced according to time and service entity and stored in different storage media, the sub data cubes relevant to a user's query can be processed in parallel, which greatly improves the efficiency of querying data. Meanwhile, storing the extended data of the time information or the channel information in the initial cube completes the data content of the data cube, so that the scattered associations among the various ER relationships or MAP data objects are converted into a unified multidimensional analysis data cube and the association corresponding to the time information and the channel information is established. Processing the data format of the original service data in advance prevents the subsequently generated data cube from containing erroneous or garbled data.

In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 7, fig. 7 is a block diagram of a basic structure of a computer device according to the present embodiment.

The computer device 7 comprises a memory 71, a processor 72, and a network interface 73, which are communicatively connected to each other via a system bus. It is noted that only a computer device 7 having components 71-73 is shown, but it is to be understood that not all of the shown components are required, and that more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.

The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.

The memory 71 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 71 may be an internal storage unit of the computer device 7, such as a hard disk or an internal memory of the computer device 7. In other embodiments, the memory 71 may also be an external storage device of the computer device 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, and the like, provided on the computer device 7. Of course, the memory 71 may also comprise both an internal storage unit of the computer device 7 and an external storage device thereof. In this embodiment, the memory 71 is generally used for storing an operating system installed in the computer device 7 and various types of application software, such as the program code of the partitioning method for a data cube. Further, the memory 71 may also be used to temporarily store various types of data that have been output or are to be output.

The processor 72 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 72 is typically used to control the overall operation of the computer device 7. In this embodiment, the processor 72 is configured to execute the program code stored in the memory 71 or process data, for example, execute the program code of the partition method of the data cube.

The network interface 73 may comprise a wireless network interface or a wired network interface, and the network interface 73 is generally used for establishing a communication connection between the computer device 7 and other electronic devices.

The present application further provides another embodiment, which provides a computer-readable storage medium storing a partitioning program for a data cube, the partitioning program being executable by at least one processor to cause the at least one processor to perform the steps of the partitioning method for a data cube as described above.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.

It is to be understood that the above-described embodiments are merely some, and not all, of the embodiments of the present application, and that the appended drawings illustrate preferred embodiments of the application without limiting its scope. This application can be embodied in many different forms; the embodiments are provided so that the disclosure of the application will be thoroughly understood. Although the present application has been described in detail with reference to the foregoing embodiments, it will be apparent to one skilled in the art that the technical solutions described in the foregoing embodiments may still be modified, or equivalents may be substituted for some of their features. All equivalent structures made by using the contents of the specification and the drawings of the present application, whether applied directly or indirectly in other related technical fields, are within the protection scope of the present application.

Claims

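1. A method of partitioning a data cube, comprising:

receiving a partition request carrying database identification information;

acquiring original service data corresponding to the database identification information in a service database, wherein the original service data comprises time information and channel information;

preprocessing the time information and the channel information;

constructing a data cube corresponding to the original service data;

partitioning the data cube according to the channel information to obtain a plurality of sub data cubes;

and respectively storing the plurality of sub data cubes into different storage media.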

2. The method for partitioning a data cube according to claim 1, wherein said step of constructing a data cube corresponding to said original business data comprises:

respectively taking the time information and the channel information as main axes of an initial cube;

acquiring extended data corresponding to the time information and the channel information from the original service data;

and storing the extended data to the initial cube based on the corresponding relation between the time information or the channel information and the extended data to obtain the data cube.

3. The method for partitioning a data cube according to claim 1, wherein the raw service data further includes fact data corresponding to the channel information, and the step of partitioning the data cube according to the channel information includes:

and segmenting the data cube into a plurality of sub data cubes based on the corresponding relation between the channel information and the fact data.

4. The method of partitioning a data cube of claim 1, wherein pre-processing the time information and the channel information comprises:

and carrying out decryption, decoding and data deduplication processing on the time information and the channel information.

5. An apparatus for partitioning a data cube, the apparatus comprising:

6. The apparatus for partitioning a data cube of claim 5, wherein said cube construction module comprises:

7. The apparatus for partitioning a data cube according to claim 5, wherein said raw business data further contains fact data corresponding to said channel information, said partitioning processing module comprising:

8. The apparatus for partitioning a data cube of claim 5, wherein said preprocessing module comprises:

and the preprocessing submodule is used for carrying out decryption, decoding and data deduplication processing on the time information and the channel information.

9. A computer device comprising a memory having stored therein a computer program and a processor implementing the steps of the partitioning method of the data cube according to any one of claims 1 to 4 when executing the computer program.

10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the partitioning method of the data cube according to any one of claims 1 to 4.