CN109388637A - Data warehouse information processing method, device, system, medium - Google Patents


Publication number
CN109388637A
Authority
CN
China
Prior art keywords
history
query
sentence
historical
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811111998.7A
Other languages
Chinese (zh)
Other versions
CN109388637B (en)
Inventor
范叶亮
卢周
钱勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Holding Co Ltd
Original Assignee
Beijing Jingdong Financial Technology Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Financial Technology Holding Co Ltd
Priority to CN201811111998.7A
Publication of CN109388637A
Application granted
Publication of CN109388637B
Legal status: Active

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a data warehouse information processing method, where the data warehouse includes a plurality of history tables stored in association. The method includes: obtaining at least one historical query statement, the historical query statement being used to query related data of multiple history tables among the history tables stored in association; determining the multiple history tables corresponding to the at least one historical query statement; and generating a target table based on specific history tables among the multiple history tables, the target table including the related data in the specific history tables.

Description

Data warehouse information processing method, device, system, medium
Technical field
The present disclosure relates to the field of computer technology, and more particularly to a data warehouse information processing method, a data warehouse information processing apparatus, a data warehouse information processing system, and a computer-readable storage medium.
Background art
As the internet enters the big data era, the acquisition and storage of data have become key research directions. For example, data related to various business scenarios is stored in a data warehouse, and it is usually necessary to query relevant data from the mass of data stored there and to integrate and analyze it in order to make business decisions. Data in a data warehouse is usually stored in tables, with different tables storing different data, for example user data, commodity data, and merchant data. When relevant sales data needs to be queried, the related data must be obtained from multiple tables, which makes the query process cumbersome. In the data warehouse field, business personnel therefore usually build, from multiple tables, a wide table that contains the related data of those tables, so that the required data can be queried conveniently. How to optimize the process of building wide tables has thus become a problem in urgent need of a solution.
In the course of realizing the concept of the present disclosure, the inventors found at least the following problem in the prior art: the existing wide table building process relies excessively on the experience of business experts, so that building a wide table requires a large amount of manual participation, and this excessive reliance on manual participation makes the building process insufficiently objective.
Summary of the invention
In view of this, the present disclosure provides a data warehouse information processing method. The data warehouse includes a plurality of history tables stored in association, and the method includes: obtaining at least one historical query statement, the historical query statement being used to query related data of multiple history tables among the history tables stored in association; determining the multiple history tables corresponding to the at least one historical query statement; and generating a target table based on specific history tables among the multiple history tables, the target table including the related data in the specific history tables.
According to an embodiment of the present disclosure, obtaining the at least one historical query statement includes: obtaining historical operation data of the data warehouse that was generated when the related data of multiple history tables among the history tables stored in association was queried through the historical query statement; and determining the at least one historical query statement from the historical operation data.
According to an embodiment of the present disclosure, determining the multiple history tables corresponding to the at least one historical query statement includes: parsing the at least one historical query statement to obtain association information of the at least one historical query statement, the association information including association fields and association conditions; and determining, based on the association information, the multiple history tables corresponding to the at least one historical query statement.
According to an embodiment of the present disclosure, the method further includes: obtaining, from a plurality of initial historical query statements, query statements that satisfy a first preset condition as the at least one historical query statement.
According to an embodiment of the present disclosure, obtaining, from the plurality of initial historical query statements, query statements that satisfy the first preset condition as the at least one historical query statement includes: clustering the plurality of initial historical query statements to obtain at least one query statement group, where the similarity between the historical query statements in each query statement group satisfies a first preset threshold; and determining, from the at least one query statement group, a query statement group that satisfies the first preset condition as a target query statement group, the target query statement group including the at least one historical query statement.
According to an embodiment of the present disclosure, clustering the plurality of initial historical query statements to obtain the at least one query statement group includes: processing the plurality of initial historical query statements to obtain a vector corresponding to each of them; and clustering the vectors corresponding to the plurality of initial historical query statements to obtain the at least one query statement group, each query statement group including the vectors of its query statements.
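The vectorize-then-cluster step can be illustrated with a minimal sketch. The disclosure does not specify the vectorization or the clustering algorithm at this point, so everything below is an assumption: `vectorize` builds a bag-of-tokens vector, `cosine` measures similarity, and `cluster` is a greedy single-pass grouping that compares each statement only with the first member of a group (a stricter reading of "similarity within each group" would compare all pairs). The threshold value 0.8 is likewise illustrative.

```python
import math
import re
from collections import Counter


def vectorize(sql):
    """Turn a query statement into a sparse bag-of-tokens vector (token -> count)."""
    tokens = re.findall(r"[a-z_][a-z0-9_.]*", sql.lower())
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(statements, threshold=0.8):
    """Greedy grouping: a statement joins the first group whose first member
    is at least `threshold` similar to it; otherwise it starts a new group."""
    groups = []  # each group is a list of (statement, vector) pairs
    for sql in statements:
        vec = vectorize(sql)
        for group in groups:
            if cosine(vec, group[0][1]) >= threshold:
                group.append((sql, vec))
                break
        else:
            groups.append([(sql, vec)])
    return [[sql for sql, _ in group] for group in groups]


# Hypothetical historical query statements.
queries = [
    "SELECT a, b FROM ods.foo JOIN ods.bar ON foo.a = bar.x",
    "SELECT a, b, z FROM ods.foo JOIN ods.bar ON foo.a = bar.x",
    "SELECT name FROM ods.users WHERE age > 30",
]
groups = cluster(queries)
```

With these inputs the two join queries fall into one group and the unrelated query forms its own group. A production system would more likely use a learned embedding and a standard clustering algorithm; the sketch only shows the shape of the data flow.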
According to an embodiment of the present disclosure, the specific history table is a history table, among the multiple history tables, that satisfies a second preset threshold.
According to an embodiment of the present disclosure, the method further includes: storing the target table when the target table satisfies a second preset condition.
According to an embodiment of the present disclosure, storing the target table when the target table satisfies the second preset condition includes: obtaining the similarity between the target table and other history tables in the data warehouse; and storing the target table when the similarity satisfies a third preset threshold.
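As a rough illustration of the similarity check before storage, the sketch below compares tables by the Jaccard similarity of their column names. Both the similarity measure and the direction of the threshold are assumptions made here: the text only says the similarity must "satisfy a third preset threshold", and this sketch reads that as "the target table is stored only if it is not too similar to any existing table". The names `jaccard`, `should_store`, and `max_similarity` are hypothetical.

```python
def jaccard(cols_a, cols_b):
    """Jaccard similarity between two collections of column names."""
    a, b = set(cols_a), set(cols_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def should_store(target_cols, existing_tables, max_similarity=0.7):
    """Assumed rule: store the target table only when its similarity to every
    existing table stays below `max_similarity`."""
    return all(
        jaccard(target_cols, cols) < max_similarity
        for cols in existing_tables.values()
    )


# Hypothetical tables already present in the data warehouse.
existing = {
    "wide_sales_v1": ["user_id", "user_name", "sku_id", "amount"],
    "dim_user": ["user_id", "user_name", "age"],
}

near_duplicate = ["user_id", "user_name", "sku_id", "merchant_id", "amount"]
new_subject = ["merchant_id", "merchant_name"]

store_duplicate = should_store(near_duplicate, existing)
store_new = should_store(new_subject, existing)
```

Here the near-duplicate candidate shares four of five columns with `wide_sales_v1` and is rejected, while the merchant table is accepted. If the intended rule is the opposite direction, only the comparison operator changes.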
Another aspect of the present disclosure provides a data warehouse information processing apparatus. The data warehouse includes a plurality of history tables stored in association, and the apparatus includes a first obtaining module, a determining module, and a generating module. The first obtaining module obtains at least one historical query statement, the historical query statement being used to query related data of multiple history tables among the history tables stored in association. The determining module determines the multiple history tables corresponding to the at least one historical query statement. The generating module generates a target table based on specific history tables among the multiple history tables, the target table including the related data in the specific history tables.
According to an embodiment of the present disclosure, obtaining the at least one historical query statement includes: obtaining historical operation data of the data warehouse that was generated when the related data of multiple history tables among the history tables stored in association was queried through the historical query statement; and determining the at least one historical query statement from the historical operation data.
According to an embodiment of the present disclosure, determining the multiple history tables corresponding to the at least one historical query statement includes: parsing the at least one historical query statement to obtain association information of the at least one historical query statement, the association information including association fields and association conditions; and determining, based on the association information, the multiple history tables corresponding to the at least one historical query statement.
According to an embodiment of the present disclosure, the apparatus further includes a second obtaining module, which obtains, from a plurality of initial historical query statements, query statements that satisfy a first preset condition as the at least one historical query statement.
According to an embodiment of the present disclosure, obtaining, from the plurality of initial historical query statements, query statements that satisfy the first preset condition as the at least one historical query statement includes: clustering the plurality of initial historical query statements to obtain at least one query statement group, where the similarity between the historical query statements in each query statement group satisfies a first preset threshold; and determining, from the at least one query statement group, a query statement group that satisfies the first preset condition as a target query statement group, the target query statement group including the at least one historical query statement.
According to an embodiment of the present disclosure, clustering the plurality of initial historical query statements to obtain the at least one query statement group includes: processing the plurality of initial historical query statements to obtain a vector corresponding to each of them; and clustering the vectors corresponding to the plurality of initial historical query statements to obtain the at least one query statement group, each query statement group including the vectors of its query statements.
According to an embodiment of the present disclosure, the specific history table is a history table, among the multiple history tables, that satisfies a second preset threshold.
According to an embodiment of the present disclosure, the apparatus further includes a storage module, which stores the target table when the target table satisfies a second preset condition.
According to an embodiment of the present disclosure, storing the target table when the target table satisfies the second preset condition includes: obtaining the similarity between the target table and other history tables in the data warehouse; and storing the target table when the similarity satisfies a third preset threshold.
Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions which, when executed, implement the method described above.
Another aspect of the present disclosure provides a computer program including computer-executable instructions which, when executed, implement the method described above.
According to embodiments of the present disclosure, the problems in the prior art that the wide table building process relies excessively on the experience of business experts, so that building a wide table requires a large amount of manual participation and is insufficiently objective, can be at least partially solved. The process of building wide tables in a data warehouse can therefore be optimized, for example by achieving the technical effect of building wide tables automatically.
Brief description of the drawings
The above and other objects, features, and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
Fig. 1 schematically illustrates the system architecture of the data warehouse information processing method and processing system according to an embodiment of the present disclosure;
Figs. 2A to 2C schematically illustrate application scenarios of the data warehouse information processing method and processing system according to an embodiment of the present disclosure;
Fig. 3A schematically illustrates a flowchart of the data warehouse information processing method according to an embodiment of the present disclosure;
Fig. 3B schematically illustrates the historical operation data of the data warehouse according to an embodiment of the present disclosure;
Fig. 3C schematically illustrates a visualization of the abstract syntax tree according to an embodiment of the present disclosure;
Fig. 4 schematically illustrates a flowchart of the data warehouse information processing method according to another embodiment of the present disclosure;
Fig. 5 schematically illustrates a flowchart of the data warehouse information processing method according to yet another embodiment of the present disclosure;
Fig. 6 schematically illustrates a flowchart of building a data warehouse wide table according to an embodiment of the present disclosure;
Fig. 7 schematically illustrates a flowchart of generating and reviewing candidate templates for a data warehouse wide table according to an embodiment of the present disclosure;
Fig. 8 schematically illustrates a block diagram of the data warehouse information processing apparatus according to an embodiment of the present disclosure;
Fig. 9 schematically illustrates a block diagram of the data warehouse information processing apparatus according to another embodiment of the present disclosure;
Fig. 10 schematically illustrates a block diagram of the data warehouse information processing apparatus according to yet another embodiment of the present disclosure; and
Fig. 11 schematically illustrates a block diagram of a computer system suitable for data warehouse information processing according to an embodiment of the present disclosure.
Detailed description
Hereinafter, embodiments of the present disclosure are described with reference to the accompanying drawings. It should be understood, however, that these descriptions are merely exemplary and are not intended to limit the scope of the present disclosure. In the following detailed description, numerous specific details are set forth for ease of explanation, in order to provide a thorough understanding of the embodiments of the present disclosure. It is evident, however, that one or more embodiments may also be practiced without these specific details. In addition, descriptions of well-known structures and techniques are omitted below to avoid unnecessarily obscuring the concepts of the present disclosure.
The terminology used herein is for the purpose of describing specific embodiments only and is not intended to limit the present disclosure. The terms "include", "comprise", and the like, as used herein, indicate the presence of the stated features, steps, operations, and/or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.
All terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein should be interpreted as having meanings consistent with the context of this specification, and should not be interpreted in an idealized or overly rigid manner.
Where an expression similar to "at least one of A, B, and C, etc." is used, it should in general be interpreted in the sense in which those skilled in the art would commonly understand it (for example, "a system having at least one of A, B, and C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B, and C). The same applies where an expression similar to "at least one of A, B, or C, etc." is used (for example, "a system having at least one of A, B, or C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B, and C). Those skilled in the art should also understand that virtually any disjunctive word and/or phrase presenting two or more alternative items, whether in the description, the claims, or the drawings, should be understood to contemplate the possibilities of including one of the items, either of the items, or both items. For example, the phrase "A or B" should be understood to include the possibilities of "A", "B", or "A and B".
Embodiments of the present disclosure provide a data warehouse information processing method. The data warehouse includes a plurality of history tables stored in association, and the method includes: obtaining at least one historical query statement, the historical query statement being used to query related data of multiple history tables among the history tables stored in association; determining the multiple history tables corresponding to the at least one historical query statement; and generating a target table based on specific history tables among the multiple history tables, the target table including the related data in the specific history tables.
Fig. 1 schematically illustrates the system architecture of the data warehouse information processing method and data warehouse information processing system according to an embodiment of the present disclosure. It should be noted that Fig. 1 shows only an example of a system architecture to which the embodiments of the present disclosure can be applied, in order to help those skilled in the art understand the technical content of the present disclosure; it does not mean that the embodiments of the present disclosure cannot be used in other devices, systems, environments, or scenarios.
As shown in Fig. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104, to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as shopping applications, web browsers, search applications, instant messaging tools, email clients, and social platform software (examples only).
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.
The server 105 may be a server that provides various services, for example a back-end management server (example only) that supports websites browsed by users with the terminal devices 101, 102, 103. The back-end management server may analyze and otherwise process received data such as user requests, and feed the processing results (for example, web pages, information, or data obtained or generated according to the user request) back to the terminal devices.
It should be noted that the data warehouse information processing method provided by the embodiments of the present disclosure may generally be executed by the server 105. Accordingly, the data warehouse information processing apparatus provided by the embodiments of the present disclosure may generally be arranged in the server 105. The method may also be executed by a server or server cluster that is different from the server 105 and is able to communicate with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the apparatus may also be arranged in a server or server cluster that is different from the server 105 and is able to communicate with the terminal devices 101, 102, 103 and/or the server 105.
For example, the historical query statements and history tables of the embodiments of the present disclosure may be stored in the terminal devices 101, 102, 103, which send them to the server 105, and the server 105 creates the target table based on the query statements and history tables. Alternatively, the terminal devices 101, 102, 103 may create the target table directly based on the query statements and history tables. In addition, the query statements and history tables may also be stored directly in the server 105, which then creates the target table directly based on them.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation.
Figs. 2A to 2C schematically illustrate application scenarios of the data warehouse information processing method and data warehouse information processing system according to an embodiment of the present disclosure. It should be noted that Figs. 2A to 2C show only examples of scenarios to which the embodiments of the present disclosure can be applied, in order to help those skilled in the art understand the technical content of the present disclosure; they do not mean that the embodiments of the present disclosure cannot be used in other devices, systems, environments, or scenarios.
As shown in Figs. 2A to 2C, the application scenario may include, for example, several table schemas in a data warehouse, such as a star schema 210, a snowflake schema 220, and a wide table 230.
According to an embodiment of the present disclosure, the star schema 210 and the snowflake schema 220 may be, for example, table schemas used to store data in the data warehouse, each including multiple associated tables. As shown in Fig. 2A, the star schema 210 includes, for example, table 211, table 212, and table 213. As shown in Fig. 2B, the snowflake schema 220 includes, for example, table 221, table 222, table 223, table 224, table 225, and table 226.
The data stored across these tables is relatively scattered. Taking sales data as an example, the individual tables store user information, commodity information, merchant information, and so on, respectively.
In the embodiments of the present disclosure, related tables in the data warehouse are usually queried through query statements, and the data obtained from those tables is analyzed and mined to support decisions for various businesses.
Obtaining the relevant information by querying multiple related tables in the data warehouse through query statements is relatively complex: the required data can only be queried well if both the information stored in each table and the association relationships between the tables are understood. For example, querying sales data requires jointly querying the user information table, the commodity information table, the merchant information table, and so on.
To make business use more convenient, a wide table usually needs to be established, for example a wide table for sales data. The wide table includes the data of multiple tables, for example the user information table, the commodity information table, and the merchant information table, which makes business queries easier.
The embodiments of the present disclosure can obtain the query statements used to query multiple tables, for example the query statements used to query table 221, table 222, table 223, table 224, table 225, and table 226, and determine from those query statements the tables involved, for example that the query statements involve table 221, table 222, table 223, table 224, table 225, and table 226. A corresponding wide table 230 is then created based on these tables. As shown in Fig. 2C, the wide table 230 includes, for example, the data of the multiple tables.
By determining multiple tables in the data warehouse from query statements and creating a data warehouse wide table based on those tables, the embodiments of the present disclosure automate the process of building wide tables.
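The sales-data scenario above can be sketched with an in-memory SQLite database. This is only an illustration under assumed names: the tables `users`, `products`, `merchants`, and `sales`, their columns, and the helper `build_wide_table` are all hypothetical, and the disclosure does not state which engine or DDL its wide tables use. The helper takes the join information (which a parser would extract from historical queries) and materializes the wide table with `CREATE TABLE ... AS SELECT`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source tables: one fact table (sales) and three dimension tables.
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE merchants (merchant_id INTEGER PRIMARY KEY, merchant_name TEXT);
CREATE TABLE sales (
    sale_id INTEGER PRIMARY KEY,
    user_id INTEGER, product_id INTEGER, merchant_id INTEGER, amount REAL
);
INSERT INTO users VALUES (1, 'alice');
INSERT INTO products VALUES (10, 'keyboard');
INSERT INTO merchants VALUES (100, 'acme');
INSERT INTO sales VALUES (1000, 1, 10, 100, 59.0);
""")


def build_wide_table(conn, name, fact, joins, columns):
    """Materialize a wide table by joining `fact` with each (table, key) in `joins`.

    Identifiers are interpolated directly, so they must come from trusted
    metadata, never from user input.
    """
    join_sql = " ".join(
        f"JOIN {table} ON {fact}.{key} = {table}.{key}" for table, key in joins
    )
    select_sql = f"SELECT {', '.join(columns)} FROM {fact} {join_sql}"
    conn.execute(f"CREATE TABLE {name} AS {select_sql}")


build_wide_table(
    conn,
    name="wide_sales",
    fact="sales",
    joins=[("users", "user_id"), ("products", "product_id"), ("merchants", "merchant_id")],
    columns=[
        "sales.sale_id", "users.user_name", "products.product_name",
        "merchants.merchant_name", "sales.amount",
    ],
)
rows = conn.execute("SELECT * FROM wide_sales").fetchall()
```

After this runs, a business user can read user, product, and merchant information for a sale from `wide_sales` alone, without knowing how the underlying tables are related.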
Fig. 3 A diagrammatically illustrates the flow chart of the data warehouse information processing method according to the embodiment of the present disclosure.
As shown in Fig. 3A, the method includes operations S310 to S330.
In the embodiments of the present disclosure, the main function of a data warehouse is to take the large amount of data generated by business systems through online transaction processing (OLTP), systematically analyze and organize it using the data storage architecture specific to data warehouse theory, and apply various analysis methods, such as online analytical processing (OLAP) and data mining, in order to serve systems such as decision support systems (DSS). A data warehouse helps decision makers quickly and effectively extract valuable information from large amounts of data, which facilitates decision making and rapid response to changes in the external environment, and helps build business intelligence solutions.
In the embodiments of the present disclosure, when a data warehouse is built, dimensional design maps relationships onto a set of relational tables, usually in one of two ways: the star schema and the snowflake schema. A star schema can be described as a simple star: a central table contains the fact data, multiple tables are distributed radially around the central table, and they are connected to one another through primary keys and foreign keys.
According to an embodiment of the present disclosure, the data warehouse includes a plurality of history tables stored in association. It can be understood that the tables described in the embodiments of the present disclosure include the data tables used to store data in the data warehouse, where a data table may include multiple data columns (or data fields) and the data in each column is data of a different field type. Specifically, the history tables may be, for example, fact tables, dimension tables, or wide tables that have already been created in the data warehouse.
In operation S310, at least one historical query statement is obtained, the historical query statement being used to query related data of multiple history tables among the history tables stored in association.
According to an embodiment of the present disclosure, a historical query statement may be, for example, a query statement used by relevant business personnel when querying data in the data warehouse. The query statement can be used to query related data of multiple history tables among the history tables stored in association in the data warehouse. The query statement may be an SQL query statement.
For example, obtaining the at least one historical query statement includes: obtaining historical operation data of the data warehouse that was generated when the related data of multiple history tables among the history tables stored in association was queried through the historical query statement; and determining the at least one historical query statement from the historical operation data.
In the embodiments of the present disclosure, historical operation data of the data warehouse is generated in the course of querying the related data of multiple history tables through historical query statements. The historical operation data may be, for example, the raw operation logs relating to the base warehouse tables, base mart tables, and user-defined tables in the data warehouse that are produced when related data in the data warehouse is queried.
Fig. 3B schematically illustrates the historical operation data of the data warehouse according to an embodiment of the present disclosure.
The historical operation data of the data warehouse may be, for example, the raw operation log of the data warehouse shown in Fig. 3B, which contains the historical query statements used to query related data in the data warehouse.
In the embodiments of the present disclosure, determining the at least one historical query statement from the operation data may, for example, consist of extracting clean, complete, and ordered historical query statements, together with important related system operation information (for example, related alerts and error messages), from the raw operation log of the data warehouse.
For example, customized regular expressions can be used to perform a simple cleaning of the relatively disordered operation log. Cleaning the raw operation log in Fig. 3B yields the result shown in Table 1. The cleaning result includes the historical query statements, for example their SQL content.
Table 1
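The regular-expression cleaning can be sketched as follows. The actual log layout of Fig. 3B and the columns of Table 1 are not reproduced in this text, so the layout assumed here (`<timestamp> <LEVEL> [<job>] <message>`), the `Executing SQL:` marker, and the output fields are all hypothetical; the point is only to show SQL statements and alert or error messages being separated from noise.

```python
import re

# Assumed log layout: "<timestamp> <LEVEL> [<job>] <message>".
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<job>[^\]]+)\]\s+(?P<msg>.*)$"
)
SQL_PREFIX = "Executing SQL:"  # assumed marker that precedes a logged query


def clean_log(lines):
    """Split raw log lines into query statements and alert/error records."""
    statements, alerts = [], []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # drop lines that do not follow the assumed layout
        msg = match.group("msg")
        if msg.startswith(SQL_PREFIX):
            # Collapse runs of whitespace so equivalent queries compare equal.
            sql = " ".join(msg[len(SQL_PREFIX):].split())
            statements.append(
                {"time": match.group("ts"), "job": match.group("job"), "sql": sql}
            )
        elif match.group("level") in ("WARN", "ERROR"):
            alerts.append(
                {"time": match.group("ts"), "level": match.group("level"), "message": msg}
            )
    return statements, alerts


raw_log = [
    "2018-09-20 10:01:02 INFO  [job-17] Executing SQL: SELECT a,  b FROM ods.foo;",
    "2018-09-20 10:01:03 WARN  [job-17] slow query detected",
    "==== heartbeat ====",
    "2018-09-20 10:01:05 ERROR [job-18] table ods.baz not found",
]
statements, alerts = clean_log(raw_log)
```

Real warehouse logs often spread one statement over several lines, in which case the cleaning step would first need to stitch continuation lines together.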
In operation S320, the multiple history tables corresponding to the at least one historical query statement are determined.
In the embodiments of the present disclosure, the multiple history tables may be, for example, the tables involved in the historical query statement. That is, when related data in the data warehouse needs to be queried, the data of multiple tables in the data warehouse can be queried through a historical query statement, and those tables are the history tables corresponding to the historical query statement.
Determining the multiple history tables corresponding to the at least one historical query statement includes: parsing the at least one historical query statement to obtain association information of the at least one historical query statement, the association information including association fields and association conditions.
In the embodiments of the present disclosure, the historical query statement can be parsed to obtain its association information, for example by parsing it with an SQL parser.
The SQL parser is a tool that uses metadata information and the data query SQL to analyze correlations in the query SQL, such as primary/foreign key relationships between fields and table associations.
For example, when the historical query statement is the one shown in Table 2, the association information obtained by analyzing it with the SQL parser is as shown in Table 3.
The association information may include, for example, the association fields and association conditions in the query statement. The association fields may include, for example, aggregation fields, sorting fields, condition fields, and query fields, and the association conditions may include, for example, table associations and field associations.
Table 2
SELECT
A, B, Z, COUNT (1) AS CT
FROM ODS.FOO FOO
INNER JOIN ODS.BAR BAR
ON FOO.A=BAR.X AND FOO.B=BAR.Y
GROUP BY A, B, Z
ORDER BY A, B, Z DESC;
Table 3
According to embodiments of the present disclosure, the multiple history tables corresponding to the at least one historical query statement are determined based on the association information.
For example, the multiple history tables corresponding to the query statement may be determined from the association information in Table 3. The history tables are stored, for example, in the data warehouse; the history tables so determined are shown, for example, in Table 4 and Table 5.
The workflow of the SQL parser is briefly described below.
Fig. 3C schematically shows a visualization of the abstract syntax tree according to an embodiment of the present disclosure.
For ease of explanation, a relatively simple historical query statement is used as an example here; it is shown in Table 6.
Table 4
Field Field type Field annotation
A STRING Column A
B STRING Column B
C STRING Column C
D STRING Column D
Table 5
Field Field type Field annotation
X STRING Column X
Y STRING Column Y
Z STRING Column Z
Table 6
SELECT A, Z
FROM ODS.FOO FOO
INNER JOIN ODS.BAR BAR
ON FOO.B=BAR.Y
WHERE BAR.Z LIKE 'LEO';
First, the original query SQL (the historical query statement) is parsed into an SQL abstract syntax tree, whose visualization is shown in Fig. 3C.
Then, from the constructed SQL abstract syntax tree, the association information is extracted, for example table associations (Join Clauses), field associations (Join Clauses Conditions), aggregation fields (Group By), sort fields (Order By), condition fields (Where Clauses), and query fields (Query Columns).
In embodiments of the present disclosure, the association information obtained with the SQL parser (for example, as shown in Table 3) may be used to build an index that is stored in a query engine for subsequent query calls.
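The kinds of association information listed above can be illustrated with a deliberately simplified sketch. It does not build an abstract syntax tree as the described SQL parser does; it swaps in regular expressions that handle only flat statements of the shape shown in Table 2 (one `INNER JOIN`, no subqueries, no commas inside expressions). The function name `extract_association_info` and the result keys are hypothetical.

```python
import re

# Clause keywords that end the preceding clause in this simplified grammar.
_STOP = r"(?=\bFROM\b|\bINNER JOIN\b|\bON\b|\bWHERE\b|\bGROUP BY\b|\bORDER BY\b|$)"

def _clause(text, keyword):
    """Return the text following `keyword` up to the next clause keyword."""
    m = re.search(rf"\b{keyword}\b\s+(.*?)\s*{_STOP}", text, flags=re.IGNORECASE)
    return m.group(1).strip() if m else ""

def _split(text):
    """Split a comma-separated clause into trimmed items."""
    return [part.strip() for part in text.split(",") if part.strip()]

def extract_association_info(sql):
    """Regex-based stand-in for an AST-based SQL parser (illustration only)."""
    text = " ".join(sql.replace(";", " ").split())  # normalize whitespace
    conditions = re.split(r"\s+AND\s+", _clause(text, "ON"), flags=re.IGNORECASE)
    return {
        "tables": re.findall(r"\b(?:FROM|JOIN)\s+([\w.]+)", text, flags=re.IGNORECASE),
        "join_conditions": [c.strip() for c in conditions if c.strip()],
        "where": _clause(text, "WHERE"),
        "group_by": _split(_clause(text, "GROUP BY")),
        "order_by": _split(_clause(text, "ORDER BY")),
        "query_columns": _split(_clause(text, "SELECT")),
    }

sql = """
SELECT A, B, Z, COUNT (1) AS CT
FROM ODS.FOO FOO
INNER JOIN ODS.BAR BAR
ON FOO.A=BAR.X AND FOO.B=BAR.Y
GROUP BY A, B, Z
ORDER BY A, B, Z DESC;
"""
info = extract_association_info(sql)
```

A production implementation would rely on a real SQL grammar, since regular expressions cannot handle nested queries, quoted identifiers, or keywords inside string literals.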
In operation S330, target table is generated based on the particular historical table in multiple history tables, target table includes Related data in particular historical table.
According to embodiments of the present disclosure, the particular history tables are, for example, all or some of the multiple history tables. For example, a particular history table is a history table among the multiple history tables that meets a second preset threshold. The second preset threshold may be, for example, a threshold on how frequently a table appears; that is, a particular history table may be a history table that is frequently referenced in the historical query statements. A high frequency of occurrence indicates that the table is queried often, and hence that business personnel have a strong demand for it.
In embodiments of the present disclosure, the particular history tables are used to create a target table of the data warehouse, for example a wide table of the data warehouse. The target table includes the related data in the particular history tables and is convenient for users: querying data from the wide table is easier, so valuable information can be queried and analyzed from the data warehouse effectively and quickly, which facilitates decision-making.
According to embodiments of the present disclosure, multiple history tables are determined based on the historical query statements, and a target table is built based on those history tables. The target table includes the related data of the multiple history tables and is, for example, a wide table of the data warehouse. The scheme of the embodiments of the present disclosure can thus optimize the process of building wide tables in a data warehouse, for example achieving the technical effect of building wide tables automatically.
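Selecting the particular history tables by frequency of occurrence can be sketched as follows. The sketch assumes that the second preset threshold is a minimum number of historical query statements referencing a table; the source does not fix the exact form of the threshold, and the names `frequent_tables` and `min_count` are hypothetical.

```python
from collections import Counter

def frequent_tables(tables_per_query, min_count):
    """Return tables referenced by at least `min_count` historical queries."""
    counts = Counter(
        table
        for tables in tables_per_query
        for table in set(tables)  # count each table once per query
    )
    return sorted(t for t, n in counts.items() if n >= min_count)

# Tables referenced by three historical query statements (illustrative data).
history = [
    ["ODS.FOO", "ODS.BAR"],
    ["ODS.FOO"],
    ["ODS.FOO", "ODS.BAZ"],
]
selected = frequent_tables(history, min_count=2)
```

Here only `ODS.FOO` is referenced by at least two queries, so it alone would feed the target table.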
Fig. 4 diagrammatically illustrates the flow chart of the data warehouse information processing method according to another embodiment of the disclosure.
As shown in Fig. 4, the method includes operations S310~S330 and operation S410. Operations S310~S320 are the same as or similar to the operations described above with reference to Fig. 3A and are not described again here.
In operation S410, the query statement conduct for meeting the first preset condition is obtained from multiple initial history query statements At least one historical query sentence.
According to embodiments of the present disclosure, the multiple initial historical query statements may be, for example, query statements used over a long period to query related data in the data warehouse. The first preset condition may be, for example, high similarity; that is, query statements with high similarity are obtained from the multiple initial historical query statements as the at least one historical query statement.
According to the embodiment of the present disclosure, the inquiry language for meeting the first preset condition is obtained from multiple initial history query statements Sentence is used as at least one historical query sentence, comprising: is clustered to obtain at least one to multiple initial history query statements and be looked into Ask sentence group, wherein the similarity between the historical query sentence in each query statement group meets the first preset threshold.
In embodiments of the present disclosure, clusters of highly similar statements among the multiple historical query statements may be identified, for example, by offline analysis with a clustering method. Such a high-similarity cluster includes, for example, at least one historical query statement, and can be used to check whether redundant wide tables with excessively high similarity exist among the current data warehouse and data mart wide tables. For the existing data warehouse and data mart wide tables and all ad hoc query statements defined by users, each item is vectorized into a real-valued vector and assigned a unique query ID, and a real-valued query index is built and stored in a real-valued vector query engine for use by subsequent queries in the workflow.
In embodiments of the present disclosure, the multiple initial historical query statements may also be preprocessed before they are clustered.
Preprocessing the initial historical query statements may include, for example, processing them with the SQL parser to obtain association information and building a semantic mapping based on that association information. The semantic mapping can be understood as follows: for the same concept (for example, a commodity ID), the field name in Table 4 is A and the field name in Table 5 is B; according to the association information, a unique identifier is determined and replaces the differing field names of the different tables in the SQL. In addition, preprocessing includes standardizing SQL syntax, converting letter case, and similar work, the purpose being to ensure that code fragments with the same semantics also have similar content.
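The semantic mapping and normalization described above can be sketched as follows. The mapping table, the identifier `ITEM_ID`, and the function name `normalize_sql` are hypothetical; the sketch also upper-cases the whole statement, including string literals, which a real implementation would need to avoid.

```python
import re

# Matches qualified names such as FOO.A after upper-casing.
QUALIFIED_NAME = re.compile(r"\b[A-Z_][A-Z0-9_]*\.[A-Z_][A-Z0-9_]*\b")

def normalize_sql(sql, semantic_map):
    """Upper-case, collapse whitespace, and replace mapped field names."""
    text = " ".join(sql.upper().split())
    return QUALIFIED_NAME.sub(
        lambda m: semantic_map.get(m.group(0), m.group(0)), text
    )

# Hypothetical mapping: both fields denote the same concept (a commodity ID).
semantic_map = {"FOO.A": "ITEM_ID", "BAR.X": "ITEM_ID"}

normalized = normalize_sql(
    "select foo.a from ods.foo foo   join ods.bar bar on foo.a = bar.x",
    semantic_map,
)
```

After this step, two statements that refer to the same concept through different field names share the same token, which makes them look more alike to the later vectorization.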
In embodiments of the present disclosure, clustering the multiple initial historical query statements to obtain at least one query statement group includes:
Processing the multiple initial historical query statements to obtain the vectors corresponding to the multiple initial historical query statements.
For example, before the multiple initial historical query statements are clustered, they need to be vectorized, and the clustering is performed on the vectorized query statements. For example, natural language vectorization methods such as Word2Vec, Sentence2Vec, and Document2Vec are used to convert the initial historical query statements into real-valued vectors. An example of a converted statement is shown in Table 7.
Table 7
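To make the idea of turning a query statement into a real-valued vector concrete, the following sketch uses a plain bag-of-tokens count vector with cosine similarity. This is a stand-in chosen for brevity, not the Word2Vec-style embedding named above; the tokenizer and the function names are assumptions.

```python
import math
import re
from collections import Counter

def tokenize(sql):
    """Split an SQL statement into upper-cased identifier/keyword tokens."""
    return re.findall(r"[A-Z_][A-Z0-9_.]*", sql.upper())

def vectorize(sql, vocabulary):
    """Count vector over a fixed vocabulary (a stand-in for learned embeddings)."""
    counts = Counter(tokenize(sql))
    return [float(counts[token]) for token in vocabulary]

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

corpus = [
    "SELECT A, Z FROM ODS.FOO FOO INNER JOIN ODS.BAR BAR ON FOO.B=BAR.Y",
    "select a, z from ods.foo foo inner join ods.bar bar on foo.b=bar.y",
    "SELECT C FROM ODS.QUX",
]
vocabulary = sorted({token for sql in corpus for token in tokenize(sql)})
vectors = [vectorize(sql, vocabulary) for sql in corpus]
```

The first two statements differ only in letter case and map to the same vector, while the third shares only the `SELECT` and `FROM` keywords with them and is therefore less similar.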
Clustering the vectors corresponding to the multiple initial historical query statements to obtain at least one query statement group, the at least one query statement group including the vectors of the corresponding query statements.
For example, the vectors corresponding to the multiple initial historical query statements are clustered to obtain at least one query statement group. Each query statement group includes the vectors of multiple query statements, and the query statements in each group have a certain similarity to one another, which may be, for example, a preset similarity set according to requirements.
Determining, from the at least one query statement group, a query statement group that meets the first preset condition as a target query statement group, the target query statement group including the at least one historical query statement.
In embodiments of the present disclosure, the first preset condition may be, for example, a preset quantity. Each query statement group contains a certain number of query statements; when that number meets the preset quantity, the group may be taken as the target query statement group. The target query statement group includes, for example, multiple historical query statements.
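The clustering and group selection steps above can be sketched together. The source does not name a clustering algorithm, so the sketch assumes a simple greedy threshold clustering on cosine similarity, and it assumes the first preset condition is a minimum group size; `min_similarity`, `min_size`, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def cluster(vectors, min_similarity):
    """Greedy clustering: a vector joins the first group in which it is at
    least `min_similarity` similar to every member, else starts a new group."""
    groups = []  # each group is a list of indices into `vectors`
    for i, vec in enumerate(vectors):
        for group in groups:
            if all(cosine(vec, vectors[j]) >= min_similarity for j in group):
                group.append(i)
                break
        else:
            groups.append([i])
    return groups

def select_target_groups(groups, min_size):
    """Keep groups whose number of query statements meets the preset quantity."""
    return [group for group in groups if len(group) >= min_size]

vectors = [
    [1.0, 1.0, 0.0],
    [1.0, 0.9, 0.0],
    [0.0, 0.1, 1.0],
    [0.0, 0.0, 1.0],
    [5.0, 0.0, 5.0],
]
groups = cluster(vectors, min_similarity=0.9)
targets = select_target_groups(groups, min_size=2)
```

The greedy scheme depends on input order; any clustering method that guarantees a minimum pairwise similarity within a group would fit the description equally well.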
Fig. 5 diagrammatically illustrates the flow chart of the data warehouse information processing method according to disclosure another embodiment.
As shown in Fig. 5, the method includes operations S310~S320, operation S410, and operation S510. Operations S310~S330 are the same as or similar to the operations described above with reference to Fig. 3A, and operation S410 is the same as or similar to the operation described above with reference to Fig. 4; they are not described again here.
In operation S510, the target table is stored if it meets a second preset condition. For example, the similarity between the target table and other history tables in the data warehouse is obtained, and the target table is stored if the similarity meets a third preset threshold.
According to embodiments of the present disclosure, the second preset condition may be, for example, that the target table meets a preset similarity. For example, the similarity between the target table and other history tables in the data warehouse is determined, and the target table is stored when the similarity meets the third preset threshold. The third preset threshold may be, for example, a specific value, chosen so as to avoid the data redundancy that would result if the target table were highly similar to tables already in the data warehouse.
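The redundancy check before storing a target table can be sketched as follows. The source does not specify how table similarity is measured, so the sketch assumes Jaccard similarity over column names and assumes that "meeting the third preset threshold" means staying below a maximum similarity; all names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two column-name collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def should_store(target_columns, existing_tables, max_similarity):
    """Store only if the target is not too similar to any existing table."""
    return all(
        jaccard(target_columns, columns) < max_similarity
        for columns in existing_tables.values()
    )

# Hypothetical tables already in the warehouse, keyed by table name.
existing = {
    "DW.WIDE_ORDERS": ["A", "B", "C", "D"],
    "DW.WIDE_USERS": ["X", "Y"],
}
novel = should_store(["A", "B", "X", "Z"], existing, max_similarity=0.8)
redundant = should_store(["A", "B", "C", "D"], existing, max_similarity=0.8)
```

A candidate that duplicates the columns of an existing wide table is rejected, while a candidate that only partially overlaps with existing tables is stored.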
Fig. 6 diagrammatically illustrates the wide table building flow chart of data warehouse according to the embodiment of the present disclosure.
As shown in Fig. 6, an embodiment of the present disclosure provides a method for automatically building data warehouse wide tables based on information extraction. The overall building method includes operations S610~S650.
In operation S610, unstructured data such as query SQL and query logs relating to the base warehouse tables, base data mart tables, and user-defined data tables is collected from the warehouse layer, data mart layer, and user layer obtained by aggregating the integration layer data.
In operation S620, a custom SQL parser is used to parse relationships between different tables, such as primary/foreign key relationships, join queries, and aliases, from the query SQL and query logs, and a query engine for the association data is built.
In operation S630, a custom SQL vectorization method (SQL2Vec) is used to mine similar queries from the query SQL, the query logs, and the association data index already built, and to build a historical query SQL real-valued vector query engine and query SQL similarity clustering results.
In operation S640, using the association data engine, the historical query SQL real-valued vector query engine, and the query SQL similarity clustering results already built, information such as data fields and data tables with high co-occurrence frequency is counted and summarized to generate candidate templates for new data warehouse wide tables.
In operation S650, based on the candidate templates for new data warehouse wide tables and the advice of business experts, the final new data warehouse wide tables are obtained and finalized.
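The co-occurrence statistics of operation S640 can be sketched as follows. The sketch counts how often pairs of tables appear together in the queries of one cluster and keeps the tables belonging to frequent pairs as a candidate template; the pairwise counting, the `min_count` parameter, and the function names are assumptions, since the source does not describe how co-occurrence is computed.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(tables_per_query):
    """Count how many queries reference each unordered pair of tables."""
    pairs = Counter()
    for tables in tables_per_query:
        for pair in combinations(sorted(set(tables)), 2):
            pairs[pair] += 1
    return pairs

def candidate_template(tables_per_query, min_count):
    """Tables that take part in at least one pair seen `min_count` times."""
    tables = set()
    for pair, count in cooccurrence(tables_per_query).items():
        if count >= min_count:
            tables.update(pair)
    return sorted(tables)

# Tables referenced by the queries of one similarity cluster (illustrative).
cluster_queries = [
    ["ODS.FOO", "ODS.BAR"],
    ["ODS.BAR", "ODS.FOO", "ODS.QUX"],
    ["ODS.FOO", "ODS.BAR"],
]
pairs = cooccurrence(cluster_queries)
template = candidate_template(cluster_queries, min_count=2)
```

The same counting could be applied to data fields instead of tables to decide which columns the candidate wide table should carry.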
Fig. 7 is diagrammatically illustrated according to the wide table candidate template generation of data warehouse of the embodiment of the present disclosure and auditing flow Figure.
As shown in Fig. 7, in embodiments of the present disclosure, the last stage of the automated data warehouse wide table building scheme consists of generating candidate templates for data warehouse wide tables, review by business experts, and finalizing the new data warehouse wide tables. This stage includes operations S710~S790.
In operation S710, a new user-defined query SQL is preprocessed as in the workflow described above to obtain its vectorization result.
In operation S720, the vectorization result is added to the historical query SQL real-valued vector query engine.
In operation S730, the query SQL similarity clustering results are updated.
In operation S740, the updated historical query SQL real-valued vector data and query SQL similarity clustering results are periodically passed to a trigger, which decides according to defined rules whether to generate a new data warehouse wide table template. The core trigger rule can be understood as follows: when a large number of new queries are similar enough to one another to form a cluster, and their similarity to all queries in the existing data warehouse tables is below a certain value, information such as data tables and data fields with high co-occurrence frequency is extracted from the new queries in that cluster.
In operation S750, a template for a new data warehouse wide table is generated based on the extracted information such as data tables and data fields with high co-occurrence frequency.
In operation S760, after the new data warehouse wide table template is generated, a review process is triggered in which business experts of the data warehouse review and revise the template.
In operation S770, the reviewed and revised wide table template is finalized as a new data warehouse wide table.
In operation S780, the information about the new data warehouse wide table is updated into the historical query SQL real-valued vector query engine.
In operation S790, the information about the new data warehouse wide table is updated into the query SQL similarity clustering results.
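The core trigger rule of operation S740 can be sketched as a simple predicate. The two thresholds and the function name are hypothetical; the source only states that the cluster of new queries must be large and that its similarity to the queries of existing data warehouse tables must be below a certain value.

```python
def should_generate_template(
    cluster_size,
    similarities_to_existing,
    min_cluster_size,
    max_existing_similarity,
):
    """Trigger when a cluster of new queries is large and unlike existing tables."""
    large_enough = cluster_size >= min_cluster_size
    novel = all(s < max_existing_similarity for s in similarities_to_existing)
    return large_enough and novel

# A large cluster that is dissimilar to every existing table triggers a template.
fire = should_generate_template(50, [0.2, 0.4], 30, 0.6)
# The same cluster is skipped if an existing table already covers it.
covered = should_generate_template(50, [0.2, 0.9], 30, 0.6)
# A small cluster is skipped regardless of novelty.
too_small = should_generate_template(5, [0.1], 30, 0.6)
```

When the predicate fires, the co-occurrence extraction and template generation of operations S740 and S750 would follow, and the resulting template would go to expert review in operation S760.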
Fig. 8 diagrammatically illustrates the block diagram of the data warehouse information processing unit according to the embodiment of the present disclosure.
As shown in Fig. 8, the data warehouse information processing device 800 includes a first acquisition module 810, a determining module 820, and a generating module 830.
The first acquisition module 810 can obtain at least one historical query statement, the historical query statement being used to query related data of multiple history tables among the associatively stored history tables.
According to embodiments of the present disclosure, obtaining the at least one historical query statement includes: obtaining historical operating data of the data warehouse generated when the related data of multiple history tables among the associatively stored history tables is queried through historical query statements, and determining the at least one historical query statement based on the operating data.
According to embodiments of the present disclosure, the first acquisition module 810 may, for example, perform operation S310 described above with reference to Fig. 3A, which is not described again here.
The determining module 820 can determine the multiple history tables corresponding to the at least one historical query statement.
According to embodiments of the present disclosure, determining the multiple history tables corresponding to the at least one historical query statement includes: parsing the at least one historical query statement to obtain association information of the at least one historical query statement, the association information including associated fields and association conditions; and determining the multiple history tables corresponding to the at least one historical query statement based on the association information.
According to embodiments of the present disclosure, the determining module 820 may, for example, perform operation S320 described above with reference to Fig. 3A, which is not described again here.
The generating module 830 can generate a target table based on particular history tables among the multiple history tables, the target table including related data in the particular history tables.
According to embodiments of the present disclosure, a particular history table is a history table among the multiple history tables that meets the second preset threshold.
According to embodiments of the present disclosure, the generating module 830 may, for example, perform operation S330 described above with reference to Fig. 3A, which is not described again here.
Fig. 9 diagrammatically illustrates the block diagram of the data warehouse information processing unit according to another embodiment of the disclosure.
As shown in Fig. 9, the data warehouse information processing device 900 includes the first acquisition module 810, the determining module 820, the generating module 830, and a second acquisition module 910. The first acquisition module 810, the determining module 820, and the generating module 830 are the same as or similar to the modules described above with reference to Fig. 8 and are not described again here.
The second acquisition module 910 can obtain, from multiple initial historical query statements, the query statements that meet the first preset condition as the at least one historical query statement.
According to embodiments of the present disclosure, obtaining, from the multiple initial historical query statements, the query statements that meet the first preset condition as the at least one historical query statement includes: clustering the multiple initial historical query statements to obtain at least one query statement group, where the similarity between the historical query statements in each query statement group meets the first preset threshold; and determining, from the at least one query statement group, a query statement group that meets the first preset condition as a target query statement group, the target query statement group including the at least one historical query statement.
According to embodiments of the present disclosure, clustering the multiple initial historical query statements to obtain at least one query statement group includes: processing the multiple initial historical query statements to obtain the vectors corresponding to the multiple initial historical query statements; and clustering those vectors to obtain at least one query statement group, the at least one query statement group including the vectors of the corresponding query statements.
According to embodiments of the present disclosure, the second acquisition module 910 may, for example, perform operation S410 described above with reference to Fig. 4, which is not described again here.
Figure 10 diagrammatically illustrates the block diagram of the data warehouse information processing unit according to disclosure another embodiment.
As shown in Fig. 10, the data warehouse information processing device 1000 includes the first acquisition module 810, the determining module 820, the generating module 830, the second acquisition module 910, and a storage module 1010. The first acquisition module 810, the determining module 820, and the generating module 830 are the same as or similar to the modules described above with reference to Fig. 8 and are not described again here. The second acquisition module 910 is the same as or similar to the module described above with reference to Fig. 9 and is not described again here.
The storage module 1010 can store the target table if the target table meets the second preset condition.
According to embodiments of the present disclosure, storing the target table if the target table meets the second preset condition includes: obtaining the similarity between the target table and other history tables in the data warehouse, and storing the target table if the similarity meets the third preset threshold.
According to embodiments of the present disclosure, the storage module 1010 may, for example, perform operation S510 described above with reference to Fig. 5, which is not described again here.
Any number of the modules, submodules, units, and subunits according to embodiments of the present disclosure, or at least part of the functions of any number of them, may be implemented in one module. Any one or more of the modules, submodules, units, and subunits according to embodiments of the present disclosure may be split into multiple modules for implementation. Any one or more of the modules, submodules, units, and subunits according to embodiments of the present disclosure may be implemented at least partly as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on a substrate, a system in a package, or an application-specific integrated circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or may be implemented in any one of the three implementation forms of software, hardware, and firmware or in an appropriate combination of any of them. Alternatively, one or more of the modules, submodules, units, and subunits according to embodiments of the present disclosure may be implemented at least partly as a computer program module which, when run, can perform the corresponding function.
For example, any number of the first acquisition module 810, the determining module 820, the generating module 830, the second acquisition module 910, and the storage module 1010 may be combined and implemented in one module, or any one of them may be split into multiple modules. Alternatively, at least part of the functions of one or more of these modules may be combined with at least part of the functions of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the first acquisition module 810, the determining module 820, the generating module 830, the second acquisition module 910, and the storage module 1010 may be implemented at least partly as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on a substrate, a system in a package, or an application-specific integrated circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or may be implemented in any one of the three implementation forms of software, hardware, and firmware or in an appropriate combination of any of them. Alternatively, at least one of the first acquisition module 810, the determining module 820, the generating module 830, the second acquisition module 910, and the storage module 1010 may be implemented at least partly as a computer program module which, when run, can perform the corresponding function.
Fig. 11 schematically shows a block diagram of a computer system suitable for data warehouse information processing according to an embodiment of the present disclosure. The computer system shown in Fig. 11 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in Fig. 11, the computer system 1100 according to an embodiment of the present disclosure includes a processor 1101, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 1102 or a program loaded from a storage section 1108 into a random access memory (RAM) 1103. The processor 1101 may include, for example, a general-purpose microprocessor (for example, a CPU), an instruction set processor and/or a related chipset, and/or a special-purpose microprocessor (for example, an application-specific integrated circuit (ASIC)). The processor 1101 may also include on-board memory for caching purposes. The processor 1101 may include a single processing unit or multiple processing units for performing the different actions of the method flow according to embodiments of the present disclosure.
The RAM 1103 stores the various programs and data required for the operation of the system 1100. The processor 1101, the ROM 1102, and the RAM 1103 are connected to one another through a bus 1104. The processor 1101 performs the various operations of the method flow according to embodiments of the present disclosure by executing programs in the ROM 1102 and/or the RAM 1103. Note that the programs may also be stored in one or more memories other than the ROM 1102 and the RAM 1103. The processor 1101 may also perform the various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to embodiments of the present disclosure, the system 1100 may also include an input/output (I/O) interface 1105, which is also connected to the bus 1104. The system 1100 may also include one or more of the following components connected to the I/O interface 1105: an input section 1106 including a keyboard, a mouse, and the like; an output section 1107 including, for example, a cathode-ray tube (CRT), a liquid crystal display (LCD), and a speaker; a storage section 1108 including a hard disk and the like; and a communication section 1109 including a network interface card such as a LAN card or a modem. The communication section 1109 performs communication processing via a network such as the Internet. A drive 1110 is also connected to the I/O interface 1105 as needed. A removable medium 1111, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 1110 as needed, so that a computer program read from it can be installed into the storage section 1108 as needed.
According to embodiments of the present disclosure, the method flow according to embodiments of the present disclosure may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product that includes a computer program carried on a computer-readable storage medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1109 and/or installed from the removable medium 1111. When executed by the processor 1101, the computer program performs the above functions defined in the system of the embodiments of the present disclosure. According to embodiments of the present disclosure, the systems, devices, apparatuses, modules, units, and the like described above may be implemented by computer program modules.
The present disclosure also provides a computer-readable storage medium. The computer-readable storage medium may be included in the equipment/device/system described in the above embodiments, or it may exist separately without being assembled into that equipment/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but without limitation: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device.
For example, in accordance with an embodiment of the present disclosure, computer readable storage medium may include above-described ROM 1102 And/or one or more memories other than RAM 1103 and/or ROM 1102 and RAM 1103.
The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in a block diagram or flowchart, and any combination of boxes in a block diagram or flowchart, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
It will be understood by those skilled in the art that the feature recorded in each embodiment and/or claim of the disclosure can To carry out multiple combinations or/or combination, even if such combination or combination are not expressly recited in the disclosure.Particularly, exist In the case where not departing from disclosure spirit or teaching, the feature recorded in each embodiment and/or claim of the disclosure can To carry out multiple combinations and/or combination.All these combinations and/or combination each fall within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the respective embodiments cannot be advantageously used in combination. The scope of the present disclosure is defined by the appended claims and their equivalents. Those skilled in the art can make various substitutions and modifications without departing from the scope of the present disclosure, and all such substitutions and modifications shall fall within the scope of the present disclosure.

Claims (20)

  1. A data warehouse information processing method, wherein the data warehouse includes a plurality of history tables stored in association, the method comprising:
    obtaining at least one historical query statement, the historical query statement being used to query related data of multiple history tables among the history tables stored in association;
    determining the multiple history tables corresponding to the at least one historical query statement;
    generating a target table based on a specific history table among the multiple history tables, the target table including the related data in the specific history table.
  2. The method according to claim 1, wherein said obtaining at least one historical query statement comprises:
    obtaining historical operation data of the data warehouse generated when the related data of multiple history tables among the history tables stored in association was queried by means of the historical query statement;
    determining the at least one historical query statement based on the historical operation data.
  3. The method according to claim 1, wherein said determining the multiple history tables corresponding to the at least one historical query statement comprises:
    parsing the at least one historical query statement to obtain association information of the at least one historical query statement, the association information including association fields and association conditions;
    determining, based on the association information, the multiple history tables corresponding to the at least one historical query statement.
  4. The method according to claim 1, further comprising:
    obtaining, from a plurality of initial historical query statements, query statements that satisfy a first preset condition as the at least one historical query statement.
  5. The method according to claim 4, wherein said obtaining, from a plurality of initial historical query statements, query statements that satisfy a first preset condition as the at least one historical query statement comprises:
    clustering the plurality of initial historical query statements to obtain at least one query statement group, wherein the similarity between the historical query statements in each query statement group satisfies a first preset threshold;
    determining, from the at least one query statement group, a query statement group that satisfies the first preset condition as a target query statement group, the target query statement group including the at least one historical query statement.
  6. The method according to claim 5, wherein said clustering the plurality of initial historical query statements to obtain at least one query statement group comprises:
    processing the plurality of initial historical query statements to obtain vectors corresponding to the plurality of initial historical query statements;
    clustering the vectors corresponding to the plurality of initial historical query statements to obtain the at least one query statement group, the at least one query statement group including the vectors corresponding to the respective query statements.
  7. The method according to claim 1, wherein the specific history table is a history table, among the multiple history tables, that satisfies a second preset threshold.
  8. The method according to claim 1, further comprising:
    storing the target table when the target table satisfies a second preset condition.
  9. The method according to claim 8, wherein said storing the target table when the target table satisfies a second preset condition comprises:
    obtaining a similarity between the target table and other history tables in the data warehouse;
    storing the target table when the similarity satisfies a third preset threshold.
  10. A data warehouse information processing device, wherein the data warehouse includes a plurality of history tables stored in association, the device comprising:
    a first obtaining module that obtains at least one historical query statement, the historical query statement being used to query related data of multiple history tables among the history tables stored in association;
    a determining module that determines the multiple history tables corresponding to the at least one historical query statement;
    a generating module that generates a target table based on a specific history table among the multiple history tables, the target table including the related data in the specific history table.
  11. The device according to claim 10, wherein said obtaining at least one historical query statement comprises:
    obtaining historical operation data of the data warehouse generated when the related data of multiple history tables among the history tables stored in association was queried by means of the historical query statement;
    determining the at least one historical query statement based on the historical operation data.
  12. The device according to claim 10, wherein said determining the multiple history tables corresponding to the at least one historical query statement comprises:
    parsing the at least one historical query statement to obtain association information of the at least one historical query statement, the association information including association fields and association conditions;
    determining, based on the association information, the multiple history tables corresponding to the at least one historical query statement.
  13. The device according to claim 10, further comprising:
    a second obtaining module that obtains, from a plurality of initial historical query statements, query statements that satisfy a first preset condition as the at least one historical query statement.
  14. The device according to claim 13, wherein said obtaining, from a plurality of initial historical query statements, query statements that satisfy a first preset condition as the at least one historical query statement comprises:
    clustering the plurality of initial historical query statements to obtain at least one query statement group, wherein the similarity between the historical query statements in each query statement group satisfies a first preset threshold;
    determining, from the at least one query statement group, a query statement group that satisfies the first preset condition as a target query statement group, the target query statement group including the at least one historical query statement.
  15. The device according to claim 14, wherein said clustering the plurality of initial historical query statements to obtain at least one query statement group comprises:
    processing the plurality of initial historical query statements to obtain vectors corresponding to the plurality of initial historical query statements;
    clustering the vectors corresponding to the plurality of initial historical query statements to obtain the at least one query statement group, the at least one query statement group including the vectors corresponding to the respective query statements.
  16. The device according to claim 10, wherein the specific history table is a history table, among the multiple history tables, that satisfies a second preset threshold.
  17. The device according to claim 10, further comprising:
    a storage module that stores the target table when the target table satisfies a second preset condition.
  18. The device according to claim 17, wherein said storing the target table when the target table satisfies a second preset condition comprises:
    obtaining a similarity between the target table and other history tables in the data warehouse;
    storing the target table when the similarity satisfies a third preset threshold.
  19. A data warehouse information processing system, comprising:
    one or more processors;
    a storage device for storing one or more programs,
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to execute the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium storing executable instructions which, when executed by a processor, cause the processor to execute the method according to any one of claims 1 to 9.
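
For illustration only, the flow recited in claims 1, 3 and 5 to 7 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the regular expressions, the function names (`parse_association`, `cluster_queries`, `pick_tables`), the use of a table-name set in place of a query vector, the Jaccard similarity, the greedy grouping, and both threshold values are all hypothetical choices made here. The sample queries are invented as well.

```python
import re
from collections import Counter

# Assumed, simplified SQL shapes: "FROM <table>" and
# "JOIN <table> ON <a.col> = <b.col>" without aliases or subqueries.
FROM_RE = re.compile(r"\bFROM\s+(\w+)", re.IGNORECASE)
JOIN_RE = re.compile(
    r"\bJOIN\s+(\w+)\s+ON\s+(\w+\.\w+)\s*=\s*(\w+\.\w+)", re.IGNORECASE
)


def parse_association(sql):
    """Return (tables, join_conditions) for one historical query statement."""
    tables = set(FROM_RE.findall(sql))
    conditions = []
    for table, left, right in JOIN_RE.findall(sql):
        tables.add(table)
        conditions.append((left, right))  # association fields of one join
    return frozenset(tables), conditions


def jaccard(a, b):
    """Set similarity, standing in for a vector similarity measure."""
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_queries(queries, threshold=0.5):
    """Greedy grouping: a query joins the first group whose first member
    is similar enough; otherwise it starts a new group."""
    groups = []  # each group is a list of (sql, tables) pairs
    for sql in queries:
        tables, _ = parse_association(sql)
        for group in groups:
            if jaccard(tables, group[0][1]) >= threshold:
                group.append((sql, tables))
                break
        else:
            groups.append([(sql, tables)])
    return groups


def pick_tables(group, min_share=0.6):
    """Tables used by at least min_share of the queries in the group."""
    counts = Counter(t for _, tables in group for t in tables)
    return sorted(t for t, c in counts.items() if c / len(group) >= min_share)


# Invented sample history, used only to exercise the sketch.
history = [
    "SELECT * FROM users JOIN orders ON users.id = orders.user_id",
    "SELECT users.name FROM users JOIN orders ON users.id = orders.user_id "
    "JOIN items ON orders.item_id = items.id",
    "SELECT * FROM logs",
]
groups = cluster_queries(history)
target_group = max(groups, key=len)  # assumed "first preset condition"
wide_table_sources = pick_tables(target_group)
print(wide_table_sources)
```

In this sketch the largest group is treated as the target query statement group, and the tables that most of its queries touch are treated as the specific history tables from which a target table would be built. A real system would need a proper SQL parser and a clustering method that checks similarity between all members of a group, which this greedy version does not do.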
CN201811111998.7A 2018-09-21 2018-09-21 Data warehouse information processing method, device, system and medium Active CN109388637B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811111998.7A CN109388637B (en) 2018-09-21 2018-09-21 Data warehouse information processing method, device, system and medium

Publications (2)

Publication Number Publication Date
CN109388637A true CN109388637A (en) 2019-02-26
CN109388637B CN109388637B (en) 2020-09-01

Family

ID=65417630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811111998.7A Active CN109388637B (en) 2018-09-21 2018-09-21 Data warehouse information processing method, device, system and medium

Country Status (1)

Country Link
CN (1) CN109388637B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102479223A (en) * 2010-11-25 2012-05-30 ***通信集团浙江有限公司 Data query method and system
CN102542009A (en) * 2011-12-14 2012-07-04 中兴通讯股份有限公司 Data querying method and device
CN103064853A (en) * 2011-10-20 2013-04-24 北京百度网讯科技有限公司 Search suggestion generation method, device and system
CN106951552A (en) * 2017-03-27 2017-07-14 重庆邮电大学 A kind of user behavior data processing method based on Hadoop

Also Published As

Publication number Publication date
CN109388637B (en) 2020-09-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 221, 2nd floor, Block C, 18 Kechuang 11th Street, Beijing Daxing District, Beijing

Applicant after: Jingdong Digital Technology Holding Co., Ltd.

Address before: Room 221, 2nd floor, Block C, 18 Kechuang 11th Street, Daxing Economic and Technological Development Zone, Beijing, 100176

Applicant before: Beijing Jingdong Financial Technology Holding Co., Ltd.

GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: Room 221, 2/F, Block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176

Patentee after: Jingdong Technology Holding Co., Ltd.

Address before: Room 221, 2/F, Block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176

Patentee before: JINGDONG DIGITAL TECHNOLOGY HOLDINGS Co., Ltd.