CN110297818A - Method and device for constructing a data warehouse - Google Patents

Method and device for constructing a data warehouse

Info

Publication number
CN110297818A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910563806.4A
Other languages
Chinese (zh)
Other versions
CN110297818B (en)
Inventor
王超群
林必红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dt Dream Technology Co Ltd
Original Assignee
Hangzhou Dt Dream Technology Co Ltd
Application filed by Hangzhou Dt Dream Technology Co Ltd
Priority to CN201910563806.4A
Publication of CN110297818A
Application granted
Publication of CN110297818B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a method for constructing a data warehouse that includes one or more theme libraries. The method comprises: setting a theme priority configuration table, which configures the priority of each specified theme attribute in each specified data source; determining, according to the priority of each specified theme attribute in each specified data source, the theme data corresponding to each specified theme attribute and the source of that theme data, to obtain a theme traceability table; and generating, from the theme traceability table, a theme table that characterizes the theme library. The disclosure thereby supports vertical expansion, horizontal expansion, and source tracing of the data warehouse, and improves the reliability of data warehouse construction.

Description

Method and device for constructing a data warehouse
Technical field
The present disclosure relates to the field of computer communication technology, and in particular to a method and device for constructing a data warehouse.
Background
A data warehouse (Data Warehouse, abbreviated DW or DWH) is a strategic collection that provides data support of all types for decision-making processes at all levels of an enterprise.
In the related art, a data warehouse is subject-oriented: the data in it is organized by theme domain. A theme here is a key aspect that users care about when they use the data warehouse to make decisions. For example, in the data warehouse of a public security system, the data can be divided into several major themes such as people, events, objects, and cases. Building the theme library for people can be divided into the following steps: (1) establish a wide table for people, based on the characteristics of people and on public security domain knowledge; (2) extract the fields required by the wide table from existing business data; (3) when multiple business tables contain the same field, select the data from the table with the highest credibility.
However, building such a subject-oriented data warehouse requires comparing the data in multiple tables and distinguishing every field in each data table. When multiple tables contain fields with the same meaning, it is necessary not only to rank the tables by priority but also to check the validity of each record. The implementation is very complex, and it does not lend itself to vertical expansion, horizontal expansion, or source tracing of the data warehouse.
Summary of the invention
To overcome the problems in the related art, the present disclosure provides a method and device for constructing a data warehouse.
According to a first aspect of the embodiments of the present disclosure, a method for constructing a data warehouse is provided, the data warehouse including one or more theme libraries, the method comprising:
setting a theme priority configuration table, the theme priority configuration table being used to configure the priority of each specified theme attribute in each specified data source;
determining, according to the priority of each specified theme attribute in each specified data source, the theme data corresponding to each specified theme attribute and the source of that theme data, to obtain a theme traceability table;
generating, from the theme traceability table, a theme table that characterizes the theme library.
According to a second aspect of the embodiments of the present disclosure, a device for constructing a data warehouse is provided, the data warehouse including one or more theme libraries, the device comprising:
a setup module, configured to set a theme priority configuration table, the theme priority configuration table being used to configure the priority of each specified theme attribute in each specified data source;
a determination module, configured to determine, according to the priority of each specified theme attribute in each specified data source, the theme data corresponding to each specified theme attribute and the source of that theme data, to obtain a theme traceability table;
a generation module, configured to generate, from the theme traceability table, a theme table that characterizes the theme library.
According to a third aspect of the embodiments of the present disclosure, a non-transitory computer-readable storage medium is provided, on which a computer program is stored, wherein the program, when executed by a processor, implements the method for constructing a data warehouse provided in the first aspect.
According to a fourth aspect of the embodiments of the present disclosure, a device for constructing a data warehouse is provided, the data warehouse including one or more theme libraries, the device comprising:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
set a theme priority configuration table, the theme priority configuration table being used to configure the priority of each specified theme attribute in each specified data source;
determine, according to the priority of each specified theme attribute in each specified data source, the theme data corresponding to each specified theme attribute and the source of that theme data, to obtain a theme traceability table;
generate, from the theme traceability table, a theme table that characterizes the theme library.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects:
The present disclosure sets a theme priority configuration table, which configures the priority of each specified theme attribute in each specified data source; determines, according to those priorities, the theme data corresponding to each specified theme attribute and the source of that theme data, to obtain a theme traceability table; and generates, from the theme traceability table, a theme table that characterizes the theme library. This facilitates vertical expansion, horizontal expansion, and source tracing of the data warehouse and improves the reliability of data warehouse construction.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the present disclosure.
Brief description of the drawings
Fig. 1 is a flowchart of a method for constructing a data warehouse according to an exemplary embodiment;
Fig. 2 is a flowchart of another method for constructing a data warehouse according to an exemplary embodiment;
Fig. 3 is a flowchart of another method for constructing a data warehouse according to an exemplary embodiment;
Fig. 4 is a block diagram of a device for constructing a data warehouse according to an exemplary embodiment;
Fig. 5 is a block diagram of another device for constructing a data warehouse according to an exemplary embodiment;
Fig. 6 is a block diagram of another device for constructing a data warehouse according to an exemplary embodiment;
Fig. 7 is a structural schematic diagram of a device for constructing a data warehouse according to an exemplary embodiment.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of devices and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The terms used in the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the present disclosure. The singular forms "a", "said", and "the" used in the present disclosure and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in the present disclosure to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein can be interpreted as "when", "upon", or "in response to determining".
Fig. 1 is a flowchart of a method for constructing a data warehouse according to an exemplary embodiment of the present disclosure. The data warehouse may include one or more theme libraries, and each theme library may be oriented to a different theme, for example people, places, events, objects, and cases. As shown in Fig. 1, the method for constructing a data warehouse may include the following steps 110-130:
In step 110, a theme priority configuration table is set. The theme priority configuration table is used to configure the priority of each specified theme attribute in each specified data source.
In the embodiments of the present disclosure, the theme table is not generated directly from the specified data sources when the data warehouse is constructed. Instead, to facilitate vertical expansion and source tracing, intermediate tables are set up first, such as the theme priority configuration table and the theme traceability table, and the theme table is then derived from these intermediate tables.
In one embodiment, the specified theme attributes in step 110 may be theme attributes, extracted from the specified data sources, that describe the theme library; the specified data sources may be the data sources designated for constructing the theme library.
In one embodiment, the theme priority configuration table in step 110 may include a first class of fields describing the specified data sources, a second class of fields describing the specified theme attributes, and a third class of fields describing the priority of each specified theme attribute in each specified data source.
In one embodiment, the theme priority configuration table in step 110 further includes reserved fields, which are fields reserved for data sources and/or theme attributes that may be added in a later expansion.
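As a rough illustration of the three field classes and the reserved fields, one row of such a configuration table could be modeled as below. This is a minimal sketch, not the patent's actual table layout: the class name `PriorityConfigRow`, its attribute names, the `reserved` flag, and the placeholder row are all assumptions made for illustration. The priorities 90 and 95 are the values the description later gives for the permanent address in its two example data sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriorityConfigRow:
    """One row of a hypothetical theme priority configuration table."""
    data_source: str         # first class of fields: the specified data source
    theme_attribute: str     # second class of fields: the specified theme attribute
    priority: int            # third class of fields: larger number = higher priority
    reserved: bool = False   # assumed flag marking a placeholder kept for later expansion


rows = [
    PriorityConfigRow("table_1", "permanent_address", 90),
    PriorityConfigRow("table_2", "permanent_address", 95),
    # Hypothetical reserved row: no real source or attribute is bound to it yet.
    PriorityConfigRow("reserved_source_1", "reserved_attribute_1", 0, reserved=True),
]

# Only rows that are not reserved take part in priority resolution.
active_rows = [row for row in rows if not row.reserved]
preferred = max(active_rows, key=lambda row: row.priority)
```

Keeping reserved rows in the same structure means a later expansion only changes data, not the shape of the table.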
For example, taking the personnel theme as an example, the specified data sources include the case table shown in Table 1 below and the permanent resident population table shown in Table 2.
Table 1
Suspect personnel ID | Permanent address 1 | Native place 1 | Height 1 | Weight 1
1000001              | A2                  | Zhejiang       | 1.70     | 68
1000002              | B2                  | Guangdong      | 1.72     | 60
1000003              | C2                  | Henan          | 1.80     | 77
1000004              | D2                  | Shanghai       | 1.55     | 44
Table 2
Personnel ID | Permanent address 2 | Native place 2
1000001      | A3                  | Zhejiang
1000002      | (empty)             | Hubei
1000003      | C3                  | Hunan
1000005      | (empty)             | Yunnan
The specified theme attributes are: permanent address, native place, height, and weight. The theme priority configuration table that is set is shown in Table 3.
Table 3
The personnel ID (identity number) in Table 3 is the key field: data from multiple data sources can be integrated only when this field has the same value in the source data tables. Every field in Table 3 other than the key field has a corresponding priority. For example, permanent address 1 from Table 1 has priority 90, and permanent address 2 from Table 2 has priority 95. A larger priority number indicates a higher priority.
In Table 3, the minimum gap between the configured priorities is 5. This reserves room in advance, so that a new priority can easily be inserted when a data source of similar priority is added later. The gap may also be larger than 5, for example 10 or 100.
If the set of specified data sources expands, it is only necessary to add the new specified data source to Table 3, fill in its key field, and fill in the corresponding value fields and priorities. Expanding the data sources is therefore easy in the present disclosure: only the configuration table has to be modified, which facilitates horizontal expansion of the theme table.
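The effect of leaving gaps between priorities can be sketched as follows. This is an illustrative sketch under stated assumptions: the dictionary layout, the helper `ranked_sources`, the source name `table_x`, and the priority 92 are hypothetical; only the priorities 90 and 95 come from the description.

```python
# Hypothetical in-memory form of the configuration:
# (data source, theme attribute) -> priority, larger number = higher priority.
config = {
    ("table_1", "permanent_address"): 90,
    ("table_2", "permanent_address"): 95,
}


def ranked_sources(config, attribute):
    """Return the data sources configured for one attribute, highest priority first."""
    candidates = [
        (priority, source)
        for (source, attr), priority in config.items()
        if attr == attribute
    ]
    return [source for priority, source in sorted(candidates, reverse=True)]


before = ranked_sources(config, "permanent_address")

# Horizontal expansion: a new (hypothetical) source is added with a priority
# that fits into the gap between 90 and 95. No existing entry is renumbered.
config[("table_x", "permanent_address")] = 92
after = ranked_sources(config, "permanent_address")
```

Because the existing priorities are untouched, adding the source is a pure configuration change.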
In step 120, according to the priority of each specified theme attribute in each specified data source, the theme data corresponding to each specified theme attribute and the source of that theme data are determined, to obtain the theme traceability table.
In the embodiments of the present disclosure, when the theme data corresponding to each specified theme attribute is determined, valid source data with the higher priority is generally chosen.
In one embodiment, the theme traceability table in step 120 may include a fourth field describing the theme data corresponding to each specified theme attribute and a fifth field describing the source of that theme data.
In step 130, a theme table that characterizes the theme library is generated from the theme traceability table.
In the embodiments of the present disclosure, the theme table need not include the source of the theme data corresponding to each specified theme attribute. Therefore, when the theme table that characterizes the theme library is generated from the theme traceability table, the source of the theme data corresponding to each specified theme attribute can simply be removed.
As can be seen from the above embodiment, a theme priority configuration table is set, which configures the priority of each specified theme attribute in each specified data source; the theme data corresponding to each specified theme attribute and the source of that theme data are determined according to those priorities, to obtain a theme traceability table; and a theme table that characterizes the theme library is generated from the theme traceability table. This facilitates vertical expansion, horizontal expansion, and source tracing of the data warehouse and improves the reliability of data warehouse construction.
Fig. 2 is a flowchart of another method for constructing a data warehouse according to an exemplary embodiment of the present disclosure. The method builds on the method shown in Fig. 1. As shown in Fig. 2, performing step 120 may include the following steps 210-230:
In step 210, for any specified theme attribute, the specified data source with the highest priority is selected according to the priority of that specified theme attribute in each specified data source.
In step 220, when the source data corresponding to the specified theme attribute in the highest-priority specified data source is valid data, that source data is determined to be the theme data corresponding to the specified theme attribute, and the highest-priority specified data source is determined to be the source of the theme data, to obtain the theme traceability table.
In step 230, when the source data corresponding to the specified theme attribute in the highest-priority specified data source is invalid data, the specified data source with the next-highest priority is selected according to the priority of the specified theme attribute in each specified data source, and so on until the source data found for the specified theme attribute is valid data; the corresponding theme traceability table is then determined.
In one embodiment, the theme traceability table in steps 220 and 230 may include a fourth field describing the theme data corresponding to each specified theme attribute and a fifth field describing the source of that theme data.
For example, taking the personnel theme as an example, the specified data sources include the case table shown in Table 1 above and the permanent resident population table shown in Table 2. The specified theme attributes are: permanent address, native place, height, and weight. The theme priority configuration table that is set is shown in Table 3 above, and the resulting theme traceability table is shown in Table 4.
Table 4
Table 4 is obtained as follows:
(1) According to the specified data sources and the key field in Table 3, take the records that share the same personnel ID from the specified data sources, for example the person whose personnel ID is 1000001 in Table 1 and Table 2;
(2) In Table 3, the priority of the field permanent address 1 from Table 1 is 90, and the priority of the field permanent address 2 from Table 2 is 95;
(3) Take the permanent addresses of personnel ID 1000001 from Table 1 and Table 2, compare their priorities, and write the valid value with the higher priority into the corresponding permanent address field of Table 4 (1000001.permanent address = A3); at the same time, write "Table 2" into the permanent address source field of Table 4, indicating that this field comes from Table 2;
(4) Similarly, for the permanent address of personnel ID 1000002, the value in Table 2 is empty and therefore invalid, so the value in Table 1 (which is valid) is written into the corresponding permanent address field of Table 4 (1000002.permanent address = B2); at the same time, write "Table 1" into the permanent address source field of Table 4, indicating that this field comes from Table 1;
(5) Proceed in the same way until all the data has been written.
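The worked example in steps (1) to (5) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the patent's implementation: the function names, the dictionary layout, and the rule that a missing or blank value counts as invalid are assumptions (the description only states that an empty value is invalid). Only the permanent address is resolved, because 90 and 95 are the only priorities the text gives; the other columns of Table 1 and Table 2 are omitted.

```python
# Source tables keyed by personnel ID (permanent address column only).
table_1 = {"1000001": "A2", "1000002": "B2", "1000003": "C2", "1000004": "D2"}
table_2 = {"1000001": "A3", "1000002": "", "1000003": "C3", "1000005": ""}

sources = {"Table 1": table_1, "Table 2": table_2}
priority = {"Table 1": 90, "Table 2": 95}  # larger number = higher priority


def is_valid(value):
    """Assumed validity rule: a missing or blank value is invalid."""
    return value is not None and str(value).strip() != ""


def resolve(person_id):
    """Steps 210-230: try sources from highest to lowest priority."""
    for name in sorted(sources, key=priority.get, reverse=True):
        value = sources[name].get(person_id)
        if is_valid(value):
            return value, name
    return None, None  # no source holds a valid value


def build_traceability_table():
    """Join all sources on the key field and record value plus provenance."""
    person_ids = sorted(set(table_1) | set(table_2))
    table = {}
    for person_id in person_ids:
        value, origin = resolve(person_id)
        table[person_id] = {
            "permanent_address": value,
            "permanent_address_source": origin,
        }
    return table


trace = build_traceability_table()
```

With the data above, personnel ID 1000001 resolves to A3 from Table 2 and personnel ID 1000002 falls back to B2 from Table 1, matching steps (3) and (4). Personnel ID 1000005 has no valid permanent address in either source, so both fields stay empty.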
As can be seen from the above embodiment, for any specified theme attribute, the specified data source with the highest priority can be selected according to the priority of that attribute in each specified data source; the valid source data corresponding to the attribute in that data source is determined to be the theme data corresponding to the attribute; and that data source is determined to be the source of the theme data, to obtain the theme traceability table. This improves the efficiency of generating the theme traceability table and the practicality of the generated table.
Fig. 3 is a flowchart of another method for constructing a data warehouse according to an exemplary embodiment of the present disclosure. The method builds on the method shown in Fig. 1. As shown in Fig. 3, performing step 130 may include the following steps 310-330:
In step 310, the source of the theme data included in the theme traceability table is deleted, to obtain a temporary theme table.
In step 320, a mapping table from the temporary theme table to the theme table is set. The mapping table includes the mapping relations between the fields of the temporary theme table and the fields of the theme table.
In one embodiment, the first field data included in the temporary theme table fields in step 320 is each specified theme attribute included in the temporary theme table; the second field data included in the theme table fields is each specified theme attribute included in the theme table; and the mapping relations include a first mapping relation between each specified theme attribute included in the temporary theme table and each specified theme attribute included in the theme table;
The third field data included in the temporary theme table fields is each reserved theme attribute included in the temporary theme table; the fourth field data included in the theme table fields is each reserved theme attribute included in the theme table; and the mapping relations include a second mapping relation between each reserved theme attribute included in the temporary theme table and each reserved theme attribute included in the theme table.
In step 330, the theme table corresponding to the temporary theme table is determined according to the mapping table.
For example, taking the personnel theme as an example, the specified data sources include the case table shown in Table 1 above and the permanent resident population table shown in Table 2. The specified theme attributes are: permanent address, native place, height, and weight. The theme priority configuration table that is set is shown in Table 3 above, and the resulting theme traceability table is shown in Table 4 above. In addition, the temporary theme table is shown in Table 5 below, the mapping table from the temporary theme table to the theme table is shown in Table 6, and the personnel theme table is shown in Table 7.
Table 5
Table 6
Table 7
Table 4 and Table 5 above contain reserved fields, and Table 6 records the mapping between the reserved fields and the actual fields. Table 7 can be obtained from Table 5 through the mapping in Table 6. In addition, horizontal expansion is achieved by modifying the mapping table and enabling the configuration table; that is, the corresponding reserved field is kept in Table 7 and enabled in Table 5.
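Steps 310 to 330 can be illustrated with a single row. This is a sketch under stated assumptions: the `_source` suffix convention, the field names `reserved_1` and `blood_type`, the value "O", and both helper functions are hypothetical, since the contents of Tables 5 to 7 are not reproduced in this text.

```python
def drop_provenance(trace_row, suffix="_source"):
    """Step 310: remove provenance columns to get a temporary theme row."""
    return {k: v for k, v in trace_row.items() if not k.endswith(suffix)}


def apply_mapping(temp_row, mapping):
    """Steps 320-330: rename temporary-table fields to theme-table fields.

    Fields without a mapping entry (for example, reserved fields that are
    not enabled yet) are left out of the theme row.
    """
    return {mapping[k]: v for k, v in temp_row.items() if k in mapping}


# One row of a traceability table; "reserved_1" is a hypothetical reserved field.
trace_row = {
    "personnel_id": "1000001",
    "permanent_address": "A3",
    "permanent_address_source": "Table 2",
    "reserved_1": "O",
}

# Hypothetical mapping table: temporary-table field -> theme-table field.
mapping = {
    "personnel_id": "personnel_id",
    "permanent_address": "permanent_address",
}

temp_row = drop_provenance(trace_row)
theme_row = apply_mapping(temp_row, mapping)

# Horizontal expansion: enable the reserved field by adding one mapping entry.
expanded_mapping = {**mapping, "reserved_1": "blood_type"}
expanded_row = apply_mapping(temp_row, expanded_mapping)
```

Under these assumptions, enabling a reserved field requires only a new mapping entry; the code that produces the theme table stays the same.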
As can be seen from the above embodiment, the source of the theme data included in the theme traceability table is deleted to obtain a temporary theme table; a mapping table from the temporary theme table to the theme table is set, the mapping table including the mapping relations between the fields of the temporary theme table and the fields of the theme table; and the theme table corresponding to the temporary theme table is determined according to the mapping table. This improves the accuracy of the generated theme table.
Corresponding to the foregoing embodiments of the method for constructing a data warehouse, the present disclosure also provides embodiments of a device for constructing a data warehouse.
Fig. 4 is a block diagram of a device for constructing a data warehouse according to an exemplary embodiment of the present disclosure. The device is used to perform the method for constructing a data warehouse shown in Fig. 1. The data warehouse may include one or more theme libraries, and each theme library may be oriented to a different theme, for example people, events, objects, and cases. As shown in Fig. 4, the device for constructing a data warehouse may include:
a setup module 41, configured to set a theme priority configuration table, the theme priority configuration table being used to configure the priority of each specified theme attribute in each specified data source;
a determination module 42, configured to determine, according to the priority of each specified theme attribute in each specified data source, the theme data corresponding to each specified theme attribute and the source of that theme data, to obtain a theme traceability table;
a generation module 43, configured to generate, from the theme traceability table, a theme table that characterizes the theme library.
In one embodiment, on the basis of the device shown in Fig. 4, the specified theme attributes are theme attributes, extracted from the specified data sources, that describe the theme library; the specified data sources are the data sources designated for constructing the theme library.
In one embodiment, on the basis of the device described above, the theme priority configuration table includes a first class of fields describing the specified data sources, a second class of fields describing the specified theme attributes, and a third class of fields describing the priority of each specified theme attribute in each specified data source.
In one embodiment, on the basis of the device described above, the theme priority configuration table further includes reserved fields, which are fields reserved for data sources and/or theme attributes that may be added in a later expansion.
As can be seen from the above embodiment, a theme priority configuration table is set, which configures the priority of each specified theme attribute in each specified data source; the theme data corresponding to each specified theme attribute and the source of that theme data are determined according to those priorities, to obtain a theme traceability table; and a theme table that characterizes the theme library is generated from the theme traceability table. This facilitates vertical expansion, horizontal expansion, and source tracing of the data warehouse and improves the reliability of data warehouse construction.
In one embodiment, it establishes on the basis of device shown in Fig. 4, as shown in figure 5, the determining module 42 can wrap It includes:
Submodule 51 is chosen, is configured as any designated key attribute, according to the designated key attribute each Priority in a specified data source selects the corresponding specified data source of highest priority;
First determines submodule 52, is configured as working as in the corresponding specified data source of highest priority and this refers to Determine the corresponding source data of subject attribute be valid data when, then the corresponding source data of designated key attribute is determined as this and specified The corresponding subject data of subject attribute, and the corresponding specified data source of highest priority is determined as tracing back for the subject data Source obtains the theme and traces to the source table;
Second determines submodule 53, is configured as working as in the corresponding specified data source of highest priority and this refers to Determine the corresponding source data of subject attribute be invalid data when, then it is excellent in each specified data source according to the designated key attribute In first grade, time corresponding specified data source of high priority is selected, until inquiring the corresponding source number of the designated key attribute When according to for valid data, determine that the corresponding theme is traced to the source table.
In one embodiment, it establishes on the basis of Fig. 4 or Fig. 5 shown device, it includes being used in table that the theme, which is traced to the source, 4th field of the corresponding subject data of each designated key attribute is described and for describing tracing back for the subject data 5th field in source.
As can be seen from the above embodiment, for any designated theme attribute, the specified data source with the highest priority can be selected according to the priority of the designated theme attribute in each specified data source; the valid source data corresponding to the designated theme attribute in that data source is determined as the theme data corresponding to the designated theme attribute; and that data source is determined as the trace source of the theme data, yielding the theme traceability table. This improves the efficiency of generating the theme traceability table and the practicality of the generated table.
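The priority-based selection with fallback can be sketched as below. This is a hedged illustration only: the function names, the sample record, the rule that a value is valid when it is neither `None` nor an empty string, and the choice to emit `(None, None)` when no source holds valid data are assumptions; the text above does not define validity or the no-valid-data case.

```python
def is_valid(value):
    """Assumed validity rule: valid if neither None nor an empty string."""
    return value is not None and value != ""


def build_traceability_row(record_by_source, ordered_sources):
    """Pick, per attribute, the first valid value in priority order and record its source.

    record_by_source: {source: {attribute: value}} for one entity.
    ordered_sources: {attribute: [source, ...]}, highest priority first.
    Returns {attribute: (theme_data, trace_source)}.
    """
    row = {}
    for attribute, sources in ordered_sources.items():
        row[attribute] = (None, None)  # assumption: no valid data in any source
        for source in sources:
            value = record_by_source.get(source, {}).get(attribute)
            if is_valid(value):
                row[attribute] = (value, source)
                break  # stop at the highest-priority source with valid data
    return row


example_record = {
    "source_a": {"name": "Alice", "address": ""},
    "source_b": {"name": "A. Liddell", "address": "12 Oak Street"},
}
example_order = {"name": ["source_a", "source_b"], "address": ["source_a", "source_b"]}
example_row = build_traceability_row(example_record, example_order)
```

In this example the name comes from `source_a`, while the address falls back to `source_b` because the higher-priority value is empty; each value is stored together with the source it was taken from.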
In one embodiment, based on the device shown in Fig. 4 and as shown in Fig. 6, the generation module 43 may include:
a deletion submodule 61, configured to delete the trace source of the theme data included in the theme traceability table, to obtain a theme temporary table;
a setting submodule 62, configured to set a mapping table from the theme temporary table to the theme table, the mapping table including the mapping relations between the fields of the theme temporary table and the fields of the theme table;
a third determination submodule 63, configured to determine the theme table corresponding to the theme temporary table according to the mapping table.
In one embodiment, based on the device shown in Fig. 6, the first field data included in the fields of the theme temporary table is each designated theme attribute included in the theme temporary table; the second field data included in the fields of the theme table is each designated theme attribute included in the theme table; and the mapping relations include a first mapping relation between each designated theme attribute included in the theme temporary table and each designated theme attribute included in the theme table;
the third field data included in the fields of the theme temporary table is each reserved theme attribute included in the theme temporary table; the fourth field data included in the fields of the theme table is each reserved theme attribute included in the theme table; and the mapping relations include a second mapping relation between each reserved theme attribute included in the theme temporary table and each reserved theme attribute included in the theme table.
As can be seen from the above embodiment, the trace source of the theme data included in the theme traceability table is deleted to obtain a theme temporary table; a mapping table from the theme temporary table to the theme table is set, the mapping table including the mapping relations between the fields of the theme temporary table and the fields of the theme table; and the theme table corresponding to the theme temporary table is determined according to the mapping table, which improves the accuracy of the generated theme table.
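The two steps above (dropping the trace source, then mapping temporary-table fields to theme-table fields) can be sketched as follows. All names here are hypothetical: `MAPPING_TABLE`, the field names, and the `reserved_1` placeholder for a reserved theme attribute are assumptions chosen for illustration, and a plain dictionary stands in for whatever storage the mapping table actually uses.

```python
def to_temporary_row(traceability_row):
    """Drop the trace source from each entry, keeping only the theme data."""
    return {field: value for field, (value, _source) in traceability_row.items()}


def to_theme_row(temporary_row, mapping_table):
    """Rename temporary-table fields to theme-table fields using the mapping table."""
    return {
        mapping_table[field]: value
        for field, value in temporary_row.items()
        if field in mapping_table  # assumption: unmapped fields are skipped
    }


# Hypothetical mapping; "reserved_1" stands in for a reserved theme attribute.
MAPPING_TABLE = {"name": "person_name", "address": "home_address", "reserved_1": "ext_1"}

trace_row = {
    "name": ("Alice", "source_a"),
    "address": ("12 Oak Street", "source_b"),
    "reserved_1": (None, None),
}
temp_row = to_temporary_row(trace_row)
theme_row = to_theme_row(temp_row, MAPPING_TABLE)
```

Mapping the reserved attribute alongside the designated ones mirrors the first and second mapping relations described in the embodiment: a later extension can start filling the reserved field without changing the mapping step.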
Since the device embodiments basically correspond to the method embodiments, reference may be made to the description of the method embodiments for the relevant details. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the disclosure. Those of ordinary skill in the art can understand and implement it without creative effort.
The disclosure also provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the program performs the method for constructing a data warehouse shown in any of Fig. 2 to Fig. 4.
The disclosure also provides a device for constructing a data warehouse, the data warehouse including one or more theme libraries, the device including:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
set a theme priority configuration table, the theme priority configuration table being used to configure the priority of each designated theme attribute in each specified data source;
determine, according to the priority of each designated theme attribute in each specified data source, the theme data corresponding to each designated theme attribute and the trace source of the theme data, to obtain a theme traceability table;
generate, according to the theme traceability table, a theme table for characterizing the theme library.
Fig. 7 is a schematic structural diagram of a device 700 for constructing a data warehouse according to an exemplary embodiment. Referring to Fig. 7, the device 700 includes a processing component 722, which further includes one or more processors, and memory resources represented by a memory 716 for storing instructions executable by the processing component 722, such as application programs. The application programs stored in the memory 716 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 722 is configured to execute the instructions to perform the method for constructing a data warehouse described in any of Fig. 2 to Fig. 4.
The device 700 may also include a power supply component 726 configured to perform power management of the device 700, a wired or wireless network interface 750 configured to connect the device 700 to a network, and an input/output (I/O) interface 758. The device 700 may operate based on an operating system stored in the memory 716, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
Other embodiments of the disclosure will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. The disclosure is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of the disclosure are indicated by the following claims.
It should be understood that the disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the disclosure is limited only by the appended claims.

Claims (18)

1. A method for constructing a data warehouse, characterized in that the data warehouse includes one or more theme libraries, and the method comprises:
setting a theme priority configuration table, the theme priority configuration table being used to configure the priority of each designated theme attribute in each specified data source;
determining, according to the priority of each designated theme attribute in each specified data source, the theme data corresponding to each designated theme attribute and the trace source of the theme data, to obtain a theme traceability table;
generating, according to the theme traceability table, a theme table for characterizing the theme library.
2. The method according to claim 1, characterized in that the designated theme attribute is a theme attribute, extracted from the specified data sources, for describing the theme library; and the specified data source is a data source designated for constructing the theme library.
3. The method according to claim 1 or 2, characterized in that the theme priority configuration table includes a first-type field for describing the specified data source, a second-type field for describing the designated theme attribute, and a third-type field for describing the priority of the designated theme attribute in each specified data source.
4. The method according to claim 3, characterized in that the theme priority configuration table further includes a reserved field, the reserved field being a field reserved for a data source and/or a theme attribute to be added in subsequent expansion.
5. The method according to claim 1, characterized in that determining, according to the priority of each designated theme attribute in each specified data source, the theme data corresponding to each designated theme attribute and the trace source of the theme data, to obtain the theme traceability table, comprises:
for any designated theme attribute, selecting the specified data source with the highest priority according to the priority of the designated theme attribute in each specified data source;
when the source data corresponding to the designated theme attribute in the highest-priority specified data source is valid data, determining that source data as the theme data corresponding to the designated theme attribute, and determining the highest-priority specified data source as the trace source of the theme data, to obtain the theme traceability table;
when the source data corresponding to the designated theme attribute in the highest-priority specified data source is invalid data, selecting the specified data source with the next-highest priority according to the priority of the designated theme attribute in each specified data source, until the source data found for the designated theme attribute is valid data, and then determining the corresponding theme traceability table.
6. The method according to claim 1 or 5, characterized in that the theme traceability table includes a fourth field for describing the theme data corresponding to each designated theme attribute and a fifth field for describing the trace source of the theme data.
7. The method according to claim 4, characterized in that generating, according to the theme traceability table, the theme table for characterizing the theme library comprises:
deleting the trace source of the theme data included in the theme traceability table, to obtain a theme temporary table;
setting a mapping table from the theme temporary table to the theme table, the mapping table including the mapping relations between the fields of the theme temporary table and the fields of the theme table;
determining the theme table corresponding to the theme temporary table according to the mapping table.
8. The method according to claim 7, characterized in that the first field data included in the fields of the theme temporary table is each designated theme attribute included in the theme temporary table; the second field data included in the fields of the theme table is each designated theme attribute included in the theme table; and the mapping relations include a first mapping relation between each designated theme attribute included in the theme temporary table and each designated theme attribute included in the theme table;
the third field data included in the fields of the theme temporary table is each reserved theme attribute included in the theme temporary table; the fourth field data included in the fields of the theme table is each reserved theme attribute included in the theme table; and the mapping relations include a second mapping relation between each reserved theme attribute included in the theme temporary table and each reserved theme attribute included in the theme table.
9. A device for constructing a data warehouse, characterized in that the data warehouse includes one or more theme libraries, and the device comprises:
a setting module, configured to set a theme priority configuration table, the theme priority configuration table being used to configure the priority of each designated theme attribute in each specified data source;
a determining module, configured to determine, according to the priority of each designated theme attribute in each specified data source, the theme data corresponding to each designated theme attribute and the trace source of the theme data, to obtain a theme traceability table;
a generation module, configured to generate, according to the theme traceability table, a theme table for characterizing the theme library.
10. The device according to claim 9, characterized in that the designated theme attribute is a theme attribute, extracted from the specified data sources, for describing the theme library; and the specified data source is a data source designated for constructing the theme library.
11. The device according to claim 9 or 10, characterized in that the theme priority configuration table includes a first-type field for describing the specified data source, a second-type field for describing the designated theme attribute, and a third-type field for describing the priority of the designated theme attribute in each specified data source.
12. The device according to claim 11, characterized in that the theme priority configuration table further includes a reserved field, the reserved field being a field reserved for a data source and/or a theme attribute to be added in subsequent expansion.
13. The device according to claim 9, characterized in that the determining module comprises:
a selection submodule, configured to, for any designated theme attribute, select the specified data source with the highest priority according to the priority of the designated theme attribute in each specified data source;
a first determination submodule, configured to, when the source data corresponding to the designated theme attribute in the highest-priority specified data source is valid data, determine that source data as the theme data corresponding to the designated theme attribute, and determine the highest-priority specified data source as the trace source of the theme data, to obtain the theme traceability table;
a second determination submodule, configured to, when the source data corresponding to the designated theme attribute in the highest-priority specified data source is invalid data, select the specified data source with the next-highest priority according to the priority of the designated theme attribute in each specified data source, until the source data found for the designated theme attribute is valid data, and then determine the corresponding theme traceability table.
14. The device according to claim 9 or 13, characterized in that the theme traceability table includes a fourth field for describing the theme data corresponding to each designated theme attribute and a fifth field for describing the trace source of the theme data.
15. The device according to claim 12, characterized in that the generation module comprises:
a deletion submodule, configured to delete the trace source of the theme data included in the theme traceability table, to obtain a theme temporary table;
a setting submodule, configured to set a mapping table from the theme temporary table to the theme table, the mapping table including the mapping relations between the fields of the theme temporary table and the fields of the theme table;
a third determination submodule, configured to determine the theme table corresponding to the theme temporary table according to the mapping table.
16. The device according to claim 15, characterized in that the first field data included in the fields of the theme temporary table is each designated theme attribute included in the theme temporary table; the second field data included in the fields of the theme table is each designated theme attribute included in the theme table; and the mapping relations include a first mapping relation between each designated theme attribute included in the theme temporary table and each designated theme attribute included in the theme table;
the third field data included in the fields of the theme temporary table is each reserved theme attribute included in the theme temporary table; the fourth field data included in the fields of the theme table is each reserved theme attribute included in the theme table; and the mapping relations include a second mapping relation between each reserved theme attribute included in the theme temporary table and each reserved theme attribute included in the theme table.
17. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
18. A device for constructing a data warehouse, characterized in that the data warehouse includes one or more theme libraries, and the device comprises:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
set a theme priority configuration table, the theme priority configuration table being used to configure the priority of each designated theme attribute in each specified data source;
determine, according to the priority of each designated theme attribute in each specified data source, the theme data corresponding to each designated theme attribute and the trace source of the theme data, to obtain a theme traceability table;
generate, according to the theme traceability table, a theme table for characterizing the theme library.
CN201910563806.4A 2019-06-26 2019-06-26 Method and device for constructing data warehouse Active CN110297818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910563806.4A CN110297818B (en) 2019-06-26 2019-06-26 Method and device for constructing data warehouse

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910563806.4A CN110297818B (en) 2019-06-26 2019-06-26 Method and device for constructing data warehouse

Publications (2)

Publication Number Publication Date
CN110297818A true CN110297818A (en) 2019-10-01
CN110297818B CN110297818B (en) 2022-03-01

Family

ID=68029128

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910563806.4A Active CN110297818B (en) 2019-06-26 2019-06-26 Method and device for constructing data warehouse

Country Status (1)

Country Link
CN (1) CN110297818B (en)

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040049492A1 (en) * 2002-09-09 2004-03-11 Lucent Technologies Inc. Distinct sampling system and a method of distinct sampling for a database
US20080282085A1 (en) * 2005-12-09 2008-11-13 Eurotech Spa Method to Search for Affinities Between Subjects and Relative Apparatus
CN1975772A (en) * 2006-12-22 2007-06-06 中国建设银行股份有限公司 Method and device for integrating information in multi-system
US20110004622A1 (en) * 2007-10-17 2011-01-06 Blazent, Inc. Method and apparatus for gathering and organizing information pertaining to an entity
CN105830053A (en) * 2014-01-16 2016-08-03 英特尔公司 An apparatus, method, and system for a fast configuration mechanism
CN103853820A (en) * 2014-02-20 2014-06-11 北京用友政务软件有限公司 Data processing method and data processing system
CN106294521A (en) * 2015-06-12 2017-01-04 交通银行股份有限公司 Date storage method and data warehouse
US20170116307A1 (en) * 2015-10-23 2017-04-27 Numerify, Inc. Automated Refinement and Validation of Data Warehouse Star Schemas
US20170116306A1 (en) * 2015-10-23 2017-04-27 Numerify, Inc. Automated Definition of Data Warehouse Star Schemas
CN106933907A (en) * 2015-12-31 2017-07-07 北京国双科技有限公司 The processing method and processing device of tables of data extended counter
CN107657049A (en) * 2017-09-30 2018-02-02 深圳市华傲数据技术有限公司 A kind of data processing method based on data warehouse
CN107704590A (en) * 2017-09-30 2018-02-16 深圳市华傲数据技术有限公司 A kind of data processing method and system based on data warehouse
CN108520008A (en) * 2018-03-15 2018-09-11 链家网(北京)科技有限公司 The construction method and construction device of data warehouse model
CN109033173A (en) * 2018-06-21 2018-12-18 深圳市彬讯科技有限公司 It is a kind of for generating the data processing method and device of multidimensional index data
CN109145164A (en) * 2018-08-28 2019-01-04 百度在线网络技术(北京)有限公司 Data processing method, device, equipment and medium
CN109522312A (en) * 2018-11-27 2019-03-26 北京锐安科技有限公司 A kind of data processing method, device, server and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHAVINKING: "什么是数据仓库主题", 《HTTPS://WWW.CNBLOGS.COM/WCWEN1990/P/7600251.HTML》 *
周世雄: "基于供应链的数据仓库***在服装行业的应用研究", 《中国优秀博硕士学位论文全文数据库(硕士) 信息科技辑》 *
张洪波: "基于供应链的数据仓库***研究", 《中国优秀硕博士学位论文全文数据库(硕士) 信息科技辑》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111143463A (en) * 2020-01-06 2020-05-12 中国工商银行股份有限公司 Method and device for constructing bank data warehouse based on topic model
CN111143463B (en) * 2020-01-06 2023-07-04 中国工商银行股份有限公司 Construction method and device of bank data warehouse based on topic model

Also Published As

Publication number Publication date
CN110297818B (en) 2022-03-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant