CN108549659B - Data warehouse management system and management method - Google Patents

Publication number
CN108549659B
Authority
CN
China
Prior art keywords
data
query
warehousing
file
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810201836.6A
Other languages
Chinese (zh)
Other versions
CN108549659A (en)
Inventor
郁建林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongcheng Taixin Suzhou Technology Development Co ltd
Original Assignee
Zhongcheng Taixin Suzhou Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongcheng Taixin Suzhou Technology Development Co ltd
Priority to CN201810201836.6A
Publication of CN108549659A
Application granted
Publication of CN108549659B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/604 Tools and structures for managing or administering access control systems
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Automation & Control Theory (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention relates to the technical field of data warehouses, and in particular to a data warehouse management system comprising a data warehousing module, a data storage module, a data browsing module, a data query positioning module and a data downloading module. The data warehousing module comprises an automatic scanning warehousing unit, which writes data into the data warehouse by automatic scanning. The automatic scanning and warehousing modes are: scanning and warehousing in the original tree folder mode, undifferentiated scanning and warehousing, file screening and warehousing, and specific-location data warehousing. The invention manages files by combining software and hardware, replaces the traditional manual mode of data management, and effectively improves the management efficiency of the data warehouse.

Description

Data warehouse management system and management method
Technical Field
The invention relates to the technical field of data warehouses, in particular to a data warehouse management system and a data warehouse management method.
Background
A data warehouse (abbreviated DW or DWH) is a structured data environment. It provides data support for data analysis, data reporting, data mining and other applications.
Data warehouse management is central to data warehouse operation. Existing methods for uploading and maintaining warehouse data generally work as follows: a data warehouse manager uploads data manually, at regular or irregular intervals, and analyzes the metadata of the data warehouse; the manager then compiles a list of failed data tables from the analysis result and passes it to the responsible technical staff, who confirm the failure of each table in the list and process it accordingly, for example by deleting the table. In other words, existing upload and maintenance methods remain at the manual management stage: the management workload is large and the management efficiency is low. As the volume and variety of big data grow across industries, conventional data management systems can no longer meet the requirements for storing and managing such data.
Therefore, it is desirable to provide a new data warehouse management system and method.
Disclosure of Invention
In view of the above problems in the prior art, an object of the present invention is to provide a data warehouse management system and a management method, so as to implement automatic warehouse entry and management of data.
In a first aspect, the present invention provides a data warehouse management system, which comprises a data warehousing module, a data storage module, a data browsing module, a data query positioning module, and a data downloading module, wherein,
the data warehousing module is used for writing data into a data warehouse; the data includes raster image data, vector data, and result-type document data. The data warehousing module performs warehousing and cataloging of raster image data (remote sensing satellite image data, intermediate processing data, general-format data such as GeoTiff, and the like), vector data, result documents and other data, so that the data can be conveniently browsed, queried and applied.
The data warehousing cataloging mainly refers to that a user catalogs data to be warehoused according to business needs in a certain mode, and the cataloging process is a tree building process. The data may be cataloged by year, region, category, etc.
The data warehousing module comprises an automatic scanning warehousing unit, and the automatic scanning warehousing unit is used for writing data into a data warehouse in an automatic scanning mode; the automatic scanning and warehousing mode comprises the following steps: scanning and warehousing according to an original tree folder mode, scanning and warehousing undifferentiated data, screening and warehousing files, and warehousing data at specific positions;
the data storage module is used for storing the data which are put in storage;
the data browsing module is used for browsing the data in the data warehouse;
the data query positioning module is used for performing query operation on data in the data warehouse and positioning the storage position of the data;
and the data downloading module is used for selectively downloading data from the queried list as required, based on the operation result of the data query positioning module.
Preferably, the system further comprises a data storage medium module and a communication controller module, wherein the communication controller module is used for scheduling and controlling the data storage medium module.
Preferably, the data storage medium module is a hard disk storage cube, the hard disk storage cube is formed by stacking a plurality of hard disk cabinets, and the communication controller module is used for scheduling and controlling each hard disk cabinet.
Preferably, the data warehousing module further comprises a manual warehousing unit, a data processing unit and a data storage medium monitoring unit;
the manual warehousing unit is used for writing data into the data warehouse by manual input;
the data processing unit is used for performing warehousing processing on the warehoused data;
the data storage medium monitoring unit is used for monitoring the state of the data storage medium module in real time.
Preferably, the warehousing processing includes: labeling, rule-set screening, context-aware file association, and file information retrieval.
Preferably, scanning and warehousing in the original tree folder mode means that files are scanned and warehoused following the tree structure of the folders; undifferentiated scanning and warehousing means that the folder structure is not kept and all files are placed in a single list to be scanned and warehoused; file screening and warehousing means warehousing only specific files; and specific-location data warehousing means warehousing the data located in a specific storage location.
Preferably, the data storage modes of the data storage module include hard disk cabinet storage and offline hard disk storage; the data browsing modes include zoom in, zoom out, full-map display, full-map zoom in, full-map zoom out, roaming, pointer, and map refresh; and the data query operations of the data query positioning module include yard/container query, label query, in-container query, general file query, offline/online data query, and same-file query.
Preferably, the download modes of the data downloading module include: single-file download, multi-file download, data package download, queued multi-file download, delayed download of offline data, and resumable download (continuing from a breakpoint).
Preferably, the yard/container query means querying the data belonging to a certain rule set; the label query means querying data that carries a specific label; the in-container query means querying with the specific query modes defined for each rule set; the general file query means querying by file attributes; the offline/online data query means that offline and online data can be queried in the same way, without distinction; and the same-file query means finding duplicate files by querying the MD5 code of each file, so that the duplicates can be removed.
Preferably, the system further comprises a user authority management module, which is used for assigning and managing user permissions.
In a second aspect, the present invention provides a data warehouse management method, including the following steps:
S1, writing data into a data warehouse by automatically scanning a data storage medium; the data warehouse supports warehousing of general files, including but not limited to warehousing and cataloging of raster image data (remote sensing satellite image data, intermediate processing data, general-format data such as GeoTiff, and the like), vector data, result documents and other data, so that the data can be conveniently browsed, queried and applied.
The data warehousing cataloging mainly refers to that a user catalogs data to be warehoused according to business needs in a certain mode, and the cataloging process is a tree building process; the data may be cataloged by year, region, category, etc.
The automatic scanning and warehousing mode comprises the following steps: scanning and warehousing according to an original tree folder mode, scanning and warehousing undifferentiated data, screening and warehousing files, and warehousing data at specific positions;
S2, storing the warehoused data;
S3, browsing the data in the data warehouse;
S4, querying the data in the data warehouse and locating the storage position of the data;
S5, selectively downloading data from the queried list as required, based on the result of the query and positioning operation in step S4.
Preferably, step S1 further includes: performing warehousing processing on the warehoused data, and monitoring the state of the data storage medium in real time.
Preferably, the warehousing processing includes: labeling, rule-set screening, context-aware file association, and file information retrieval.
Preferably, scanning and warehousing in the original tree folder mode means that files are scanned and warehoused following the tree structure of the folders; undifferentiated scanning and warehousing means that the folder structure is not kept and all files are placed in a single list to be scanned and warehoused; file screening and warehousing means warehousing only specific files; and specific-location data warehousing means warehousing the data located in a specific storage location.
Preferably, the data storage modes in step S2 include hard disk cabinet storage and offline hard disk storage; the data browsing modes in step S3 include zoom in, zoom out, full-map display, full-map zoom in, full-map zoom out, roaming, pointer, and map refresh; and the data query operations in step S4 include yard/container query, label query, in-container query, general file query, offline/online data query, and same-file query.
Preferably, the download modes in step S5 include: single-file download, multi-file download, data package download, queued multi-file download, delayed download of offline data, and resumable download (continuing from a breakpoint).
Preferably, the yard/container query means querying the data belonging to a certain rule set; the label query means querying data that carries a specific label; the in-container query means querying with the specific query modes defined for each rule set; the general file query means querying by file attributes; the offline/online data query means that offline and online data can be queried in the same way, without distinction; and the same-file query means finding duplicate files by querying the MD5 code of each file, so that the duplicates can be removed.
Preferably, step S1 is preceded by:
S0, assigning and managing user permissions.
The data warehouse management system disclosed by the invention has a wide range of applications: depending on scale, it is suitable both for data management in large data centers and for personal information management.
Large data center management mainly refers to the management of national big data, and personal information management mainly refers to the management of files on a personal computer.
The data warehouse management system of the invention has the following characteristics:
(1) the software and the hardware are combined to carry out file management, so that even past data can be easily put in a warehouse;
(2) the data warehouse does not make an upper limit requirement on the capacity of data storage;
(3) the data warehouse supports the unified management of online and offline data;
(4) a data warehouse user dynamically establishes a data catalog according to actual services to form a data tree;
(5) the data browsing modes have diversity and can carry out operations such as zooming in, zooming out, full-image display, full-image zooming in, full-image zooming out, roaming, pointer, map refreshing and the like;
(6) aiming at different users with different use authorities, an administrator can distribute authorities such as inquiry area authorities, file downloading authorities and the like according to service characteristics;
(7) the data warehouse creation and the tree structure manufacturing are simple and convenient, and the application efficiency is high;
(8) the data warehouse supports the warehousing of general files, supports more satellite data sources, and supports all general format data including format data such as GeoTiff, HDF, H5 and the like.
The invention has the following beneficial effects:
the invention realizes the business functions of warehousing, browsing, inquiring, positioning, downloading and the like of the general files, and simultaneously, a user administrator can distribute the use permission of other users according to the business requirements. According to the invention, file management is carried out by combining software and hardware, the traditional manual data management mode is changed, and the management efficiency of the data warehouse is effectively improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic block diagram of a data warehouse management system according to the present invention;
FIG. 2 is a flow chart of a data warehouse management method of the present invention;
FIG. 3 is a data warehousing flow diagram of the data warehouse management system of the present invention;
FIG. 4 is a flow chart of warehousing a file using the data warehouse management system of the present invention;
FIG. 5 is a flow chart of opening or downloading a file using the data warehouse management system of the present invention;
FIG. 6 is a schematic structural diagram of a single hard disk enclosure module;
FIG. 7 is a block diagram of a hard disk memory cube and a communication controller.
In the figures, the reference numerals correspond to: 1-hard disk dock, 2-hard disk controller, 3-hard disk cabinet, 4-communication controller module and 5-hard disk storage cube.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
It is noted that the terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The data warehouse of the present invention has a plurality of hard disk cabinets, each having a plurality of hard disk bays, which are referred to herein as data docks (which may be analogous to the concept of docks in wharfs).
The terms referred to in the present invention are explained below:
Data container (container for short): when data arrives it is first given a preliminary classification, and each container carries a label, such as Taihu Lake blue algae, Jiuli riparian zone, Yangcheng Lake waterweeds, or the monitoring of bridges and reservoirs. The label serves as the KEY of the container (KEYs need to be managed and may differ in length). Data that shares some inherent logic is called a data container; a folder can be regarded as a data container.
The data container has the following characteristics:
1) The data container must have a name.
2) Data containers cannot span physical hard disks.
3) The data container cannot cross logical hard disks.
4) Packages, folders, and files may appear below the data container.
5) The same files, packages, and folders may be located in different data containers.
6) One data container may be located in multiple yards.
7) The data container cannot contain a data container.
Data yard (yard for short): a plurality of containers placed together so that the data containers can be managed; yards may be divided into, for example, land utilization, ecological resources, marine resources, and the like. The data yard represents concepts such as "topic" and "domain". The data yard has the following characteristics:
1) The data yard must have a name.
2) The data storage yard may span logical hard disks but not physical hard disks.
3) Only data containers can be present below the data yard.
4) Multiple data storage yards may contain the same container underneath.
5) The data yard, data containers and packages are maintained by a database.
6) The data yard, data container, and package may each have a plurality of key-value pairs associated therewith.
Data package (package for short): the package is a physical concept and represents a set of files that belong together, such as the parts of a split compressed archive, or an shp file and its companion files. The package has the following characteristics:
1) The package must have a name.
2) The package may contain folders and files.
3) A file or folder can be located in at most one package.
4) A package may be located in different containers.
5) A package cannot be added directly to a yard.
6) The package does not correspond to an actual file on the hard disk; the information of the package is maintained by the data warehouse.
7) A package can only be contained in a data container.
8) A package may not contain a package.
9) Because the data container cannot cross a logical hard disk, the package cannot cross a logical hard disk either.
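The containment rules listed above for yards, containers and packages can be read as a small data model. The following Python sketch is only an illustration under assumptions: the class names `Yard`, `Container` and `Package` and their `add` methods are hypothetical and do not come from the patent, and rules about physical or logical hard disks and the "at most one package per file" rule are not modeled.

```python
class Package:
    """A named group of files that belong together, e.g. the parts of a shapefile."""

    def __init__(self, name):
        if not name:
            raise ValueError("a package must have a name")
        self.name = name
        self.members = []  # file or folder names only

    def add(self, member):
        # A package may not contain another package (or a container).
        if isinstance(member, (Package, Container)):
            raise TypeError("a package may only contain files and folders")
        self.members.append(member)


class Container:
    """A named data container holding packages, folders and files."""

    def __init__(self, name):
        if not name:
            raise ValueError("a container must have a name")
        self.name = name
        self.items = []  # Package objects, folder names and file names

    def add(self, item):
        # A container cannot contain a container.
        if isinstance(item, Container):
            raise TypeError("a container cannot contain a container")
        self.items.append(item)


class Yard:
    """A named data yard; only containers may be placed in it."""

    def __init__(self, name):
        if not name:
            raise ValueError("a yard must have a name")
        self.name = name
        self.containers = []

    def add(self, container):
        # Packages and files cannot be added directly to a yard.
        if not isinstance(container, Container):
            raise TypeError("only containers can be placed in a yard")
        self.containers.append(container)
```

Because one container may be located in several yards, the same `Container` object can simply be added to more than one `Yard`.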
Data dock: a plurality of yards placed together, enabling centralized management of the yards.
Label (TAB): a tag attached to data when it is imported, so that queries can be performed more quickly.
Keyword (KEY): such as location, field, year, date, satellite, sensor, etc.
Rule set: for all files that the user has selected, the rule set determines the following:
1) Which of the selected files are warehoused.
2) The rule set carries some preset yards.
3) The rule set adds some special key-value pairs to the files belonging to it.
4) The rule set extracts special information from the files.
The rule set has the following characteristics: a rule set can be selected and deselected, multiple rule sets can be combined, and a rule set supports reverse selection, that is, warehousing the data that does not belong to the rule set.
The rule sets include a common file system rule set, a spatial data rule set, an office document rule set, and the like.
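As a rough illustration of how combining and reverse-selecting rule sets might behave, the sketch below matches files by extension only. The `RuleSet` class, the `select` function and the `domain` key-value pair are hypothetical names chosen for this example; the patent does not specify how a rule set is implemented.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class RuleSet:
    name: str
    suffixes: frozenset  # file extensions this rule set accepts
    extra_tags: dict = field(default_factory=dict)  # key-value pairs added to matches

    def matches(self, filename):
        return PurePosixPath(filename).suffix.lower() in self.suffixes


def select(filenames, rule_sets, invert=False):
    """Return {filename: tags} for the files chosen by the combined rule sets.

    With invert=True the selection is reversed: only files that belong to
    none of the rule sets are returned, and they receive no added tags.
    """
    selected = {}
    for filename in filenames:
        hits = [rs for rs in rule_sets if rs.matches(filename)]
        if invert:
            if not hits:
                selected[filename] = {}
        elif hits:
            tags = {}
            for rs in hits:
                tags.update(rs.extra_tags)
            selected[filename] = tags
    return selected
```

For example, passing both a remote sensing rule set and an office rule set selects the union of their files, while `invert=True` selects everything that neither rule set claims.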
Example 1
As shown in fig. 1, the present invention discloses a data warehouse management system, which comprises a data warehousing module, a data storage module, a data browsing module, a data query positioning module and a data downloading module, wherein,
the data warehousing module comprises a manual warehousing unit, an automatic scanning warehousing unit, a data processing unit and a data storage medium monitoring unit, and is used for writing data into the data warehouse by manual input or automatic scanning. The data includes, but is not limited to, raster image data, vector data, and result-type document data. The data warehousing module performs warehousing and cataloging of raster image data (remote sensing satellite image data, intermediate processing data, general-format data such as GeoTiff, and the like), vector data, result documents and other data, so that the data can be conveniently browsed, queried and applied.
The data warehousing cataloging mainly refers to that a user catalogs data to be warehoused according to business needs in a certain mode, and the cataloging process is a tree building process. The data may be cataloged by year, region, category, etc.
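The tree-building nature of cataloging can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the record fields `year`, `region`, `category` and `name`, and the reserved `_files` leaf entry, are hypothetical choices for the example rather than anything the patent prescribes.

```python
from collections import defaultdict


def make_tree():
    """Return a nested dictionary that creates child nodes on demand."""
    return defaultdict(make_tree)


def catalog(records, keys=("year", "region", "category")):
    """Build a catalog tree by walking each record down the chosen keys."""
    root = make_tree()
    for record in records:
        node = root
        for key in keys:
            node = node[record[key]]
        # Leaves collect file names under the reserved "_files" entry.
        node.setdefault("_files", []).append(record["name"])
    return root
```

Changing the order of `keys` yields a different tree over the same data, which matches the idea that the user chooses the cataloging scheme according to business needs.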
The manual warehousing unit is used for writing data into the data warehouse by manual input.
The automatic scanning warehousing unit is used for writing data into the data warehouse by automatically scanning a storage medium such as a hard disk, flash memory, USB flash drive, CF card or SD card. The automatic scanning modes are:
(1) scanning and warehousing in the original tree folder mode, that is, files are scanned and warehoused following the tree structure of the folders;
(2) undifferentiated scanning and warehousing, that is, the folder structure is not kept and all files are placed in a single list;
(3) file screening and warehousing, in which the files to be warehoused can be specified, for example only data in docx and xlsx format, only data whose file name contains certain characters, only data whose modification date falls within a certain range, or only data whose file size falls within a certain range;
(4) specific-location data warehousing, for example only data on the C drive, or only data in certain folders.
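The four scanning modes above can be compared side by side in a short Python sketch. It works on an in-memory list of file entries rather than a real disk, and every name in it (`FileEntry`, `scan_tree`, `scan_flat`, `scan_filtered`, `scan_locations`) is a hypothetical label for illustration, not an interface described in the patent.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath


@dataclass(frozen=True)
class FileEntry:
    path: PurePosixPath  # path relative to the scanned medium
    size: int            # size in bytes
    modified: date       # last modification date


def scan_tree(entries):
    """Mode 1: keep the folder tree by mapping each folder to its file names."""
    tree = {}
    for e in entries:
        tree.setdefault(str(e.path.parent), []).append(e.path.name)
    return tree


def scan_flat(entries):
    """Mode 2: drop the folder structure and return one flat list."""
    return [e.path.name for e in entries]


def scan_filtered(entries, suffixes=None, name_contains=None,
                  modified_range=None, size_range=None):
    """Mode 3: keep only the entries that pass every filter that is set."""
    kept = []
    for e in entries:
        if suffixes and e.path.suffix.lower() not in suffixes:
            continue
        if name_contains and name_contains not in e.path.name:
            continue
        if modified_range and not (modified_range[0] <= e.modified <= modified_range[1]):
            continue
        if size_range and not (size_range[0] <= e.size <= size_range[1]):
            continue
        kept.append(e)
    return kept


def scan_locations(entries, roots):
    """Mode 4: keep only the entries located under one of the given folders."""
    roots = [PurePosixPath(r) for r in roots]
    return [e for e in entries if any(r in e.path.parents for r in roots)]
```

A real scanner would presumably build the `FileEntry` list by walking the inserted medium; keeping the modes as pure functions over that list makes each one easy to check in isolation.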
The data processing unit is used for performing warehousing processing on the warehoused data. The warehousing processing comprises the following steps:
(1) labeling: when the data in a folder is selected for warehousing, it can be tagged, for example with a "performance evaluation number" label and a "Wuxi City" label; labels can be used in many places, such as data search, and the data can be found and queried through them;
(2) rule-set screening: for example, a remote sensing data rule set warehouses only the scanned remote sensing data, such as hdf, GeoTiff and shp files; GeoTiff and hdf files can be automatically tagged with a "raster data" label, and compressed packages with specific names are warehoused, so that when a compressed package named "GF1_WFV1_E119.6_N31.3_20150520_L1A0000817197" is found, labels such as "GF1_WFV1" and "original raster data" are applied automatically; an office rule set, for example, scans and warehouses only files in formats such as docx, xlsx, doc and vsdx; multiple rule sets can be combined;
(3) context-aware file association: Table 1 below lists the data sources of the data warehouse. When a TIFF file of Gaofen-1 (GF1) satellite data is found, the system automatically looks for a matching xml metadata file; when a ".shp" file is found, the associated dbf, prj, sbn, sbx and shx files are searched for automatically, and the related files are then placed in one data package. Different rule sets have different context-aware rules;
TABLE 1
Satellite shorthand   Description                                Resolution
GF1                   Gaofen-1 (high-resolution) satellite       2 m/8 m/16 m
GF2                   Gaofen-2 (high-resolution) satellite       0.8 m/3.2 m
ZY3                   Ziyuan-3 (resource) satellite              2.1 m/5.8 m
ZY02C                 Ziyuan 02C (resource) satellite            2.36 m/5 m/10 m
HJ1                   Huanjing-1 (environment) satellite         30 m/100 m/150 m/300 m
TH                    Tianhui-1 satellite                        5-10 m
RE                    RapidEye satellite                         5 m
SPOT6                 SPOT6 satellite                            1.5 m/6 m
MODIS                 MODIS morning and afternoon satellites     250 m/500 m/1000 m
NPP                   NPP satellite                              375 m/375 m
FY-3                  Fengyun-3 satellite                        250 m/1000 m
LANDSAT               Landsat satellite data                     15 m/30 m/100 m
BJ2                   Beijing-2 satellite                        0.8 m/3.2 m
(4) file information retrieval: different rule sets retrieve different information from different files. For example, a picture data rule set may retrieve information about the image, such as the width, height and resolution of the picture file. The remote sensing data rule set can retrieve the longitude and latitude, number of bands, projection, resolution and other information of the remote sensing data. For some text files the text can be read out directly and stored in the database to make later queries easier. Retrieving this information during warehousing or in the background speeds up the related queries.
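The context-aware association in step (3) can be sketched for the shapefile case, where a ".shp" file is grouped with the companion files named in the text. This is a minimal sketch that assumes companions share the file stem of the ".shp" file; the names `SHP_SIDECARS` and `build_packages` are hypothetical.

```python
from pathlib import PurePosixPath

# Companion extensions named in the text for a ".shp" file.
SHP_SIDECARS = (".dbf", ".prj", ".sbn", ".sbx", ".shx")


def build_packages(filenames):
    """Group each .shp file with the companion files that share its stem."""
    # Case-insensitive lookup from lowercased name to the original name.
    available = {name.lower(): name for name in filenames}
    packages = {}
    for name in filenames:
        path = PurePosixPath(name)
        if path.suffix.lower() != ".shp":
            continue
        members = [name]
        for ext in SHP_SIDECARS:
            candidate = str(path.with_suffix(ext)).lower()
            if candidate in available:
                members.append(available[candidate])
        packages[path.stem] = members
    return packages
```

Companion files without a matching ".shp" file are left ungrouped, so they would be warehoused as ordinary files.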
The data storage medium monitoring unit is used for monitoring the state of the data storage medium in real time. Specifically, after a disk is inserted, all of its files are automatically retrieved and the corresponding data items in the database are updated, and the data is set to the online state; after the disk is removed, the data state automatically changes to offline. A disk may be in any of several states, such as online, scanning/warehousing, query, offline, unavailable, and forbidden.
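The insert/remove behavior described here can be sketched with an in-memory stand-in for the database. Only the online and offline states are modeled, and the `Catalog` class with its `on_insert`, `on_remove` and `query` methods is a hypothetical illustration; how the real system detects disk events is not shown.

```python
from enum import Enum


class DiskState(Enum):
    ONLINE = "online"
    OFFLINE = "offline"


class Catalog:
    """In-memory stand-in for the database that tracks the files on each disk."""

    def __init__(self):
        self.files = {}  # disk id -> set of file paths recorded for that disk
        self.state = {}  # disk id -> DiskState

    def on_insert(self, disk_id, current_files):
        """Disk plugged in: refresh its file list and mark it online."""
        self.files[disk_id] = set(current_files)
        self.state[disk_id] = DiskState.ONLINE

    def on_remove(self, disk_id):
        """Disk pulled out: keep its records but mark them offline."""
        if disk_id in self.state:
            self.state[disk_id] = DiskState.OFFLINE

    def query(self, suffix):
        """Search online and offline disks alike, reporting the state of each hit."""
        return sorted(
            (path, self.state[disk].value)
            for disk, paths in self.files.items()
            for path in paths
            if path.endswith(suffix)
        )
```

Keeping the records of a removed disk is what allows offline data to remain searchable alongside online data.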
The data storage module is used for storing the warehoused data; its data storage modes include hard disk cabinet storage and offline hard disk storage.
The data browsing module is used for browsing the data in the data warehouse; the data browsing modes include but are not limited to zoom in, zoom out, full-map display, full-map zoom in, full-map zoom out, roaming, pointer and map refresh.
The data query positioning module is used for performing query operations on the data in the data warehouse and locating the storage position of the data. Specifically, the raster image data, vector data, result documents and other data that have been warehoused under the tree-type directory structure are queried by data type, imaging time, satellite and sensor, and double-clicking a query result locates the file in which the data resides.
The data query operations of the data query positioning module include but are not limited to yard/container query, label query, in-container query, general file query, offline/online data query, and same-file query.
1) Yard/container query: that is, data belonging to a certain rule set is queried, for example, all data belonging to a remote sensing container are queried, and the query result may include an shp vector file, GF1 satellite data and the like; then, if the data of the Office container is inquired, the inquired result can be files in formats of docx, doc, xlsx, vsdx and the like;
2) and (3) label query: that is, the query has some data tags, such as query "performance assessment" tag, which lists all data with this tag, and further such as query "GF 1" and "raster data" tag, which displays raster data of all GF1 satellites;
3) querying in the container: each rule set has a unique query mode, such as 'spatial query', 'vector file query', 'projection type query' and the like all belong to a 'remote sensing data rule set', and all data with a projection type of UTM (unified transform) can be queried, such as all data in the Zhejiang range; for example, word queries and the like belong to the office rule set. The program supports a plurality of query modes in the rule set;
4) file general query: all files have some same attributes, such as file name, modification date, extension, creation date, size, etc.; the program supports the query of the types, such as the query of all files with the extension name of shp, the query of md5 codes and the like;
5) offline/online data query: supporting indifference query of offline data and online data;
6) and querying the same file: when the program is put in storage, the md5 codes of the files are calculated, and if the md5 codes of the two files are the same, the contents of the two files are completely the same, and the function can be used for clearing redundant data files.
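The identical-file query can be illustrated with a minimal Python sketch. It assumes the md5 codes are computed directly from the files at query time; the function names `md5_of_file` and `find_duplicates` are hypothetical and do not come from the patent, which stores the codes in a database during warehousing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Group paths by md5 code and keep only groups with more than one file."""
    groups = defaultdict(list)
    for path in paths:
        groups[md5_of_file(path)].append(Path(path))
    return {code: files for code, files in groups.items() if len(files) > 1}
```

Each returned group lists files that share one md5 code and are therefore candidates for redundant-data cleanup.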
The data downloading module is used for selectively downloading, as required, the data listed in the query results of the data query positioning module; the downloading modes of the data downloading module include but are not limited to single-selection file download, multi-selection file download, packaged data file download, multi-file queued download, delayed download of offline data, and resumable download.
The data warehouse management system also comprises a data storage medium module and a communication controller module, wherein the data storage medium module is electrically connected with the communication controller module; the data storage medium module is a hard disk storage cube, the hard disk storage cube is formed by stacking a plurality of hard disk cabinets, and the communication controller module is used for scheduling and controlling each hard disk cabinet.
The invention organizes the hard disks in a two-level cascade. As shown in fig. 6, the first level is the hard disk cabinet: one hard disk cabinet 3 is formed by stacking a plurality of hard disk docks 1, and a hard disk controller 2 (DCU for short) controls the hard disk cabinet 3, so that each hard disk dock 1 can be switched on and off by DCU commands and the state of each hard disk dock 1 can be read. The hard disk controller 2 is implemented with a single-chip microcomputer.
As shown in fig. 7, at the second level a plurality of hard disk cabinets 3 are stacked to form a hard disk storage cube 5, and a communication controller module 4 (CCU for short) schedules and controls each hard disk cabinet, so that every hard disk dock 1 in the hard disk storage cube 5 is controlled as a whole: each dock can be switched on and off and its current state can be obtained.
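The two-level cascade can be modeled in software as follows. This is only an illustrative sketch: the class names, the two dock states and the (cabinet index, dock index) addressing scheme are assumptions, and the patent does not specify the actual DCU or CCU command protocol.

```python
from dataclasses import dataclass, field
from enum import Enum


class DockState(Enum):
    OFF = "off"
    ON = "on"


@dataclass
class HardDiskCabinet:
    """First level: one DCU controlling a stack of hard disk docks."""
    dock_count: int
    docks: list = field(init=False)

    def __post_init__(self):
        self.docks = [DockState.OFF] * self.dock_count

    def switch(self, dock, on):
        """Switch one dock on or off."""
        self.docks[dock] = DockState.ON if on else DockState.OFF

    def state(self, dock):
        """Read the state of one dock."""
        return self.docks[dock]


class StorageCube:
    """Second level: the CCU addresses a dock as (cabinet index, dock index)."""

    def __init__(self, cabinet_count, docks_per_cabinet):
        self.cabinets = [
            HardDiskCabinet(docks_per_cabinet) for _ in range(cabinet_count)
        ]

    def switch(self, cabinet, dock, on):
        # Forward the command to the cabinet (DCU) that owns the dock.
        self.cabinets[cabinet].switch(dock, on)

    def state(self, cabinet, dock):
        return self.cabinets[cabinet].state(dock)
```

For instance, `StorageCube(2, 4).switch(1, 2, True)` switches on dock 2 of cabinet 1 and leaves every other dock off.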
Example 2
The invention discloses a data warehouse management system, which comprises a data warehousing module, a data storage module, a data browsing module, a data query positioning module, a data downloading module and a user authority management module, wherein,
the data warehousing module comprises a manual warehousing unit, an automatic scanning and warehousing unit, a data processing unit and a data storage medium monitoring unit, and is used for writing data into the data warehouse by manual input or automatic scanning. The data includes, but is not limited to, raster image data, vector data and result-type document data.
The manual warehousing unit is used for writing data into the data warehouse by manual input;
the automatic scanning and warehousing unit is used for writing data into the data warehouse by automatically scanning storage media such as hard disks, flash memory, USB flash drives, CF cards and SD cards; the automatic scanning modes comprise:
(1) Scanning and warehousing in the original tree-shaped folder mode, that is, files are scanned and warehoused according to the tree structure of their folders;
(2) Undifferentiated scanning and warehousing, that is, the folder structure is not preserved and all files are placed in a single list;
(3) File screening and warehousing, in which the files to be warehoused can be restricted: for example, only data in docx and xlsx formats, only data whose file names contain certain characters, only data whose modification date falls within a certain range, or only data whose file size falls within a certain range is warehoused;
(4) Specific-location warehousing: for example, only the data on drive C, or only the data in certain folders, is warehoused;
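The four scanning modes above can be sketched in Python. The sketch assumes that a filter object expresses the screening conditions, that the scan root expresses the specific location, and that a flag selects between the tree and flat modes; `ScanFilter` and `scan` are hypothetical names, not part of the patent.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanFilter:
    """Screening conditions; a value of None means 'no restriction'."""
    extensions: Optional[set] = None        # e.g. {".docx", ".xlsx"}
    name_contains: Optional[str] = None
    min_size: Optional[int] = None          # bytes
    max_size: Optional[int] = None          # bytes
    modified_after: Optional[float] = None  # POSIX timestamp
    modified_before: Optional[float] = None

    def accepts(self, path):
        info = os.stat(path)
        name = os.path.basename(path)
        extension = os.path.splitext(name)[1].lower()
        if self.extensions is not None and extension not in self.extensions:
            return False
        if self.name_contains is not None and self.name_contains not in name:
            return False
        if self.min_size is not None and info.st_size < self.min_size:
            return False
        if self.max_size is not None and info.st_size > self.max_size:
            return False
        if self.modified_after is not None and info.st_mtime < self.modified_after:
            return False
        if self.modified_before is not None and info.st_mtime > self.modified_before:
            return False
        return True


def scan(root, scan_filter, keep_tree=True):
    """Yield (catalog key, path) for every accepted file under root.

    With keep_tree=True the key is the path relative to root (tree mode);
    with keep_tree=False the key is the bare file name (flat list mode).
    """
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(folder, name)
            if scan_filter.accepts(path):
                key = os.path.relpath(path, root) if keep_tree else name
                yield key, path
```

Calling `scan` on a chosen folder with `ScanFilter(extensions={".docx", ".xlsx"})` would then warehouse only Office documents from that location.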
the data processing unit is used for performing warehousing processing on the warehoused data; the warehousing processing comprises the following steps:
(1) Labeling: for example, when the data in a certain folder is selected for warehousing, it can be marked with a "performance assessment" label and a "Wuxi City" label; labels are used in many places, such as data search, and data can be searched and queried by label;
(2) Rule set screening: for example, the remote sensing data rule set warehouses only scanned remote sensing data, such as hdf files, GeoTiff files and shp files, and GeoTiff and hdf files can be automatically marked with a "raster data" label; compressed packages with specific names are also warehoused, so that if a compressed package named "GF1_WFV1_E119.6_N31.3_20150520_L1A0000817197" is found, labels such as "GF1_WFV1" and "original raster data" are automatically applied. Similarly, the office rule set scans and warehouses only files in formats such as docx, xlsx, doc and vsdx. Multiple rule sets can be combined;
(3) Context-aware file association: for example, if a TIFF file of Gaofen-1 (GF1) satellite data is found, the program automatically checks whether a matching xml metadata file exists; if a file in ".shp" format is found, the associated dbf, prj, sbn, sbx and shx files are automatically searched for; the related files are then put into one data package. Different rule sets have different context-aware rules;
(4) File information retrieval: different rule sets retrieve different information from different files. For example, the picture data rule set may retrieve image-related information such as the width, height and resolution of a picture file, while the remote sensing data rule set can retrieve the longitude and latitude, number of bands, projection information, resolution and other information of the remote sensing data. For some text files, the text content can be read out directly and stored in the database to facilitate subsequent queries. Retrieving this information during warehousing or in the background speeds up related queries.
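Rule-based labeling and context-aware association can be sketched as below. The sketch assumes that companion files share the main file's base name and use lower-case extensions, and that a package name is split on underscores; the function names are hypothetical, and a real rule set would likely be more tolerant.

```python
from pathlib import Path

# Companion extensions of a shapefile, as listed in the text above.
SHAPEFILE_SIDECARS = (".dbf", ".prj", ".sbn", ".sbx", ".shx")


def associate(path):
    """Return the main file followed by the companion files found next to it."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".shp":
        candidates = [path.with_suffix(ext) for ext in SHAPEFILE_SIDECARS]
    elif suffix in (".tif", ".tiff"):
        # Assumption: the xml metadata file shares the image's base name.
        candidates = [path.with_suffix(".xml")]
    else:
        candidates = []
    return [path] + [c for c in candidates if c.exists()]


def tags_from_package_name(name):
    """Derive labels from a name like GF1_WFV1_E119.6_N31.3_20150520_L1A0000817197."""
    parts = name.split("_")
    if len(parts) >= 2 and parts[0].startswith("GF"):
        return [f"{parts[0]}_{parts[1]}", "original raster data"]
    return []
```

The list returned by `associate` corresponds to one data package in the sense of step (3).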
the data storage medium monitoring unit is used for monitoring the state of the data storage medium in real time. Specifically, after a disk is inserted, all of its files are automatically retrieved and the corresponding data items in the database are updated, and the data is set to the online state; after the disk is removed, the data state is automatically changed to offline. A disk may be in one of several states, such as online, scanning and warehousing, query, offline, unavailable and forbidden;
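The online/offline bookkeeping described above might look like the following sketch. It assumes an in-memory catalog keyed by a hypothetical disk identifier and relative path, and it models only the online and offline states; detecting insertion and removal events is platform-specific and is left out.

```python
from enum import Enum


class DataState(Enum):
    ONLINE = "online"
    OFFLINE = "offline"


class Catalog:
    """Maps (disk id, relative path) to the current state of the data item."""

    def __init__(self):
        self.entries = {}

    def disk_inserted(self, disk_id, files):
        # Every file found on the inserted disk is (re)registered as online.
        for relative_path in files:
            self.entries[(disk_id, relative_path)] = DataState.ONLINE

    def disk_removed(self, disk_id):
        # Items stay in the catalog so that they remain queryable offline.
        for key in self.entries:
            if key[0] == disk_id:
                self.entries[key] = DataState.OFFLINE

    def state(self, disk_id, relative_path):
        return self.entries[(disk_id, relative_path)]
```

Keeping offline items in the catalog is what allows offline and online data to be queried in the same way.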
the data storage module is used for storing the warehoused data; the data storage modes of the data storage module comprise hard disk cabinet data storage and offline hard disk storage.
The data browsing module is used for browsing the data in the data warehouse; the data browsing modes include but are not limited to zooming in, zooming out, full-image display, full-image zooming in, full-image zooming out, roaming, pointer and map refreshing;
the data query positioning module is used for performing query operation on data in the data warehouse and positioning the storage position of the data; specifically, data query operation is performed on raster image data, vector data, result documents and other data which are put into a database and belong to a tree-type directory structure according to data types, imaging time, satellites and sensors, and double-click is performed on the queried data so as to position a file where the data is located.
The data query operations of the data query positioning module include but are not limited to yard/container query, label query, in-container query, general file query, offline/online data query, and identical-file query.
1) Yard/container query: queries the data belonging to a given rule set. For example, querying all data in the remote sensing container may return shp vector files, GF1 satellite data and the like, while querying the Office container may return files in docx, doc, xlsx, vsdx and similar formats;
2) Label query: queries the data carrying given labels. For example, querying the "performance assessment" label lists all data with that label, and querying the "GF1" and "raster data" labels together displays the raster data of all GF1 satellites;
3) In-container query: each rule set has its own query modes. For example, "spatial query", "vector file query" and "projection type query" all belong to the "remote sensing data rule set", so that all data whose projection type is UTM (Universal Transverse Mercator), or all data within the Zhejiang range, can be queried; word queries and the like belong to the office rule set. The program supports multiple query modes within a rule set;
4) General file query: all files share some attributes, such as file name, modification date, extension, creation date and size; the program supports queries on these attributes, such as querying all files with the extension shp, or querying by md5 code;
5) Offline/online data query: offline data and online data are queried in the same way, without distinction;
6) Identical-file query: the program computes the md5 code of each file during warehousing; two files with the same md5 code are regarded as having identical content, and this function can be used to clear redundant data files.
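The container, label and general file queries can be combined in one filter, as the following sketch suggests. It assumes an in-memory list of catalog entries; the `Entry` fields and the `query` function are hypothetical, whereas the patent stores this information in a database.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    name: str
    container: str                      # rule set the file was warehoused under
    tags: set = field(default_factory=set)
    online: bool = True                 # not used as a filter: offline data is queried too


def query(entries, container=None, tags=(), extension=None):
    """Return the entries matching every condition that is given."""
    wanted = set(tags)
    results = []
    for entry in entries:
        if container is not None and entry.container != container:
            continue
        if not wanted <= entry.tags:    # the entry must carry all wanted labels
            continue
        if extension is not None and not entry.name.lower().endswith(extension):
            continue
        results.append(entry)
    return results
```

For example, `query(entries, tags=("GF1", "raster data"))` would list the raster data of GF1 satellites whether the disks holding it are online or not.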
The data downloading module is used for selectively downloading, as required, the data listed in the query results of the data query positioning module;
the downloading modes of the data downloading module comprise:
1) Single-selection file download;
2) Multi-selection file download;
3) Data package download, that is, data files are packaged and downloaded together;
4) Multi-file queued download, with support for adjusting the download order and controlling downloads (restarting, pausing, stopping and the like);
5) Delayed download of offline data: if a file is offline, its download is marked as a "scheduled task", and a background service automatically downloads the file once the data is online and the client service is running (that is, the client is not shut down);
6) Resumable download: for example, if the disk is removed while data is being transferred, the download is suspended; after the disk is reconnected, the program verifies the consistency of the file by means such as its md5 code. If the file has not been modified, the download continues from the breakpoint; if the file has been modified, the user can choose to download it again or abandon the download; if the file has been deleted, the download fails and an error message is returned.
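The resumable download in item 6) can be sketched as a local file copy. The sketch assumes that the breakpoint is the current size of the partially written target and that the md5 code recorded at warehousing is passed in by the caller; `resume_copy` and its return values are hypothetical.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resume_copy(source, target, expected_md5, chunk_size=1 << 20):
    """Continue copying source to target from the breakpoint.

    Raises FileNotFoundError if the source was deleted, returns "modified"
    if the source no longer matches the recorded md5 code (the caller then
    restarts or abandons the download), and returns "done" otherwise.
    """
    if not os.path.exists(source):
        raise FileNotFoundError(source)
    if file_md5(source) != expected_md5:
        return "modified"
    # The breakpoint is the number of bytes already written to the target.
    offset = os.path.getsize(target) if os.path.exists(target) else 0
    with open(source, "rb") as src, open(target, "ab") as dst:
        src.seek(offset)
        for chunk in iter(lambda: src.read(chunk_size), b""):
            dst.write(chunk)
    return "done"
```

Checking the md5 code before seeking is what makes it safe to append to the partial file instead of starting over.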
The user authority management module is used for allocating and managing user authorities, and specifically provides:
1) Authority inheritance: when an authority is set, sub-files can be chosen to inherit it;
2) Batch authority setting: the same authority is applied to multiple selected files at once;
3) Built-in authority roles, such as Guest, Administrator and User: a user may belong to one or more authority roles, and a user belonging to a role has all the authorities of that role;
4) Built-in super administrator user: the super administrator user belongs to the Administrator role, which has all authorities, including the authority to configure the authorities of other users;
5) Login account and password management.
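Role membership and authority inheritance can be sketched as follows. The authority names assigned to each role and the path-based inheritance lookup are assumptions made for illustration; the patent names the roles but does not list their individual authorities.

```python
from pathlib import PurePosixPath

# Hypothetical authority sets for the built-in roles.
ROLE_PERMISSIONS = {
    "Guest": {"browse"},
    "User": {"browse", "query", "download"},
    "Administrator": {"browse", "query", "download", "warehouse", "configure"},
}


def role_permissions(roles):
    """A user holds the union of the authorities of all roles it belongs to."""
    granted = set()
    for role in roles:
        granted |= ROLE_PERMISSIONS[role]
    return granted


class PathAcl:
    """Per-path authorities with optional inheritance by sub-files."""

    def __init__(self):
        self.rules = {}  # path string -> (authorities, inherit flag)

    def grant(self, path, permissions, inherit=True):
        self.rules[str(PurePosixPath(path))] = (set(permissions), inherit)

    def grant_batch(self, paths, permissions, inherit=True):
        # Batch setting: apply the same authorities to several paths at once.
        for path in paths:
            self.grant(path, permissions, inherit)

    def effective(self, path):
        node = PurePosixPath(path)
        if str(node) in self.rules:
            return self.rules[str(node)][0]
        # Walk up the tree and use the nearest ancestor that allows inheritance.
        for parent in node.parents:
            rule = self.rules.get(str(parent))
            if rule is not None and rule[1]:
                return rule[0]
        return set()
```

A user in both the Guest and User roles would, under these assumed sets, hold the browse, query and download authorities.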
The data warehouse management system also comprises a data storage medium module and a communication controller module, wherein the data storage medium module is electrically connected with the communication controller module; the data storage medium module is a hard disk storage cube, the hard disk storage cube is formed by stacking a plurality of hard disk cabinets, and the communication controller module is used for scheduling and controlling each hard disk cabinet.
The data warehouse management system of the invention has the following characteristics:
(1) software and hardware are combined for file management, so that even legacy data can be easily warehoused;
(2) the data warehouse imposes no upper limit on data storage capacity;
(3) the data warehouse supports unified management of online and offline data;
(4) data warehouse users dynamically establish a data catalog according to actual business, forming a data tree;
(5) data can be browsed in diverse ways, including zooming in, zooming out, full-image display, full-image zooming in, full-image zooming out, roaming, pointer and map refreshing;
(6) for different users with different usage authorities, an administrator can assign authorities such as query area authority and file download authority according to business characteristics;
(7) creating the data warehouse and building the tree structure are simple and convenient, giving high application efficiency;
(8) the data warehouse supports the warehousing of general files, supports a wider range of satellite data sources, and supports all general-format data, including formats such as GeoTiff, HDF and H5.
Example 3
As shown in fig. 2, the present invention further provides a data warehouse management method, which includes the following steps:
S1, writing data into a data warehouse by manual input or automatic scanning; the data includes but is not limited to raster image data, vector data and result-type document data; as shown in fig. 3, data warehousing specifically comprises: performing warehousing and cataloging operations on raster image data (remote sensing satellite image data, intermediate processing data, general-format data such as GeoTiff, and the like), vector data, result documents and other data, so that the data can be conveniently browsed and queried;
data warehousing cataloging means that the user catalogs the data to be warehoused in a chosen way according to business needs; the cataloging process is a tree-building process. The data may be cataloged by year, region, category and the like.
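Cataloging as tree building can be sketched with nested dictionaries. The sketch assumes each record is a dictionary that already carries the cataloging attributes; the key names `year`, `region`, `category` and `name`, and the `_files` leaf key, are hypothetical.

```python
def build_catalog(records, keys=("year", "region", "category")):
    """Build a nested dictionary with one tree level per cataloging key."""
    tree = {}
    for record in records:
        node = tree
        for key in keys:
            # Create the branch for this attribute value if it does not exist.
            node = node.setdefault(record[key], {})
        node.setdefault("_files", []).append(record["name"])
    return tree
```

Changing the order of `keys` changes the shape of the tree, which matches the idea that users establish the catalog according to their own business needs.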
The automatic scanning mode comprises the following steps:
(1) Scanning and warehousing in the original tree-shaped folder mode, that is, files are scanned and warehoused according to the tree structure of their folders;
(2) Undifferentiated scanning and warehousing, that is, the folder structure is not preserved and all files are placed in a single list;
(3) File screening and warehousing, in which the files to be warehoused can be restricted: for example, only data in docx and xlsx formats, only data whose file names contain certain characters, only data whose modification date falls within a certain range, or only data whose file size falls within a certain range is warehoused;
(4) Specific-location warehousing: for example, only the data on drive C, or only the data in certain folders, is warehoused;
then, performing warehousing processing on the warehoused data, and monitoring the state of the data storage medium in real time;
the warehousing processing comprises the following steps:
(1) Labeling: for example, when the data in a certain folder is selected for warehousing, it can be marked with a "performance assessment" label and a "Wuxi City" label; labels are used in many places, such as data search, and data can be searched and queried by label;
(2) Rule set screening: for example, the remote sensing data rule set warehouses only scanned remote sensing data, such as hdf files, GeoTiff files and shp files, and GeoTiff and hdf files can be automatically marked with a "raster data" label; compressed packages with specific names are also warehoused, so that if a compressed package named "GF1_WFV1_E119.6_N31.3_20150520_L1A0000817197" is found, labels such as "GF1_WFV1" and "original raster data" are automatically applied. Similarly, the office rule set scans and warehouses only files in formats such as docx, xlsx, doc and vsdx. Multiple rule sets can be combined;
(3) Context-aware file association: for example, if a TIFF file of Gaofen-1 (GF1) satellite data is found, the program automatically checks whether a matching xml metadata file exists; if a file in ".shp" format is found, the associated dbf, prj, sbn, sbx and shx files are automatically searched for; the related files are then put into one data package. Different rule sets have different context-aware rules;
(4) File information retrieval: different rule sets retrieve different information from different files. For example, the picture data rule set may retrieve image-related information such as the width, height and resolution of a picture file, while the remote sensing data rule set can retrieve the longitude and latitude, number of bands, projection information, resolution and other information of the remote sensing data. For some text files, the text content can be read out directly and stored in the database to facilitate subsequent queries. Retrieving this information during warehousing or in the background speeds up related queries.
After a disk is inserted, all of its files are automatically retrieved and the corresponding data items in the database are updated, and the data is set to the online state; after the disk is removed, the data state is automatically changed to offline. A disk may be in one of several states, such as online, scanning and warehousing, query, offline, unavailable and forbidden;
the invention can also write the data into the data warehouse in a manual input mode;
S2, storing the warehoused data; the data storage modes comprise hard disk cabinet data storage and offline hard disk storage;
S3, browsing the data stored in the data warehouse; the data browsing modes comprise zooming in, zooming out, full-image display, full-image zooming in, full-image zooming out, roaming, pointer and map refreshing;
S4, querying data in the data warehouse and locating the storage position of the data; the data query operations comprise yard/container query, label query, in-container query, general file query, offline/online data query and identical-file query;
1) Yard/container query: queries the data belonging to a given rule set. For example, querying all data in the remote sensing container may return shp vector files, GF1 satellite data and the like, while querying the Office container may return files in docx, doc, xlsx, vsdx and similar formats;
2) Label query: queries the data carrying given labels. For example, querying the "performance assessment" label lists all data with that label, and querying the "GF1" and "raster data" labels together displays the raster data of all GF1 satellites;
3) In-container query: each rule set has its own query modes. For example, "spatial query", "vector file query" and "projection type query" all belong to the "remote sensing data rule set", so that all data whose projection type is UTM (Universal Transverse Mercator), or all data within the Zhejiang range, can be queried; word queries and the like belong to the office rule set. The program supports multiple query modes within a rule set;
4) General file query: all files share some attributes, such as file name, modification date, extension, creation date and size; the program supports queries on these attributes, such as querying all files with the extension shp, or querying by md5 code;
5) Offline/online data query: offline data and online data are queried in the same way, without distinction;
6) Identical-file query: the program computes the md5 code of each file during warehousing; two files with the same md5 code are regarded as having identical content, and this function can be used to clear redundant data files.
S5, selectively downloading, as required, the data listed in the query results; the data downloading modes comprise: single-selection file download, multi-selection file download, data package download, multi-file queued download, delayed download of offline data, and resumable download.
Example 4
The invention provides a data warehouse management method, which comprises the following steps:
S0, assigning and managing user authorities;
S1, writing data into a data warehouse by manual input or automatic scanning; the data includes but is not limited to raster image data, vector data and result-type document data; this step specifically comprises: performing warehousing and cataloging operations on raster image data (remote sensing satellite image data, intermediate processing data, general-format data such as GeoTiff, and the like), vector data, result documents and other data, so that the data can be conveniently browsed and queried;
then, performing warehousing processing on the warehoused data, and monitoring the state of the data storage medium in real time;
data warehousing cataloging means that the user catalogs the data to be warehoused in a chosen way according to business needs; the cataloging process is a tree-building process. The data may be cataloged by year, region, category and the like.
A file warehousing button is provided on the user interface of the data warehouse. As shown in fig. 4, the user first clicks the file warehousing button and inputs verification information, and the system verifies whether the user has warehousing authority; if the user does not have the authority, the flow ends; if the user has the authority, the user selects one or more files and a copy path, and the files are copied and warehoused.
The automatic scanning mode comprises the following steps:
(1) Scanning and warehousing in the original tree-shaped folder mode, that is, files are scanned and warehoused according to the tree structure of their folders;
(2) Undifferentiated scanning and warehousing, that is, the folder structure is not preserved and all files are placed in a single list;
(3) File screening and warehousing, in which the files to be warehoused can be restricted: for example, only data in docx and xlsx formats, only data whose file names contain certain characters, only data whose modification date falls within a certain range, or only data whose file size falls within a certain range is warehoused;
(4) Specific-location warehousing: for example, only the data on drive C, or only the data in certain folders, is warehoused;
then, performing warehousing processing on the warehoused data, and monitoring the state of the data storage medium in real time;
the warehousing processing comprises the following steps:
(1) Labeling: for example, when the data in a certain folder is selected for warehousing, it can be marked with a "performance assessment" label and a "Wuxi City" label; labels are used in many places, such as data search, and data can be searched and queried by label;
(2) Rule set screening: for example, the remote sensing data rule set warehouses only scanned remote sensing data, such as hdf files, GeoTiff files and shp files, and GeoTiff and hdf files can be automatically marked with a "raster data" label; compressed packages with specific names are also warehoused, so that if a compressed package named "GF1_WFV1_E119.6_N31.3_20150520_L1A0000817197" is found, labels such as "GF1_WFV1" and "original raster data" are automatically applied. Similarly, the office rule set scans and warehouses only files in formats such as docx, xlsx, doc and vsdx. Multiple rule sets can be combined;
(3) Context-aware file association: for example, if a TIFF file of Gaofen-1 (GF1) satellite data is found, the program automatically checks whether a matching xml metadata file exists; if a file in ".shp" format is found, the associated dbf, prj, sbn, sbx and shx files are automatically searched for; the related files are then put into one data package. Different rule sets have different context-aware rules;
(4) File information retrieval: different rule sets retrieve different information from different files. For example, the picture data rule set may retrieve image-related information such as the width, height and resolution of a picture file, while the remote sensing data rule set can retrieve the longitude and latitude, number of bands, projection information, resolution and other information of the remote sensing data. For some text files, the text content can be read out directly and stored in the database to facilitate subsequent queries. Retrieving this information during warehousing or in the background speeds up related queries.
After a disk is inserted, all of its files are automatically retrieved and the corresponding data items in the database are updated, and the data is set to the online state; after the disk is removed, the data state is automatically changed to offline. A disk may be in one of several states, such as online, scanning and warehousing, query, offline, unavailable and forbidden;
the invention can also write the data into the data warehouse in a manual input mode;
S2, storing the warehoused data; the data storage modes comprise hard disk cabinet data storage and offline hard disk storage;
S3, browsing the data stored in the data warehouse; the data browsing modes comprise zooming in, zooming out, full-image display, full-image zooming in, full-image zooming out, roaming, pointer and map refreshing;
S4, querying data in the data warehouse and locating the storage position of the data; the data query operations comprise yard/container query, label query, in-container query, general file query, offline/online data query and identical-file query;
1) Yard/container query: queries the data belonging to a given rule set. For example, querying all data in the remote sensing container may return shp vector files, GF1 satellite data and the like, while querying the Office container may return files in docx, doc, xlsx, vsdx and similar formats;
2) Label query: queries the data carrying given labels. For example, querying the "performance assessment" label lists all data with that label, and querying the "GF1" and "raster data" labels together displays the raster data of all GF1 satellites;
3) In-container query: each rule set has its own query modes. For example, "spatial query", "vector file query" and "projection type query" all belong to the "remote sensing data rule set", so that all data whose projection type is UTM (Universal Transverse Mercator), or all data within the Zhejiang range, can be queried; word queries and the like belong to the office rule set. The program supports multiple query modes within a rule set;
4) General file query: all files share some attributes, such as file name, modification date, extension, creation date and size; the program supports queries on these attributes, such as querying all files with the extension shp, or querying by md5 code;
5) Offline/online data query: offline data and online data are queried in the same way, without distinction;
6) Identical-file query: the program computes the md5 code of each file during warehousing; two files with the same md5 code are regarded as having identical content, and this function can be used to clear redundant data files.
S5, selectively downloading, as required, the data listed in the query results; the data downloading modes comprise: single-selection file download, multi-selection file download, data package download, multi-file queued download, delayed download of offline data, and resumable download.
As shown in fig. 5, when a user needs to open or download a file, the system first verifies whether the user has the corresponding authority; if not, the process ends; if so, the system determines whether the file is online, and a file that is not online is browsed or downloaded after the disk holding it is loaded.
It should be noted that the embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and such modifications and refinements shall also fall within the protection scope of the present invention.

Claims (10)

1. A data warehouse management system, characterized by comprising a data warehousing module, a data storage module, a data browsing module, a data query positioning module and a data downloading module, wherein,
the data warehousing module is used for writing data into a data warehouse;
the data warehousing module comprises an automatic scanning warehousing unit, a data storage medium module monitoring unit and a data processing unit, wherein the automatic scanning warehousing unit is used for writing data into the data warehouse in an automatic scanning mode; the automatic scanning warehousing modes comprise: scanning and warehousing according to the original tree-shaped folder mode, undifferentiated data scanning and warehousing, file screening and warehousing, and specific-position data warehousing, wherein scanning and warehousing according to the original tree-shaped folder mode refers to scanning and warehousing files according to the tree structure of the folders; undifferentiated data scanning and warehousing refers to scanning and warehousing all files as a single list without retaining the folder structure; file screening and warehousing refers to warehousing specific files; specific-position data warehousing refers to warehousing the data at a specific storage position; the data processing unit is used for performing warehousing processing on the warehoused data; the warehousing processing comprises: labeling, rule set screening, file context-aware association and file information retrieval; the data storage medium module monitoring unit is used for monitoring the state of the data storage medium in real time;
the data storage module is used for storing the data which are put in storage;
the data browsing module is used for browsing the data in the data warehouse;
the data query positioning module is used for performing query operation on data in the data warehouse and positioning the storage position of the data;
and the data downloading module is used for selectively downloading the queried list data as required, based on the operation result of the data query positioning module.
2. The management system of claim 1, further comprising a data storage media module and a communication controller module, the communication controller module being configured to perform scheduling control on the data storage media module.
3. The management system according to claim 2, wherein the data storage medium module is a hard disk storage cube, the hard disk storage cube is formed by stacking a plurality of hard disk cabinets, and the communication controller module is configured to perform scheduling control on each hard disk cabinet;
the data warehousing module further comprises a manual warehousing unit,
and the manual warehousing unit is used for writing data into the data warehouse by manual input.
4. The management system according to claim 3, wherein the data storage modes of the data storage module include hard disk cabinet data storage and offline hard disk storage; the data browsing modes of the data browsing module include zoom in, zoom out, full-map display, full-map zoom in, full-map zoom out, roaming, pointer and map refresh; the data query operations of the data query positioning module include yard/container query, tag query, in-container query, general file query, offline/online data query and same file query; and the download modes of the data downloading module include: single-selection file download, multi-selection file download, data packet file download, multi-file queued download, delayed download of offline data and resumable download (breakpoint resume).
5. The management system according to claim 4, wherein the yard/container query refers to querying data belonging to a certain rule set; the tag query refers to querying data with a specific tag; the in-container query refers to querying with the specific query mode set for each rule set; the general file query refers to querying according to the attributes of a file; the offline/online data query refers to querying offline data and online data without distinction; and the same file query refers to removing duplicate files by querying the MD5 code of the files.
6. The management system according to claim 1, further comprising a user authority management module, which is used for allocating and managing user authority.
7. A data warehouse management method applied to the data warehouse management system according to any one of claims 1 to 6, comprising the steps of:
S1, writing data into a data warehouse in an automatic scanning mode; the automatic scanning warehousing modes comprise: scanning and warehousing according to the original tree-shaped folder mode, undifferentiated data scanning and warehousing, file screening and warehousing, and specific-position data warehousing, wherein scanning and warehousing according to the original tree-shaped folder mode refers to scanning and warehousing files according to the tree structure of the folders; undifferentiated data scanning and warehousing refers to scanning and warehousing all files as a single list without retaining the folder structure; file screening and warehousing refers to warehousing specific files; specific-position data warehousing refers to warehousing the data at a specific storage position; performing warehousing processing on the warehoused data; the warehousing processing comprises: labeling, rule set screening, file context-aware association and file information retrieval; monitoring the state of the data storage medium in real time;
S2, storing the warehoused data;
S3, browsing the data in the data warehouse;
S4, performing query operations on the data in the data warehouse, and positioning the storage position of the data;
and S5, selectively downloading the queried list data as required, based on the operation result of the data query positioning module.
8. The method according to claim 7, wherein the data storage modes in step S2 include hard disk cabinet data storage and offline hard disk storage; the data browsing modes in step S3 include zoom in, zoom out, full-map display, full-map zoom in, full-map zoom out, roaming, pointer and map refresh; the data query operations in step S4 include yard/container query, tag query, in-container query, general file query, offline/online data query and same file query; and the data download modes in step S5 include: single-selection file download, multi-selection file download, data packet file download, multi-file queued download, delayed download of offline data and resumable download (breakpoint resume).
9. The method of claim 8, wherein the yard/container query refers to querying data belonging to a certain rule set; the tag query refers to querying data with a specific tag; the in-container query refers to querying with the specific query mode set for each rule set; the general file query refers to querying according to the attributes of a file; the offline/online data query refers to querying offline data and online data without distinction; and the same file query refers to removing duplicate files by querying the MD5 code of the files.
10. The method according to claim 7 or 9, wherein step S1 is preceded by: S0, allocating and managing user authority.
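
For illustration only, two of the scanning modes recited in claims 1 and 7 (tree-shaped folder mode and undifferentiated mode) and the same file query of claims 5 and 9 may be sketched in Python as follows. The function names `scan_tree`, `scan_flat`, `md5_of` and `find_duplicates`, and the nested-dictionary layout, are hypothetical choices and are not part of the claims. The sketch only reports groups of files that share an MD5 code; whether and how duplicates are then removed is left open.

```python
import hashlib
import os


def scan_tree(root):
    """Tree mode: return a nested dict that mirrors the folder structure."""
    node = {"files": [], "dirs": {}}
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if entry.is_dir(follow_symlinks=False):
            node["dirs"][entry.name] = scan_tree(entry.path)
        elif entry.is_file(follow_symlinks=False):
            node["files"].append(entry.name)
    return node


def scan_flat(root):
    """Undifferentiated mode: drop the folder structure, return one sorted list."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Group paths by MD5 digest and keep only groups with more than one file."""
    groups = {}
    for path in paths:
        groups.setdefault(md5_of(path), []).append(path)
    return {code: group for code, group in groups.items() if len(group) > 1}
```

Reading files in chunks keeps memory use bounded for large files. MD5 is used here only because the claims name it; it identifies identical content but is not collision resistant against deliberately crafted files.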
CN201810201836.6A 2018-03-12 2018-03-12 Data warehouse management system and management method Active CN108549659B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810201836.6A CN108549659B (en) 2018-03-12 2018-03-12 Data warehouse management system and management method

Publications (2)

Publication Number Publication Date
CN108549659A (en) 2018-09-18
CN108549659B (en) 2021-08-06

Family

ID=63516102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810201836.6A Active CN108549659B (en) 2018-03-12 2018-03-12 Data warehouse management system and management method

Country Status (1)

Country Link
CN (1) CN108549659B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110647766A (en) * 2019-09-19 2020-01-03 上海易点时空网络有限公司 Method and system for ensuring file downloading safety of data warehouse
CN110941586A (en) * 2019-10-25 2020-03-31 深圳市毕美科技有限公司 Engineering design data management method and system
CN114372104A (en) * 2022-01-10 2022-04-19 苏州久知联信息技术有限公司 Electronic file metadata acquisition tool and method with good compatibility
CN116796772A (en) * 2023-08-25 2023-09-22 北京思谨科技有限公司 Intelligent file cabinet control system of dynamic RFID

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101873226A (en) * 2010-06-21 2010-10-27 中兴通讯股份有限公司 Data storage method and device for statistical form system
CN102722529A (en) * 2012-05-18 2012-10-10 苏州万图明电子软件有限公司 Business information query system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102339323B (en) * 2011-11-11 2015-12-16 江苏鸿信***集成有限公司 A kind of method of carrying out data pick-up for DB2 data warehouse, dispatching and representing
US20140201192A1 (en) * 2013-01-15 2014-07-17 Syscom Computer Engineering Co. Automatic data index establishment method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
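Construction technology of a distributed real-time data warehouse for large-equipment state analysis (面向大型装备状态分析的分布式实时数据仓库构建技术); Liu Yanjun et al.; 《计算机集成制造***》; 2017-10-15; pp. 2326-2329 *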

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant