CN107066205A - A data storage system - Google Patents

A data storage system

Info

Publication number
CN107066205A
Authority
CN
China
Prior art keywords
map
reduce
data
storage
tasks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611257420.3A
Other languages
Chinese (zh)
Other versions
CN107066205B (en)
Inventor
惠润海
杨浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZHONGKE SUGON INFORMATION INDUSTRY CHENGDU Co.,Ltd.
Dawning Information Industry Beijing Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN201611257420.3A
Publication of CN107066205A
Application granted
Publication of CN107066205B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F16/183 Provision of network file services by network file servers, e.g. by using NFS, CIFS
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention proposes a data storage system comprising a Hadoop cluster, a component arranged in the Hadoop cluster, and an NFS server module. The component includes a Map-Reduce framework, which executes a Map-Reduce flow made up of Map tasks and Reduce tasks. A disk array module is arranged in the NFS server module, and the disk array module and the NFS server module together form a shared storage device that provides storage for the Hadoop cluster. The result of each Map task is stored on the shared storage device so that the shuffle process is removed, which optimizes the flow of Map tasks and Reduce tasks. The component also splits the files used in the Hadoop cluster into multiple blocks and distributes each block to different computer nodes, thereby achieving load balancing. With this data storage system, the invention considerably improves the cost-effectiveness, reliability, maintainability and performance of the system.

Description

A data storage system
Technical field
The present invention relates to the field of communications and, in particular, to a data storage system.
Background art
In recent years, the open-source big data project Hadoop has become increasingly mature and has brought feasible solutions to the industries that rely on big data applications. Map-Reduce, the parallel processing framework of Hadoop clusters, can process both structured and semi-structured data in parallel across many nodes, which can greatly increase the speed of data analysis and processing.
Meanwhile, a Hadoop cluster by default stores its data in the bundled distributed file system HDFS, which by default keeps three replicas of the data. For big data applications, however, the default multi-replica storage of HDFS has several drawbacks:
A big data application system usually does more than big data analysis; it also holds many other types of business data. HDFS therefore has difficulty meeting the needs of every application scenario, especially small-file storage, so the data must first be imported into HDFS in a separate step before it can be analyzed, which is very inconvenient;
With three replicas, the storage space utilization of HDFS is 33.3%, which makes storing and analyzing big data quite expensive;
HDFS is an open-source project, and the file system has many problems in terms of reliability and maintainability, so it is not suitable for storing critical data in a production environment.
No effective solution to these problems in the related art has yet been proposed.
Summary of the invention
In view of these problems in the related art, the present invention proposes a data storage system that replaces HDFS three-replica storage with a combination of a RAID disk array and a second protocol. This considerably improves the reliability and maintainability of the system and thereby addresses the cost, reliability and ease-of-use problems of the distributed file system HDFS in the prior art.
The technical solution of the present invention is realized as follows:
According to one aspect of the present invention, a data storage system is provided.
The data storage system comprises a Hadoop cluster, a component arranged in the Hadoop cluster, and an NFS server module. The component includes a Map-Reduce framework, which executes a Map-Reduce flow made up of Map tasks and Reduce tasks. A disk array module is arranged in the NFS server module, and the disk array module and the NFS server module together form a shared storage device that provides storage for the Hadoop cluster. The result of each Map task is stored on the shared storage device so that the shuffle process is removed, which optimizes the flow of Map tasks and Reduce tasks. The component also splits the files used in the Hadoop cluster into multiple blocks and distributes each block to different computer nodes, thereby achieving load balancing.
According to one embodiment of the present invention, the component further comprises: an NFS shared storage module, an HDFS storage protocol conversion module, a module that removes the Shuffle stage from the Map-Reduce flow, and a Map-Reduce task scheduling module.
According to one embodiment of the present invention, the disk array uses RAID5 or RAID6 storage, and the files used in the Hadoop cluster are split into 64 MB blocks.
The beneficial effects of the present invention are:
The present invention forms shared storage by combining an NFS server with a disk array, replacing the HDFS three-replica storage of the prior art, which reduces cost and improves the cost-effectiveness of the system. The files used by Hadoop are split into multiple blocks that are distributed evenly across the compute nodes, which achieves load balancing. In addition, the Map-Reduce flow is optimized by removing the shuffle process, which reduces data exchange, shortens task processing time and thereby improves system performance.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for the embodiments are briefly described below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a data storage system according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the layout mechanism of a data storage system according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the Map-Reduce task execution process in the prior art;
Fig. 4 is a schematic diagram of the Map-Reduce task execution process according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments that a person of ordinary skill in the art obtains on the basis of the embodiments of the present invention fall within the scope of protection of the present invention.
According to an embodiment of the present invention, a data storage system is provided.
As shown in Figs. 1 to 4, the data storage system according to an embodiment of the present invention comprises a Hadoop cluster, a component arranged in the Hadoop cluster, and an NFS server module. The component includes a Map-Reduce framework, which executes a Map-Reduce flow made up of Map tasks and Reduce tasks. A disk array module is arranged in the NFS server module, and the disk array module and the NFS server module together form a shared storage device that provides storage for the Hadoop cluster. The result of each Map task is stored on the shared storage device so that the shuffle process is removed, which optimizes the flow of Map tasks and Reduce tasks. The component also splits the files used in the Hadoop cluster into multiple blocks and distributes each block to different computer nodes, thereby achieving load balancing.
In this embodiment, as shown in Fig. 1, the RAID disk array is arranged in the NFS server, and the combination of the disk array module and the NFS server module forms shared storage that provides storage for the Hadoop cluster. As shown in Fig. 2, the component splits the files used in the Hadoop cluster into multiple blocks (or segments) and distributes each block to 3 different computer nodes, thereby achieving load balancing. As shown in Figs. 3 and 4, the result of each Map task is stored on the shared storage device so that the shuffle process is removed, which optimizes the flow of Map tasks and Reduce tasks. It will be understood that the block size and the computer nodes to which blocks are distributed can be configured according to actual requirements; the present invention places no limit on this.
With the above solution of the present invention, shared storage is formed by combining an NFS server with a disk array, replacing the HDFS three-replica storage of the prior art, which reduces cost and improves the cost-effectiveness of the system. The files used by Hadoop are split into multiple blocks that are distributed evenly across the compute nodes, which achieves load balancing. In addition, the Map-Reduce flow is optimized by removing the shuffle process, which reduces data exchange, shortens task processing time and thereby improves system performance.
According to one embodiment of the present invention, the component further comprises: an NFS shared storage module, an HDFS storage protocol conversion module, a module that removes the Shuffle stage from the Map-Reduce flow, and a Map-Reduce task scheduling module. The NFS shared storage module sets up the NFS server and the disk array as shared storage. The HDFS storage protocol conversion module converts HDFS protocol data into NFS protocol data so that the disk array can be accessed. The Shuffle-stage removal module removes the Shuffle flow from the Map-Reduce flow. The Map-Reduce task scheduling module schedules the Map tasks and Reduce tasks.
According to one embodiment of the present invention, the disk array uses RAID5 (an independent-disk structure with distributed parity) or RAID6 (an independent-disk structure with two stored parity codes) storage, and the files used in the Hadoop cluster are split into 64 MB blocks.
To better describe the present invention, a specific embodiment is described in detail below.
To solve the cost, reliability and ease-of-use problems of the distributed file system HDFS in the prior art, this document proposes replacing the three-replica storage of HDFS with RAID disk array storage. On the one hand, this eliminates the data import and export steps; on the other hand, the disk array can be configured as conventional RAID5 or RAID6, as in a traditional disk array, which raises storage space utilization and reduces cost.
As shown in Fig. 1, the disk array can be accessed through the NFS (Network File System) protocol. A protocol conversion module is therefore added to the Hadoop cluster to convert the HDFS application-layer protocol into the NFS access protocol, so that Hadoop storage accesses become accesses to the disk array in the NFS server. Specifically, computer node 1 accesses the Hadoop cluster, and protocol conversion module 1 converts the data of the Hadoop application-layer protocol (the Hadoop cluster access protocol) of component 1 into NFS protocol access data, which is then used to access the RAID disk array of the NFS server. Components 2 and 3 work in the same way and are not described in detail here.
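The path-level part of such a protocol conversion can be sketched as follows. This is only an illustration under stated assumptions, not the conversion module of the patent: the mount point `/mnt/nfs_raid` and the function name `hdfs_to_nfs_path` are hypothetical, and a real module would also have to translate operations such as open, read, write and list, not just file names.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical directory where every compute node mounts the NFS export
# backed by the RAID disk array (an assumption, not named in the source).
NFS_MOUNT = PurePosixPath("/mnt/nfs_raid")


def hdfs_to_nfs_path(hdfs_uri: str, mount: PurePosixPath = NFS_MOUNT) -> str:
    """Map an hdfs:// URI to the equivalent path under the NFS mount."""
    parsed = urlparse(hdfs_uri)
    if parsed.scheme != "hdfs":
        raise ValueError(f"not an HDFS URI: {hdfs_uri!r}")
    # Drop the leading "/" so the path can be joined under the mount point.
    relative = PurePosixPath(parsed.path or "/").relative_to("/")
    return str(mount / relative)


print(hdfs_to_nfs_path("hdfs://namenode:8020/data/logs/part-0001"))
```

Because every node mounts the same export, the same converted path resolves to the same file regardless of which node issues the access.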
In addition, data in a Hadoop cluster is by default stored in HDFS with three replicas, and Hadoop components such as the MapReduce framework and HBase rely on the replica mechanism for fault tolerance. For example, when the node holding the first replica fails, a Hadoop component automatically accesses another replica of the data. Once a disk array replaces HDFS, however, files no longer have replicas. The RAID mechanism of the disk array guarantees the reliability of the data itself, but it cannot guarantee that the automatic replica switchover inside Hadoop's fault-tolerance mechanism keeps working. Because the disk array storage is exported over the NFS protocol, though, all compute nodes see exactly the same data, so any node can be regarded as a storage node holding a replica of the file. As shown in Fig. 2, following the principle of Hadoop file storage, a file to be stored is split into fixed-length objects, for example 64 MB (megabytes) per block. To ensure that MapReduce tasks can be distributed to different compute nodes, 3 compute nodes are designated for each block (segment) as the nodes holding its stored replicas. Once the Hadoop components have obtained the data layout of a file, they distribute tasks with their default algorithm, without any modification. The shared nature of NFS is thus used to give the stored data a pseudo-random distribution, which keeps Map-Reduce task scheduling balanced. It will be understood that the block size after splitting and the number of computer nodes holding replicas can be set according to actual requirements; the present invention places no limit on this.
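The fixed-length splitting described above can be sketched in a few lines. The function name `split_into_blocks` and the (offset, length) representation are assumptions made for illustration; the source only states that files are cut into fixed-length objects such as 64 MB blocks.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the example block size from the text


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE):
    """Return (offset, length) pairs covering a file of file_size bytes.

    Every block has the fixed length block_size except possibly the last,
    which holds the remainder.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size > 0")
    return [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]


MIB = 1024 * 1024
for offset, length in split_into_blocks(200 * MIB):
    print(offset // MIB, length // MIB)
```

Since the data itself stays on the shared export, the blocks here are purely a layout description: nothing is copied, and the per-block node list only tells the scheduler where to run tasks.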
In addition, a pseudo-random algorithm is used to select the compute nodes for the replicas of each object, so that every compute node is selected with essentially the same probability. This ensures that the Hadoop system can make full use of every compute node in the system during task scheduling and that no node is starved.
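One way to obtain such a pseudo-random but repeatable selection is sketched below. The source does not specify the algorithm; seeding a generator from a hash of the file path and block index is an assumption chosen here so that every node computes the same layout without coordination, and the names `pick_replica_nodes` and `node1` to `node6` are hypothetical.

```python
import hashlib
import random
from collections import Counter


def pick_replica_nodes(file_path: str, block_index: int, nodes, replicas: int = 3):
    """Choose `replicas` distinct nodes for one block, deterministically."""
    nodes = list(nodes)
    if replicas > len(nodes):
        raise ValueError("more replicas requested than nodes available")
    # Derive the seed from the block identity so the choice is repeatable.
    digest = hashlib.sha256(f"{file_path}#{block_index}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return rng.sample(nodes, replicas)


nodes = [f"node{i}" for i in range(1, 7)]
usage = Counter()
for block_index in range(3000):
    usage.update(pick_replica_nodes("/data/input.bin", block_index, nodes))
print(usage)  # each node should appear a similar number of times
```

With 6 nodes and 3 nodes chosen per block, each node is expected to be picked for about half of the blocks, which is the even spread that keeps any node from going idle.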
In addition, Fig. 3 shows how a MapReduce task is executed in a Hadoop cluster. A MapReduce task comprises a Map stage and a Reduce stage. The Map stage splits and processes the input file, and the results are then collected and grouped for processing by the Reduce stage, which yields efficient distributed computation. Before each Map task finishes, the multiple result files it has written to disk must be merged into a single result file. Before the Reduce stage starts, the result files of the Map tasks must be pulled from each Map task, and all Map results must be merged into a final result file; only then does the Reduce computation begin. The whole process from the end of the Map stage to the start of the Reduce stage is called the Shuffle process. For MapReduce tasks with a large workload, this flow involves a large amount of I/O (input/output). In the data pull stage of the Shuffle process in particular, the Reduce nodes must pull data from the Map nodes over the network, and this step accounts for more than 10% of the time of the whole job. With the disk array architecture proposed here, as shown in Fig. 4, all data resides on shared storage, so this network transfer can be omitted entirely and no data pull is needed. The shared nature of the NFS file system is thus used to optimize the shuffle process of the Hadoop cluster, avoiding data transfer and shortening task processing time.
In summary, this document proposes replacing the three-replica HDFS of Hadoop in big data storage and analysis systems with a combination of a disk array and the NFS network file sharing protocol, which considerably improves the system in terms of cost, data reliability, maintainability and performance. Specifically: first, space utilization and cost are compared in Table 1 below, where the cost of 1 TB of raw capacity is assumed to be P.
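The idea of writing Map results to shared storage so that Reduce tasks read them in place can be illustrated with a small word-count sketch. It is a toy model under stated assumptions, not Hadoop code: a temporary directory stands in for the NFS mount, the tasks run sequentially in one process, and the file naming scheme and function names are hypothetical.

```python
import json
import tempfile
import zlib
from collections import Counter
from pathlib import Path


def partition_of(key: str, num_reducers: int) -> int:
    """Stable partition function so every Map task routes a key the same way."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_map_task(map_id: int, split: str, shared_dir: Path, num_reducers: int) -> None:
    """Count words in one split and write one result file per reducer
    directly to the shared directory."""
    partitions = [Counter() for _ in range(num_reducers)]
    for word in split.split():
        partitions[partition_of(word, num_reducers)][word] += 1
    for reduce_id, counts in enumerate(partitions):
        out = shared_dir / f"map{map_id}-part{reduce_id}.json"
        out.write_text(json.dumps(counts))


def run_reduce_task(reduce_id: int, shared_dir: Path) -> Counter:
    """Read this reducer's partition files in place; nothing is pulled
    from the Map nodes over the network."""
    total = Counter()
    for path in sorted(shared_dir.glob(f"map*-part{reduce_id}.json")):
        total.update(json.loads(path.read_text()))
    return total


def word_count(splits, num_reducers: int = 2) -> Counter:
    with tempfile.TemporaryDirectory() as tmp:  # stand-in for the NFS mount
        shared_dir = Path(tmp)
        for map_id, split in enumerate(splits):
            run_map_task(map_id, split, shared_dir, num_reducers)
        result = Counter()
        for reduce_id in range(num_reducers):
            result.update(run_reduce_task(reduce_id, shared_dir))
        return result


print(word_count(["a b a", "b c"]))
```

The merge of Map outputs still happens inside each Reduce task; what the shared directory removes in this sketch is the separate step of copying files from Map nodes to Reduce nodes.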
Table 1
As Table 1 shows, the storage space utilization of HDFS is 33.3%: if 300 TB of storage is purchased, only 100 TB is actually usable. With the combination of a RAID disk array and NFS network file sharing proposed here, storage space utilization reaches 90% and storage cost is reduced by about 67%.
Second, HDFS is not a standard storage interface, so data to be analyzed must be imported and exported, which has a considerable impact on analysis efficiency. With the solution proposed here, production data generated at the front end can be processed by Hadoop directly, without import or export, which greatly simplifies data transfer. In addition, with the disk array combined with the NFS protocol, the Shuffle process of MapReduce tasks is locally optimized to reduce the data transferred over HTTP between Map task nodes and Reduce task nodes, which improves the efficiency of the whole process.
Third, open-source software is primarily intended to deliver functionality, so its engineering falls short of enterprise-grade products in many respects, and the stability and reliability of such systems often carry hidden risks. The HDFS storage system of Hadoop is under constant modification, so its stability is also at risk, and storing production data directly on HDFS is therefore very risky. RAID disk array storage, by contrast, has been developed over decades and its reliability meets enterprise-grade standards, which makes it suitable for storing production data. Replacing HDFS three-replica storage with the combination of a RAID disk array and the second protocol therefore gives far higher data reliability than the distributed file system HDFS.
In summary, with the above technical solution of the present invention, shared storage is formed by combining an NFS server with a disk array, replacing the HDFS three-replica storage of the prior art, which reduces cost and improves the cost-effectiveness of the system. The files used by Hadoop are split into multiple blocks that are distributed evenly across the compute nodes, which achieves load balancing. In addition, the Map-Reduce flow is optimized by removing the shuffle process, which reduces data exchange, shortens task processing time and thereby improves system performance.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
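The utilization figures can be checked with a short calculation. The sketch below uses the textbook capacity formulas for replication and for single- and double-parity RAID; the 10-disk group is an assumption chosen because it gives the 90% RAID5 figure, and the source does not state which configuration it used.

```python
def replication_utilization(replicas: int) -> float:
    """Usable fraction of raw capacity when every block is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return 1.0 / replicas


def raid_utilization(level: int, disks: int) -> float:
    """Usable fraction for RAID5 (one parity disk's worth) or RAID6 (two)."""
    parity = {5: 1, 6: 2}[level]
    if disks < parity + 2:
        raise ValueError(f"RAID{level} needs at least {parity + 2} disks")
    return (disks - parity) / disks


def usable_capacity(raw_tb: float, utilization: float) -> float:
    return raw_tb * utilization


raw_tb = 300
print(usable_capacity(raw_tb, replication_utilization(3)))  # three-replica HDFS
print(usable_capacity(raw_tb, raid_utilization(5, 10)))     # 10-disk RAID5 (assumed)
print(usable_capacity(raw_tb, raid_utilization(6, 10)))     # 10-disk RAID6 (assumed)
```

With raw capacity priced at P per TB, the cost per usable TB is P divided by the utilization, so about 3P for three replicas and about 1.11P at 90% utilization. Under these assumptions the reduction works out to roughly 63%, somewhat below the figure of about 67% quoted above, which depends on the Table 1 values that are not reproduced here.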

Claims (3)

1. A data storage system comprising a Hadoop cluster, a component arranged in the Hadoop cluster, and an NFS server module, wherein the component includes a Map-Reduce framework, the Map-Reduce framework is used to execute a Map-Reduce flow, and the Map-Reduce flow includes Map tasks and Reduce tasks, characterized in that:
a disk array module is arranged in the NFS server module, and the disk array module and the NFS server module form a shared storage device that provides storage for the Hadoop cluster, and the result of each Map task is stored on the shared storage device so as to remove the shuffle process, thereby optimizing the flow of Map tasks and Reduce tasks; and
the component splits the files used in the Hadoop cluster into multiple blocks and distributes each block to different computer nodes, thereby achieving load balancing.
2. The storage system according to claim 1, characterized in that the component further comprises: an NFS shared storage module, an HDFS storage protocol conversion module, a module that removes the Shuffle stage from the Map-Reduce flow, and a Map-Reduce task scheduling module.
3. The storage system according to claim 1, characterized in that the disk array uses RAID5 or RAID6 storage, and the files used in the Hadoop cluster are split into 64 MB blocks.
CN201611257420.3A 2016-12-30 2016-12-30 Data storage system Active CN107066205B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611257420.3A CN107066205B (en) 2016-12-30 2016-12-30 Data storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611257420.3A CN107066205B (en) 2016-12-30 2016-12-30 Data storage system

Publications (2)

Publication Number Publication Date
CN107066205A true CN107066205A (en) 2017-08-18
CN107066205B CN107066205B (en) 2020-06-05

Family

ID=59624054

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611257420.3A Active CN107066205B (en) 2016-12-30 2016-12-30 Data storage system

Country Status (1)

Country Link
CN (1) CN107066205B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108776690A (en) * 2018-06-05 2018-11-09 上海孚典智能科技有限公司 The method of HDFS Distribution and Centralization blended data storage systems based on separated layer handling
CN110297812A (en) * 2019-06-13 2019-10-01 深圳市比比赞科技有限公司 File memory method, the method for file synchronization, computer equipment and storage medium
CN112328176A (en) * 2020-11-04 2021-02-05 北京计算机技术及应用研究所 Intelligent scheduling method based on multi-control disk array NFS sharing
WO2022116766A1 (en) * 2020-12-04 2022-06-09 中兴通讯股份有限公司 Data storage system and construction method therefor

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101873342A (en) * 2010-06-02 2010-10-27 深圳市迪菲特科技股份有限公司 Data access method, data access system and disk array storage system
CN102521687A (en) * 2011-12-01 2012-06-27 中国资源卫星应用中心 Miniaturized universal platform for preprocessing remote-sensing satellite data
CN102915257A (en) * 2012-09-28 2013-02-06 曙光信息产业(北京)有限公司 TORQUE(tera-scale open-source resource and queue manager)-based parallel checkpoint execution method
CN103747060A (en) * 2013-12-26 2014-04-23 惠州华阳通用电子有限公司 Distributed monitor system and method based on streaming media service cluster
US20140358977A1 (en) * 2013-06-03 2014-12-04 Zettaset, Inc. Management of Intermediate Data Spills during the Shuffle Phase of a Map-Reduce Job

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
何文婷,等: "《支持Hadoop大数据访问的pNFS框架研究与实现》", 《计算机应用研究》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108776690A (en) * 2018-06-05 2018-11-09 上海孚典智能科技有限公司 The method of HDFS Distribution and Centralization blended data storage systems based on separated layer handling
CN108776690B (en) * 2018-06-05 2020-07-07 上海孚典智能科技有限公司 Method for HDFS distributed and centralized mixed data storage system based on hierarchical governance
CN110297812A (en) * 2019-06-13 2019-10-01 深圳市比比赞科技有限公司 File memory method, the method for file synchronization, computer equipment and storage medium
CN112328176A (en) * 2020-11-04 2021-02-05 北京计算机技术及应用研究所 Intelligent scheduling method based on multi-control disk array NFS sharing
CN112328176B (en) * 2020-11-04 2024-01-30 北京计算机技术及应用研究所 Intelligent scheduling method based on NFS sharing of multi-control disk array
WO2022116766A1 (en) * 2020-12-04 2022-06-09 中兴通讯股份有限公司 Data storage system and construction method therefor

Also Published As

Publication number Publication date
CN107066205B (en) 2020-06-05

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20211011

Address after: 100089 building 36, courtyard 8, Dongbeiwang West Road, Haidian District, Beijing

Patentee after: Dawning Information Industry (Beijing) Co.,Ltd.

Patentee after: ZHONGKE SUGON INFORMATION INDUSTRY CHENGDU Co.,Ltd.

Address before: 100193 No. 36 Building, No. 8 Hospital, Wangxi Road, Haidian District, Beijing

Patentee before: Dawning Information Industry (Beijing) Co.,Ltd.