CN113791742B - High-performance data lake system and data storage method - Google Patents
High-performance data lake system and data storage method
- Publication number
- CN113791742B (application number CN202111368382.XA)
- Authority
- CN
- China
- Prior art keywords
- array
- data
- file
- storage method
- data lake
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/116—Details of conversion of file system types or formats
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
- G06F16/134—Distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
- G06F3/0611—Improving I/O performance in relation to response time
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0643—Management of files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Human Computer Interaction (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (8)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111368382.XA CN113791742B (zh) | 2021-11-18 | 2021-11-18 | High-performance data lake system and data storage method |
NL2033534A NL2033534B1 (en) | 2021-11-18 | 2022-11-15 | High-performance data lake system and data storage method |
US17/988,834 US11789899B2 (en) | 2021-11-18 | 2022-11-17 | High-performance data lake system and data storage method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111368382.XA CN113791742B (zh) | 2021-11-18 | 2021-11-18 | High-performance data lake system and data storage method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113791742A (zh) | 2021-12-14 |
CN113791742B (zh) | 2022-03-25 |
Family ID: 78955413
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111368382.XA Active CN113791742B (zh) | High-performance data lake system and data storage method | 2021-11-18 | 2021-11-18 |
Country Status (3)
Country | Link |
---|---|
US (1) | US11789899B2 (zh) |
CN (1) | CN113791742B (zh) |
NL (1) | NL2033534B1 (zh) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
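CN102521367A (zh) * | 2011-12-16 | 2012-06-27 | Tsinghua University | Distributed processing method for massive data |
CN107122170A (zh) * | 2017-03-22 | 2017-09-01 | Wuhan Douyu Network Technology Co., Ltd. | Large-capacity storage method and device for data arrays |
CN111061806A (zh) * | 2019-11-21 | 2020-04-24 | China Aeronautical Radio Electronics Research Institute | Storage method and networked access method for distributed massive geographic tiles |
CN111291047A (zh) * | 2020-01-16 | 2020-06-16 | Beijing Mininglamp Software System Co., Ltd. | Spatio-temporal data storage method and device, storage medium, and electronic device |
CN111367984A (zh) * | 2020-03-11 | 2020-07-03 | Industrial and Commercial Bank of China Ltd. | Method and system for loading data into a data lake with high timeliness |
CN111400301A (zh) * | 2019-01-03 | 2020-07-10 | Alibaba Group Holding Ltd. | Data query method, apparatus, and device |
CN113297057A (zh) * | 2020-03-26 | 2021-08-24 | Alibaba Group Holding Ltd. | Memory analysis method, device, and system |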
Family Cites Families (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8032701B1 (en) * | 2004-03-26 | 2011-10-04 | Emc Corporation | System and method for managing provisioning of storage resources in a network with virtualization of resources in such a network |
JP4758429B2 (ja) * | 2005-08-15 | 2011-08-31 | Turbo Data Laboratories, Inc. | Shared-memory multiprocessor system and information processing method thereof |
US9639403B2 (en) * | 2013-03-15 | 2017-05-02 | Genband Us Llc | Receive-side scaling in a computer system using sub-queues assigned to processing cores |
KR101663547B1 (ko) * | 2016-02-26 | 2016-10-07 | ARMIQ Co., Ltd. | Method and apparatus for archiving a database, and method and apparatus for searching an archived database |
WO2018039264A1 (en) * | 2016-08-22 | 2018-03-01 | Oracle International Corporation | System and method for metadata-driven external interface generation of application programming interfaces |
CN106383886B (zh) * | 2016-09-21 | 2019-08-30 | Shenzhen Boruide Technology Co., Ltd. | Big data pre-statistics system and method based on a big data distributed programming framework |
EP3535974A1 (en) * | 2016-12-08 | 2019-09-11 | Zhejiang Dahua Technology Co., Ltd | Methods and systems for video synopsis |
US10831773B2 (en) * | 2017-03-01 | 2020-11-10 | Next Pathway Inc. | Method and system for parallelization of ingestion of large data sets |
CN106982356B (zh) * | 2017-04-08 | 2020-12-22 | Fudan University | Distributed large-scale video stream processing system |
WO2019183062A1 (en) * | 2018-03-19 | 2019-09-26 | Facet Labs, Llc | Interactive dementia assistive devices and systems with artificial intelligence, and related methods |
US20190370599A1 (en) * | 2018-05-29 | 2019-12-05 | International Business Machines Corporation | Bounded Error Matching for Large Scale Numeric Datasets |
US10810224B2 (en) * | 2018-06-27 | 2020-10-20 | International Business Machines Corporation | Computerized methods and programs for ingesting data from a relational database into a data lake |
US11182354B1 (en) * | 2018-11-27 | 2021-11-23 | Tekion Corp | Data analysis and processing engine |
US11119980B2 (en) * | 2018-11-30 | 2021-09-14 | International Business Machines Corporation | Self-learning operational database management |
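CN109886074B (zh) * | 2018-12-27 | 2020-11-13 | Zhejiang University of Technology | Parallel detection method for elevator passenger count based on video stream processing |
CN109889907B (zh) * | 2019-04-08 | 2021-06-01 | Beijing Dongfang Guoxin Technology Co., Ltd. | HTML5-based video OSD display method and device |
KR20200122900A (ko) * | 2019-04-19 | 2020-10-28 | Korea University Research and Business Foundation | Vehicle tracking system based on surveillance video |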
US20200394455A1 (en) * | 2019-06-15 | 2020-12-17 | Paul Lee | Data analytics engine for dynamic network-based resource-sharing |
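CN110704193B (zh) * | 2019-10-12 | 2022-12-16 | The 38th Research Institute of China Electronics Technology Group Corporation | Method and device for implementing a multi-core software architecture suitable for vector processing |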
US11063612B1 (en) * | 2020-03-02 | 2021-07-13 | International Business Machines Corporation | Parallelizing encoding of binary symmetry-invariant product codes |
US11210271B1 (en) * | 2020-08-20 | 2021-12-28 | Fair Isaac Corporation | Distributed data processing framework |
CN114218595A (zh) * | 2021-12-21 | 2022-03-22 | Tian Mingtai | File protection method and system in a cloud computing platform |
-
2021
- 2021-11-18 CN CN202111368382.XA patent/CN113791742B/zh active Active
-
2022
- 2022-11-15 NL NL2033534A patent/NL2033534B1/en active
- 2022-11-17 US US17/988,834 patent/US11789899B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN113791742A (zh) | 2021-12-14 |
NL2033534B1 (en) | 2024-01-08 |
NL2033534A (en) | 2023-06-12 |
US20230153267A1 (en) | 2023-05-18 |
US11789899B2 (en) | 2023-10-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102867071B (zh) | | Method for managing massive historical network-management data |
CA2997061C (en) | | Method and system for parallelization of ingestion of large data sets |
US8914415B2 (en) | | Serial and parallel methods for I/O efficient suffix tree construction |
CN109712674B (zh) | | Annotation database index structure, and method and system for rapidly annotating genetic variants |
US10545960B1 (en) | | System and method for set overlap searching of data lakes |
US6266665B1 (en) | | Indexing and searching across multiple sorted arrays |
US20140052727A1 (en) | | Data processing for database aggregation operation |
CN111061758A (zh) | | Data storage method, device, and storage medium |
CN103019855A (zh) | | Method for predicting MapReduce job execution time |
WO2014122441A1 (en) | | Improvements relating to use of columnar databases |
US9183320B2 (en) | | Data managing method, apparatus, and recording medium of program, and searching method, apparatus, and medium of program |
CN111625520A (zh) | | General method and system for mapping field types across heterogeneous databases |
JP4511469B2 (ja) | | Information processing method and information processing system |
CN113791742B (zh) | | High-performance data lake system and data storage method |
Liu et al. | | Parallel and space-efficient construction of Burrows-Wheeler transform and suffix array for big genome data |
CN112434085A (zh) | | User data statistics method based on Roaring Bitmap |
CN112835932B (zh) | | Batch processing method and device for business tables, and non-volatile storage medium |
US20230273875A1 (en) | | Method for searching free blocks in bitmap data, and related components |
JP4772506B2 (ja) | | Information processing method, information processing system, and program |
CN114443670B (zh) | | Data storage and reading method and device |
US20240088913A1 (en) | | Graph data compression method and apparatus |
CN113704340A (zh) | | Data processing method, device, server, and storage medium |
CN101414309A (zh) | | Large-scale data information deduplication processing system |
CN114817390A (zh) | | Data processing method and device based on the Sqoop program |
JP5419069B2 (ja) | | Database device, database management method, database data structure, database management program, and computer-readable recording medium recording the same |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | Inventors after: Liu Hao, Tu Yonggang, Chen Zhiling, Zhang Tao, Wang Peng, Wang Qiuye, Yu Chenxi, Chen Wei, Liu Yinlong, Liu Zhefeng; Inventors before: Liu Hao, Wang Peng, Tu Yonggang, Zhang Tao, Chen Zhiling, Yu Chenxi, Chen Wei, Liu Zhefeng, Liu Yinlong, Wang Qiuye |
TA01 | Transfer of patent application right | Effective date of registration: 2022-03-03; Address after: 314001 Building 29, Xianghu villa, Qixing street, Nanhu District, Jiaxing City, Zhejiang Province; Applicants after: Nanhu Laboratory, Beijing Big Data Advanced Technology Research Institute; Address before: 314001 Building 29, Xianghu villa, Qixing street, Nanhu District, Jiaxing City, Zhejiang Province; Applicant before: Nanhu Laboratory |
GR01 | Patent grant | ||