KR102423284B1 - Data distributed storage system based on Inter Planetary File System

Info

Publication number: KR102423284B1
Application number: KR1020210156855A
Authority: KR
Inventors: 백종윤
Original assignee: 백종윤
Priority date: 2021-11-15
Filing date: 2021-11-15
Publication date: 2022-07-19

Abstract

A distributed file system (IPFS)-based distributed file storage system according to the present invention comprises: an encryption module which encrypts an input file; a distributed storage module which divides the encrypted file into a plurality of data fragments and distributes and stores the fragments in a plurality of IPFS nodes connected to each other through an IPFS network; and a blockchain management module which forms blocks storing transaction information including the download and upload history of the data, forms a blockchain in which the plurality of blocks are linked, creates a copy of the blockchain, and transmits the copy to the IPFS nodes. According to the present invention, the IPFS-based distributed file storage system distributes and stores files over IPFS, so that high-capacity files can be stored quickly and efficiently; it can detect duplicates among stored files, so that storage is used efficiently; and it allows files to be preserved semi-permanently on the IPFS nodes when a user wishes to preserve them.

Description

Data distributed storage system based on Inter Planetary File System

The present invention relates to a distributed file storage system based on a distributed file system (IPFS), and more particularly to one that connects all nodes so that a file can be stored in distributed form as data fragments and shared, enabling files to be uploaded and downloaded at higher speed.

The Inter Planetary File System (IPFS) is a protocol and peer-to-peer (P2P) network for storing and sharing data in a distributed file system.

IPFS was created by Juan Benet, founder of Protocol Labs. It was released as an alpha version in February 2015, and in October of the same year a TechCrunch article described IPFS as spreading quickly by word of mouth.

Such an IPFS-based file storage system is a kind of P2P network with decentralization as its core concept. Here, P2P refers to a communication network that connects personal computers without a dedicated server or client: each connected computer acts as both server and client and shares information. Because multiple nodes share and verify the same data, a relationship of trust is formed in the digital domain. Such an environment makes it feasible to implement smart contracts, in which contracts can be conveniently concluded and amended peer to peer without an intermediary.

IPFS is a distributed P2P file system that aims to connect all computing nodes, and the IPFS Web is a new web that addresses and supplements the shortcomings of the existing HTTP Web. A distinguishing feature of IPFS is that it is a faster, safer, and more open network, realized through P2P communication among nodes without a centralized server.

Unlike the existing HTTP Web, where losing the connection to a large server has critical consequences, an IPFS system can remain stable even if some nodes are disconnected.

In addition, high-capacity files can be delivered quickly and efficiently (BitSwap), and because duplicate files can be detected, storage can also be used efficiently (Merkle DAG, content-addressed). The name of a file uploaded to IPFS is recorded permanently, and a file that the user wishes to keep on IPFS can be preserved semi-permanently (pinning). Version control of files (Git) is also possible.
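The duplicate detection and pinning mentioned above can be illustrated with a minimal sketch. It is not the IPFS implementation: real IPFS addresses content with CIDs over a Merkle DAG, whereas this toy store simply keys each blob by its SHA-256 digest. The class name `ContentStore` and its methods are hypothetical names chosen for illustration.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical; not the IPFS API)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}
        self._pinned: set[str] = set()

    def add(self, data: bytes) -> str:
        # The address is derived from the content itself, so adding the
        # same bytes twice yields the same address and one stored copy.
        address = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(address, data)
        return address

    def pin(self, address: str) -> None:
        # A pinned blob is exempt from garbage collection.
        if address not in self._blobs:
            raise KeyError(address)
        self._pinned.add(address)

    def collect_garbage(self) -> int:
        # Remove every blob that is not pinned; return how many were removed.
        unpinned = [a for a in self._blobs if a not in self._pinned]
        for address in unpinned:
            del self._blobs[address]
        return len(unpinned)

    def __len__(self) -> int:
        return len(self._blobs)
```

Because the address depends only on the bytes, a duplicate upload costs no extra storage, which is the property the paragraph above attributes to content addressing.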

Such a distributed file storage system is necessarily related to a blockchain network. A blockchain is a distributed data processing technology in which all users participating in the network store, in distributed form, all of the data to be managed.

A blockchain network is also called 'Distributed Ledger Technology (DLT)' or a 'public transaction ledger', because the ledger containing transaction information about the data is not held by the transacting party or a specific institution but is shared among all network participants. The name blockchain derives from the fact that blocks containing transaction records are linked together like a chain.

Blockchain is a technology for preventing attacks such as forgery or alteration of transaction records: the transaction history is sent to all users participating in the transaction and is cross-checked at every transaction, which prevents data from being falsified.

As prior art related to such a blockchain network, Korean Patent Registration No. 10-230191 discloses a 'method and server for updating a block of a blockchain for content'.

The above prior art relates to a block update method and server in which, in order to provide content to a user terminal, the server divides the content into a plurality of IPFS files, generates a file hash for each of the IPFS files, registers a plurality of user terminals in the IPFS network and thereby generates hash meta information on the registration and the plurality of file hashes, receives from a user terminal a transaction history recording the transfer of the IPFS files between the user terminals based on the hash meta information, and updates a block of the blockchain for the content based on the transaction history.

The above prior art concerns a technique for dividing, storing, and managing through IPFS the entertainment content provided in flight inside an aircraft, and it focuses on distributed management of entertainment content rather than on distributed management and storage of data in general.

Accordingly, there is a growing need to develop a distributed file storage system that enables distributed storage of various kinds of data based on a distributed file system, thereby increasing file transfer speed and improving the security of stored files.

A main object of the present invention is to provide a distributed file storage system that not only stores high-capacity files quickly and efficiently but also uses storage efficiently and improves the security of files.

Another object of the present invention is to improve the security of stored files by subjecting access to and download of a file to a verification process as well.

Yet another object of the present invention is to improve the security of files stored through the present invention by applying a novel encryption scheme.

A further object of the present invention is to strengthen security by allowing an encryption pattern to be generated in the encryption configuration.

A further object of the present invention is to increase transmission speed and decryption efficiency at the same time.

To achieve the above objects, a distributed file system (IPFS)-based distributed file storage system according to the present invention comprises: an encryption module which encrypts an input file; a distributed storage module which divides the encrypted file into a plurality of data fragments and distributes and stores the divided data fragments in a plurality of IPFS nodes interconnected through an IPFS network; and a blockchain management module which forms blocks storing transaction information including the download and upload history of the data, forms a blockchain in which a plurality of the blocks are linked, creates a copy of the blockchain, and transmits the copy to the IPFS nodes.

Furthermore, the encryption module encrypts the input file through an encryption algorithm to generate a hash value, and the transaction information includes the hash value.

In addition, the system performs encryption by inserting void data into the data constituting the file.

In addition, the encryption module includes: a storage capacity determination unit which determines the total capacity of the files stored through the IPFS network; a pattern generation unit which applies a random number to the total capacity to generate a data pattern for the void data to be inserted into the data constituting the file; and an encryption performing unit which inserts the void data into the data constituting the file according to the data pattern. The transaction information includes the data pattern.

In addition, the group setting unit groups the data constituting the file into opening data and at least one item of assigned data subordinate to the opening data, and the encryption performing unit inserts, together with each item of void data, the position information of the opening data of each group.
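The pattern-based void data insertion described above can be sketched as follows. The text does not specify how the total capacity and the random number are combined or what form the data pattern takes, so everything here is an assumption made for illustration: the pattern is taken to be a sorted list of byte offsets, derived by seeding a pseudo-random generator with `total_capacity ^ nonce`, and the void data is a single zero byte. The function names are hypothetical.

```python
import random


def make_pattern(total_capacity: int, nonce: int, data_len: int, count: int) -> list[int]:
    """Derive `count` distinct insertion offsets (assumed scheme, not the patent's)."""
    rng = random.Random(total_capacity ^ nonce)  # assumption: capacity XOR random number
    return sorted(rng.sample(range(data_len + 1), count))


def insert_void(data: bytes, pattern: list[int], void_byte: bytes = b"\x00") -> bytes:
    """Insert one void byte before each offset listed in `pattern`."""
    out = bytearray()
    prev = 0
    for pos in pattern:
        out += data[prev:pos]
        out += void_byte
        prev = pos
    out += data[prev:]
    return bytes(out)


def remove_void(padded: bytes, pattern: list[int]) -> bytes:
    """Strip the void bytes again, given the same pattern."""
    out = bytearray()
    prev = 0
    for k, pos in enumerate(pattern):
        # After k earlier insertions, the k-th void byte sits at pos + k.
        out += padded[prev:pos + k]
        prev = pos + k + 1
    out += padded[prev:]
    return bytes(out)
```

Since the same pattern is needed to strip the void data, recording the pattern in the transaction information, as the text states, is what would let an authorized party restore the original bytes.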

According to the distributed file system (IPFS)-based distributed file storage system of the present invention:

1) files are stored in distributed form over IPFS, so that high-capacity files can be stored quickly and efficiently; duplicates among stored files can be detected, so that storage is used efficiently; and files can be preserved semi-permanently on the IPFS nodes when their preservation is desired;

2) a hash value verification process is added, so that access to and download of a file also pass through verification, improving the security of stored files;

3) files are encrypted by inserting meaningless void data, which prevents an external intruder from accessing a stored file and compromising it, and thereby improves security;

4) the void data can be inserted according to a pattern, which improves encryption performance and decryption efficiency together; and

5) the data constituting a file is grouped, and the data belonging to a group is encoded as opening data, assigned data, and guess data so that the items differ in size rather than all occupying the same capacity, which minimizes the capacity occupied by one group while increasing transmission speed and improving decryption performance.

FIG. 1 is a conceptual diagram showing a schematic configuration of the system of the present invention.
FIG. 2 is a block diagram showing the overall configuration of the system of the present invention.
FIG. 3 is a conceptual diagram showing an example of a group.
FIG. 4 is a conceptual diagram showing another embodiment of a group.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. The accompanying drawings are not drawn to scale, and like reference numerals in each drawing refer to like components.

FIG. 1 is a conceptual diagram showing a schematic configuration of the system of the present invention.

Referring to FIG. 1, the distributed file system (IPFS)-based distributed file storage system of the present invention preferably includes a plurality of IPFS nodes (2) and a main server (1) that performs a control role for the IPFS nodes (2).

The term file, as used here, refers generally to the various files that can be exchanged over the Internet, and covers any file that can be transferred and used regardless of its extension. In other words, any file or data with any extension can be a file, irrespective of its type.

An IPFS node (2), or simply a node, not only stores the data fragments into which a file is divided in the distributed file storage system, but also stores a copy of the blocks containing transaction information, which may include the download or upload history of the file, and of the blockchain in which those blocks are stacked.

The IPFS node (2) refers to any computing device connected to the blockchain network. It may include a copy of the blockchain, a verification engine, a P2P network distribution (broadcast) function, and the like, and lighter clients composed of only a subset of these functions may also be included.

To describe the IPFS node (2) with specific examples: when the consensus algorithm is Proof of Work (PoW), a reference client may mean a node including a blockchain database that stores all or at least some of the blocks of the entire blockchain, and a network routing module that broadcasts transactions to the blockchain network. A full node may mean a node including a blockchain database and a network routing module. A solo miner node may mean a node including a mining module, a blockchain database, and a network routing module. A mining node may mean a lightweight node including a mining module and a gateway router connected to a pool mining node, that is, a node of a mining pool. The IPFS node (2) is therefore the entity that stores the data fragments obtained by dividing the encrypted input file and that stores the blockchain.

Preferably, both the sharer terminal (3) that uploads a file and the requester terminal (4) that downloads a file may act as IPFS nodes (2) and take part in the distributed storage of data fragments. Alternatively, only some of the sharer terminals (3) and requester terminals (4) may act as IPFS nodes (2), or separate IPFS nodes (2) dedicated to storing data fragments may exist.

The sharer terminal (3) and the requester terminal (4) may be personal computers capable of uploading and downloading files, or may be tablet PCs or smartphones.

The IPFS node (2) may additionally verify an input file against forgery or alteration, and this verification may be carried out using the hash value. In this hash value verification, the hash values contained in the blockchain stored by each IPFS node (2) are compared with one another, and if they match, it is determined that the file has not been forged or altered.
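The comparison of hash values across nodes can be sketched as below. The text only states that matching hash values indicate an unaltered file; reporting which nodes disagree with the most common value is an added assumption for illustration, and `check_hash_agreement` is a hypothetical name.

```python
from collections import Counter


def check_hash_agreement(hash_by_node: dict[str, str]) -> tuple[bool, list[str]]:
    """Compare the file hash recorded by each node.

    Returns (all_match, dissenting_nodes). Treating the most common hash as
    the reference value is an assumption, not something the text specifies.
    """
    if not hash_by_node:
        raise ValueError("no node reports to compare")
    counts = Counter(hash_by_node.values())
    majority_hash = counts.most_common(1)[0][0]
    dissenting = sorted(node for node, h in hash_by_node.items() if h != majority_hash)
    return len(counts) == 1, dissenting
```

A caller would treat the file as unaltered only when the first element of the result is true.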

The main server (1) controls and manages the IPFS nodes (2), and preferably serves as the peer leader. Accordingly, the main server (1) may be whichever of the IPFS nodes (2) plays the role of peer leader, in which case the main server (1) may be a rebound node. Alternatively, a separate main server (1) may be provided to manage the IPFS nodes (2). The main server (1) is therefore not otherwise limited, as long as it can form the IPFS network and act as the peer leader.

In other words, in the distributed file storage system of the present invention, the main server (1) is the entity that encrypts files, divides them, and stores them in distributed form on the IPFS nodes (2).

The main server (1) therefore forms the IPFS network, builds an IPFS cluster whose members are the IPFS nodes (2), and stores the data fragments obtained by dividing a file in distributed form on the IPFS nodes (2) included in that cluster.

In addition, the main server (1) performs active monitoring, which determines the operating state of each IPFS node (2) (whether it can transmit data), and passive monitoring, which observes the upload and download status of data. Through both, it controls and monitors the IPFS nodes (2), the distributed storage of data through them, and the upload and download of data.

The main server (1) is hardware equipped with a CPU and storage means, together with a communication unit and transmission means for communication. The series of modules described below, and their specific functions, are realized by software executed on this CPU.

That is, programs executable by the central processing unit (CPU), namely software, are installed and run on the main server (1), which is built on hardware comprising the CPU, memory, and storage means such as a hard disk. The specific components of this software are described below in terms of the structural units 'module', 'unit', and 'interface'.

The main server (1) may include a processor, as well as a RAM (Random Access Memory, not shown) and a ROM (Read-Only Memory, not shown) that temporarily and/or permanently store the signals (or data) processed within it. The main server (1) may also be implemented as a system on chip (SoC) including at least one of a graphics processing unit, RAM, and ROM.

The processor may include one or more cores (not shown), a graphics processing unit (not shown), and/or a connection path (for example, a bus) for exchanging signals with other components.

The memory may store programs (one or more instructions) for executing and controlling the modules and units described below. The programs stored in the memory may be divided into a plurality of modules according to function.

The steps of a method or algorithm described in connection with the embodiments of the present invention may be implemented directly in hardware, in a software module executed by hardware, or in a combination of the two. A software module may reside in RAM (Random Access Memory), ROM (Read-Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any other form of computer-readable recording medium well known in the art to which the present invention pertains.

That is, the components of the present invention may be implemented as a program (or application) and stored in a medium so as to be executed in combination with a computer, which is hardware. The components may be implemented through software programming or software elements; similarly, embodiments may be implemented in a programming or scripting language such as C, C++, Java, or assembler, with the various algorithms implemented as combinations of data structures, processes, routines, or other programming constructs. Functional aspects may be implemented as algorithms running on one or more processors.

Such a 'module', 'unit', or 'interface' means software that is installed and stored in the storage means of the main server (1) and executed via the CPU and memory, or a hardware component such as an FPGA or ASIC. The terms 'module', 'unit', and 'interface' are not limited to hardware; such a component may be configured to reside in an addressable storage medium, or may be configured to run on one or more processors.

As an example, a 'module', 'unit', or 'interface' includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables.

The functions provided by such 'modules', 'units', or 'interfaces' may be combined into a smaller number of components and 'units' or 'modules', or further separated into additional components and 'units' or 'modules'.

The detailed configuration and functions of the distributed file system (IPFS)-based distributed file storage system are described below on the basis of this overall configuration.

FIG. 2 is a block diagram showing the overall configuration of the system of the present invention.

Referring to FIG. 2, the distributed file system (IPFS)-based distributed file storage system of the present invention includes an encryption module (100), a distributed storage module (200), and a blockchain management module (300).

The encryption module (100) encrypts an input file. The file may be input by any one of the IPFS nodes (2) included in the IPFS network, or by another terminal. The terminal that inputs the file may be referred to specifically as the sharer terminal (3).

That is, the encryption module (100) encrypts the file input from the sharer terminal (3). The sharer terminal (3) may itself be one of the IPFS nodes (2) included in the IPFS network, or it may not perform the function of an IPFS node (2).

The method used to encrypt the input file is not limited, so the input file may be encrypted using any of various known encryption algorithms.

Known encryption algorithms here include public-key encryption and secret-key encryption, and encryption may also be performed by various other methods. Preferably, when encryption is performed by a public-key method, each IPFS node (2) may hold a cryptographically related pair of a private key and a public key, and the file is encrypted with the private key.

Various other known algorithms, such as the hash functions MD5 (Message-Digest Algorithm 5) and SHA (Secure Hash Algorithm), may of course also be applied.

More preferably, the file may be encrypted through such an algorithm to generate a hash value. A hash value is a cipher-like number that condenses the characteristics of a file in order to prove that a copy of digital evidence is identical to the original, and it is generally computed as a hexadecimal code. As long as the uploaded file is not forged or altered, its hash value does not change. The hash value can thus be regarded as a unique electronic fingerprint of each file, through which authenticity can be checked and forgery or alteration of the file identified.

Therefore, more preferably, when the file is input from the sharer terminal (3), the file is encrypted using the private key of the sharer terminal (3), and a hash value is generated accordingly.
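The hash value described above, a hexadecimal fingerprint that changes whenever the file changes, can be illustrated with Python's standard `hashlib`. SHA-256 is chosen here only as one member of the SHA family named in the text; the function name `file_fingerprint` is hypothetical, and the private-key encryption step is outside the scope of this sketch.

```python
import hashlib


def file_fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the file contents as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_hash: str) -> bool:
    """Check the current contents against a previously recorded hash value."""
    return file_fingerprint(data) == recorded_hash


original = b"example file contents"
recorded = file_fingerprint(original)
print(is_unaltered(original, recorded))         # contents unchanged
print(is_unaltered(original + b"!", recorded))  # contents modified
```

Recording `recorded` in the transaction information is what would later allow a downloaded file to be checked against the ledger.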

The distributed storage module (200) divides the encrypted file into a plurality of data fragments and stores the fragments in a distributed manner across a plurality of IPFS nodes (2) interconnected through the IPFS network.

The IPFS network is formed by the main server (1) described above. The data fragments produced by dividing the file are distributed and stored in the individual IPFS nodes (2) that belong to the IPFS network built by the main server (1) and form an IPFS cluster.

Distributed storage through the IPFS nodes (2) is carried out by P2P communication between the IPFS nodes (2), without a central database performing the storage function. Furthermore, when the uploaded file is encrypted and a hash value is generated as described above, the file associated with the generated hash value is divided into a plurality of data fragments, and these fragments are distributed and stored in the respective IPFS nodes (2).
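The division and distribution step can be sketched as follows. This is only an illustration under stated assumptions: fixed-size fragments and round-robin placement are choices made for the example, since the description does not specify how fragments are sized or assigned to nodes, and the function names and node identifiers are hypothetical.

```python
from collections import defaultdict


def split_into_fragments(data: bytes, fragment_size: int) -> list[bytes]:
    """Cut the (already encrypted) file into consecutive fixed-size fragments."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    return [data[i:i + fragment_size] for i in range(0, len(data), fragment_size)]


def distribute(fragments: list[bytes], node_ids: list[str]) -> dict:
    """Assign fragments to nodes in round-robin order (an assumed policy).

    Each stored entry keeps the fragment index so the order can be recovered.
    """
    placement = defaultdict(list)
    for index, fragment in enumerate(fragments):
        node = node_ids[index % len(node_ids)]
        placement[node].append((index, fragment))
    return dict(placement)


fragments = split_into_fragments(b"abcdefghij", 4)
placement = distribute(fragments, ["node-1", "node-2"])
```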

The blockchain management module (300) forms blocks in which transaction information is stored, links a plurality of blocks into a blockchain, and transmits a copy of the blockchain to the IPFS nodes (2).

The transaction information basically includes the upload history and download history of files uploaded to and downloaded from the file distributed storage system of the present invention. It also includes the hash value generated by encrypting the file and, when the encryption was performed with the private key of the sharer terminal (3), the public key of the sharer terminal (3) corresponding to that private key.

In other words, the transaction information stores the download and upload history of the file, the hash value generated by encryption, and, when the encryption was performed with a private key, the corresponding public key. It additionally includes information on the locations where the data fragments are stored and on the other IPFS nodes (2) that hold other data fragments of the same file.

Transaction information is gathered to form a block: each block consists of at least one item of transaction information, and a blockchain is formed when a plurality of such blocks are linked. Because the block size is not limited, dynamic block sizes can be supported, and the number of transaction records stored in one block may vary with the block size. A plurality of transaction records thus make up one block, and the blocks accumulate and are linked to one another to form the blockchain. The original ledger of the blockchain is preferably stored in the main server (1), and copies are transmitted to each IPFS node (2) in the IPFS network so that the files can be stored and managed in a distributed manner.
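A minimal sketch of blocks holding a variable number of transaction records and linked into a chain is given here. It assumes the conventional linking scheme in which each block stores the hash of its predecessor; the description does not state how blocks are linked, and the field names (`previous_hash`, `transactions`, `file_hash`) are hypothetical.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's canonical JSON form with SHA-256."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, transactions: list) -> dict:
    """Append a block holding one or more transaction records to the chain."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    block = {
        "index": len(chain),
        "previous_hash": previous,
        "transactions": transactions,  # variable length: no fixed block size
    }
    chain.append(block)
    return block


def is_consistent(chain: list) -> bool:
    """Check that every block refers to the hash of the block before it."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


ledger = []
append_block(ledger, [{"type": "upload", "file_hash": "abc"}])
append_block(ledger, [{"type": "download", "file_hash": "abc"},
                      {"type": "download", "file_hash": "abc"}])

# A copy, as would be sent to each node, made by serializing the original ledger.
ledger_copy = json.loads(json.dumps(ledger))
```

Because each block commits to its predecessor, a node holding a copy can detect a modified earlier block by re-running the consistency check.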

Through the distributed file system (IPFS)-based file distributed storage system of the present invention, large files can therefore be stored quickly and efficiently, duplication of stored files can be detected so that storage is used efficiently, and files can be preserved semi-permanently on the IPFS nodes (2) when preservation is desired.

In addition, to restore a file from the data fragments stored in the IPFS nodes (2), the system of the present invention may include a hash restoration module (400) and a file restoration module (500).

The hash restoration module (400) restores the hash value stored in the transaction information using the private key of the sharer terminal (3). As described above, the hash value is recorded in the transaction information and the data fragments obtained by dividing the file are distributed and stored in the IPFS nodes (2); the module receives the private key that was used for encryption and uses it to restore the hash value. In other words, restoring the hash value here means finding out the hash value by means of the private key.

The file restoration module (500) restores the file by gathering the data fragments distributed and stored over IPFS on the basis of the restored hash value. In other words, it collects the data fragments that have the same hash value, that is, the same fingerprint: from the information on the data fragments stored in the IPFS nodes (2), it collects only the fragments with the specific hash value and combines them to restore the file.
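The collection-and-merge step can be sketched as follows. The sketch assumes that each stored fragment is tagged with the hash value of its file and with an ordinal index; the description only says that fragments with the same hash value are collected, so the index and the name `restore_file` are hypothetical additions needed to put the fragments back in order.

```python
def restore_file(stored: dict, file_hash: str) -> bytes:
    """Collect every fragment tagged with file_hash from all nodes and join them.

    `stored` maps a node id to a list of (file_hash, index, chunk) tuples.
    """
    matching = [
        (index, chunk)
        for node_fragments in stored.values()
        for (fragment_hash, index, chunk) in node_fragments
        if fragment_hash == file_hash
    ]
    matching.sort(key=lambda item: item[0])  # restore the original order
    return b"".join(chunk for _, chunk in matching)


stored = {
    "node-1": [("h1", 0, b"ab"), ("h2", 0, b"zz"), ("h1", 2, b"ef")],
    "node-2": [("h1", 1, b"cd")],
}
restored = restore_file(stored, "h1")
```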

Furthermore, when the party that entered the private key to restore the hash value is the requester terminal (4) that requested download of the file, the restored file can be transmitted to the requester terminal (4). A requester terminal (4) that knows the private key can thus easily download the file with that key.

In the file distributed storage system of the present invention, which encrypts files before storing them, these restoration and transmission functions make access to and download of a file subject to a verification step as well, which increases the security of the stored files.

Furthermore, to further increase security, the file distributed storage system of the present invention may apply a novel encryption method in which void data is inserted into the data constituting the input file.

Void data here means empty data that carries no meaning. It generally consists of a bit string filled with 0s or 1s and has no meaning of its own. Inserting void data not only encrypts the input file but also serves to keep the transmission speed at a constant level during repeated transmission and reception of files between the IPFS nodes (2).

Encrypting the file by inserting such meaningless void data thus makes it possible to prevent an external intruder from accessing and hacking the files stored in the distributed storage system of the present invention.

The void data may be inserted irregularly, without any pattern such as a fixed period, but preferably it is inserted according to a pattern, which requires a component that generates the pattern in which the void data is to be inserted. To this end, the encryption module (100) may include a storage capacity determining unit (110), a pattern generating unit (120), and an encryption performing unit (130).

The storage capacity determining unit (110) determines the total capacity of the files stored through the IPFS network. This is not the total capacity of the data fragments stored in an individual IPFS node (2) but the total capacity of all files that are distributed, stored, and managed through the IPFS network of the present invention.

The capacity may generally be expressed in MB, GB, or TB, and preferably in GB or TB, which are the units mainly used for capacity management.

The pattern generating unit (120) generates the data pattern of the void data to be inserted into the data by applying a random number to the total file capacity determined by the storage capacity determining unit (110).

The data pattern comprises the period at which void data is inserted into the data constituting the file and the number of void data items inserted; the period specifies after how many data items void data is inserted. For example, 'insert void data after every XX data items' can serve as the period information, and the way in which the period information is set is not limited.

The number means the number of void data items actually inserted in each period. It may be specified as, for example, 'insert two void data items in the first period and three in the second period', or, most simply, the same number may be specified for every period. Neither the way of specifying the number nor the number itself is limited.

As stated, the data pattern of the void data is generated according to the total capacity of the files stored through the file distributed storage system of the present invention. Preferably, the larger the capacity, the fewer void data items are inserted, and the smaller the capacity, the more freely the number of inserted void data items may vary, which prevents overload caused by the stored file capacity growing too large. Because a random number is also applied, the encryption performance can be further improved. The random number is most preferably obtained from a random function, that is, the rand() function, but neither the generation method nor the range is limited.
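One way to read this capacity-dependent pattern generation is sketched here. The mapping from capacity to the ceiling on the void data count, the period range of 4 to 16, and the default `max_count` of 8 are all assumptions made for illustration; the description fixes only the tendency (larger capacity, fewer void data items) and leaves the method open.

```python
import random


def generate_pattern(total_capacity_tb: float, rng: random.Random,
                     max_count: int = 8) -> dict:
    """Draw a (period, count) data pattern from the stored capacity and a random number.

    The ceiling on `count` shrinks as the stored capacity grows, so large
    systems insert fewer void data items (an assumed, illustrative rule).
    """
    ceiling = max(1, int(max_count / (1.0 + total_capacity_tb)))
    return {
        "period": rng.randint(4, 16),      # data items between insertions
        "count": rng.randint(1, ceiling),  # void data items per insertion
    }


large = generate_pattern(408.0, random.Random(1))
small = generate_pattern(0.127, random.Random(1))
```

With 408 TB stored the ceiling collapses to one void data item per period, while with 0.127 TB the count may range from one to seven.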

Finally, the encryption performing unit (130) inserts void data into the data constituting the file according to the data pattern generated by the pattern generating unit (120), that is, according to the set period and number, and thereby encrypts the file.
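The insertion step, and its reversal when the pattern is known, can be sketched as follows. The sketch treats each byte as one data item, uses zero bytes as void data, and inserts only after each complete period; these are assumptions for the example, and `insert_void` and `remove_void` are hypothetical names. The `counts` list is cycled, which covers both a fixed count per period and alternating counts such as two, then three.

```python
from itertools import cycle


def insert_void(data: bytes, period: int, counts: list[int], filler: int = 0) -> bytes:
    """Insert counts[k] filler bytes after the k-th complete period of data bytes."""
    out = bytearray()
    count_iter = cycle(counts)
    for start in range(0, len(data), period):
        unit = data[start:start + period]
        out += unit
        if len(unit) == period:  # a trailing partial period gets no void data
            out += bytes([filler]) * next(count_iter)
    return bytes(out)


def remove_void(padded: bytes, period: int, counts: list[int]) -> bytes:
    """Undo insert_void by skipping the positions dictated by the same pattern."""
    out = bytearray()
    count_iter = cycle(counts)
    pos = 0
    while pos < len(padded):
        unit = padded[pos:pos + period]
        out += unit
        pos += len(unit)
        if len(unit) == period:
            pos += next(count_iter)  # skip the void bytes that follow this period
    return bytes(out)


padded = insert_void(b"abcdefgh", period=3, counts=[2, 3])
recovered = remove_void(padded, period=3, counts=[2, 3])
```

Removal relies on the pattern alone rather than on the filler value, so real zero bytes in the file are not mistaken for void data.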

Applying this novel method makes it possible to encrypt the files stored in the file distributed storage system of the present invention and thus to strengthen security.

When encryption is performed in this way, the transaction information includes not only the download history, upload history, and hash value of the files stored in a distributed manner, but also the data pattern of the void data inserted into the data constituting the file, so that the pattern can be used for decryption.

In other words, information on the data pattern, namely the period and number of void data insertions, is included in the transaction information and used for decryption, which improves the efficiency of data encryption and decryption, while the novel encryption method of inserting void data according to the data pattern strengthens security.

FIG. 3 is a conceptual diagram illustrating an example of a group.

Referring to FIG. 3, the data constituting a file stored in the file distributed storage system of the present invention is grouped and void data is inserted between the grouped data, so that the insertion positions of the void data can be standardized. To this end, the encryption module (100) may include a group setting unit.

The group setting unit generates a plurality of groups by grouping the data constituting the file into opening data and at least one item of assigned data subordinate to the opening data.

Here, the data constituting the file is arranged in chronological order of creation, and the data located at every predetermined time interval is designated (that is, encoded) as opening data.

Then, taking any one opening data item as a reference, the one or more data items located in chronological order up to the next opening data item are each designated (that is, encoded) as assigned data and made subordinate to the reference opening data. This forms a group comprising one opening data item and at least one assigned data item subordinate to it.

This arrangement is similar to the scheme used in the MPEG video format. Data is bundled into a group at every specific time interval or in a specific number; the data located earliest in time within the group is designated as the opening data, the remaining data in the group is designated as assigned data, and the assigned data is made subordinate to the opening data that precedes it in time, thereby creating one group.

For example, since one group consists of one opening data item and a plurality of assigned data items as described above, the group setting unit may create a group such as 'OAAAA'. The length of the group and the number of data items included in the group may be set by the system administrator.
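The 'OAAAA' labeling can be reproduced with a short sketch. It assumes groups are formed by a fixed number of data items rather than by time interval (the description allows either), and the function names are hypothetical.

```python
def label_groups(item_count: int, group_length: int) -> str:
    """Label items in order: 'O' opens each group, 'A' marks the items after it."""
    return "".join(
        "O" if i % group_length == 0 else "A"
        for i in range(item_count)
    )


def opening_positions(item_count: int, group_length: int) -> list[int]:
    """Zero-based positions of the opening data items."""
    return list(range(0, item_count, group_length))


labels = label_groups(12, 5)
positions = opening_positions(12, 5)
```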

When groups are formed in this way, the encryption performing unit (130) inserts, together with each inserted void data item, the position information of the opening data of each group. Since this shows into which opening data of which group each void data item was inserted, the position of the inserted void data can also be determined, which enables decryption.

The position information of the opening data indicates the ordinal position of that opening data among the data constituting the file. As void data is inserted in each period, the positions of the opening data are inevitably shifted; tracking this shift also provides additional encryption, which increases security against hacking.
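How far an opening data item shifts can be worked out directly under a simplifying assumption: a fixed number of void data items is inserted after every complete period, and positions are counted from zero. The function name `shifted_position` is hypothetical.

```python
def shifted_position(original_index: int, period: int, count: int) -> int:
    """Position of a data item after void insertion.

    Every complete period that precedes the item pushes it back by `count`
    positions, so the shift is (original_index // period) * count.
    """
    return original_index + (original_index // period) * count


# Opening data at original positions 0, 5 and 10 (groups of five items),
# with two void data items inserted after every three data items.
new_positions = [shifted_position(i, period=3, count=2) for i in (0, 5, 10)]
```

Storing either the original or the shifted positions alongside the pattern lets the decrypting side locate each group again.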

FIG. 4 is a conceptual diagram illustrating another embodiment of the group.

Referring to FIG. 4, in grouping the data constituting the file, the group setting unit may set groups that include opening data, assigned data, and supposition data.

As described above, the data constituting the file is arranged in chronological order of creation and the data located at every predetermined time interval is designated (that is, encoded) as opening data, so that the first data item of each group is the opening data. Preferably, the opening data may occupy the largest capacity among the data constituting the file.

The assigned data is subordinate to the corresponding opening data, and in this embodiment the last data item of the group is designated as the assigned data. That is, only the data item located immediately before the next opening data is designated as assigned data. The assigned data may occupy less capacity than the opening data.

In this case, the remaining data located between the opening data and the assigned data in chronological order is encoded as supposition data, which is the lowest-level data, subordinate to both the opening data and the assigned data.

Supposition data is sandwiched between the opening data and the assigned data and has the property that, when the file is transmitted, it can refer to both of them and be handled by estimating the movement of the opening data and the assigned data. The supposition data may occupy less capacity than either the opening data or the assigned data.

The data constituting the file is thus grouped and the data in each group is encoded as opening data, assigned data, or supposition data, so that the data items occupy different capacities rather than the same capacity. This minimizes the capacity occupied by one group and, in turn, helps keep transmission smooth. The length of the group and the number of data items included in the group may be set by the system administrator.
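The three-way labeling of this embodiment can be sketched as follows, using 'O' for opening data, 'S' for supposition data, and 'A' for assigned data. The letter 'S' and the handling of a short trailing group are assumptions made for the example; the description does not say how an incomplete final group is treated.

```python
def label_group(length: int) -> str:
    """Label one group: first item 'O', last item 'A', everything between 'S'."""
    if length == 1:
        return "O"  # a lone trailing item can only be opening data
    return "O" + "S" * (length - 2) + "A"


def label_groups_osa(item_count: int, group_length: int) -> list[str]:
    """Split item_count items into consecutive groups and label each group."""
    labels = []
    for start in range(0, item_count, group_length):
        length = min(group_length, item_count - start)
        labels.append(label_group(length))
    return labels


groups = label_groups_osa(12, 5)
```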

When groups are formed in this way, the encryption performing unit (130) inserts the position information of the assigned data together with the position information of the opening data of each group. This makes it possible to confirm into which opening data of which group each void data item was inserted and to determine the exact size of the group from the position information of the assigned data, thereby maximizing decryption efficiency.

In addition, the encryption module (100) may correct the generated data pattern to further increase security. To this end, the encryption module (100) may include a comparison value calculation unit (140) and a pattern correction unit (150). When the data pattern is corrected in this way, the encryption performing unit (130) may encrypt the data constituting the file by inserting void data according to the corrected data pattern.

The comparison value calculation unit (140) calculates a comparison value that reflects the total capacity of the files stored using the distributed storage system of the present invention, the disk allocation size, the opening data, and the random number described above.

The method by which the comparison value calculation unit (140) calculates the comparison value is not limited, so the comparison value may be calculated by simply adding or multiplying the total capacity of the stored files, the disk allocation size, and the number of opening data items. Most preferably, however, the correction value is calculated on the basis of Equation 1 below.

Equation 1,

[Equation 1: formula image not reproduced in the source text]

(where S is the correction value, c is the total capacity of the files stored using the distributed storage system, d is the disk allocation size of the files stored using the distributed storage system, o is the total number of opening data items, and n is a random number generated using the linear congruential method within a range that is likewise given only as an image)

As noted, c and d, the total capacity and the disk allocation size of the files stored using the distributed storage system, are basically expressed in TB. For example, c = 408 when the capacity is 408 TB, and c = 0.127 when it is 130 GB.
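The unit handling in this example can be checked with a small sketch. It assumes binary conversion (1 TB = 1024 GB), which is what reproduces the stated value of 0.127 for 130 GB; a decimal conversion would give 0.13 instead. The table and function names are hypothetical.

```python
# Number of each unit contained in one TB, using binary multiples (an assumption).
UNITS_PER_TB = {"MB": 1024 * 1024, "GB": 1024, "TB": 1}


def to_terabytes(value: float, unit: str) -> float:
    """Convert a capacity in MB, GB or TB to TB."""
    return value / UNITS_PER_TB[unit]


c_large = to_terabytes(408, "TB")
c_small = round(to_terabytes(130, "GB"), 3)
```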

The total capacity of the stored files and the disk allocation size may be equal, but they may also differ. This reflects the fact that the size of a file and the size allocated to it on disk can differ, because the data constituting a file is stored in units of clusters.

The disk allocation size depends on the cluster size, and the cluster size is related to the number of groups described above. A cluster is a collection of groups; the unit size of a cluster varies with the number and size of the groups, and the disk allocation size varies with the size and number of the clusters.

Since the total capacity of the files stored through the distributed storage system may equal the disk allocation size or differ from it, the average of these two values is reflected after taking its natural logarithm. Preferably, the comparison value tends to decrease as the capacity and disk allocation size increase and to increase as they decrease.

The geometric mean of the capacity and the disk allocation size is used here. Because both are variables that change as the data changes, the geometric mean is considered more appropriate than the arithmetic mean for representing their average.
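The difference between the two means, and the natural logarithm applied to the geometric mean, can be illustrated numerically. This sketch is not Equation 1, which is not reproduced in the text; it only computes the intermediate quantity the description names, using the sample values of c and d that appear in the worked example.

```python
import math


def arithmetic_mean(c: float, d: float) -> float:
    return (c + d) / 2.0


def geometric_mean(c: float, d: float) -> float:
    return math.sqrt(c * d)


c, d = 1331.2, 1245.4  # sample values from the worked example, in TB
am = arithmetic_mean(c, d)
gm = geometric_mean(c, d)
log_gm = math.log(gm)  # natural logarithm of the geometric mean
```

For positive values the geometric mean never exceeds the arithmetic mean, and its logarithm equals the average of the individual logarithms, which is one reason it suits quantities that vary multiplicatively.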

In addition, o is the total number of opening data items. Because it is the total number of opening data items set for all data stored in the DB in use, which may amount to big data, it can take a very large value. The common logarithm of the number of opening data items is therefore taken to compensate for this.

Finally, n is a random number generated using the linear congruential method, within a range that is given only as an image and is not reproduced in the text. The linear congruential method is the most widely used pseudorandom number method and, because it is very fast to compute, has been widely used in computers since their early days. It is the kind of value obtained from a commonly used random function, that is, the rand() function. A random number is generated within the specified range and reflected in the calculation of the comparison value to improve encryption performance.

Calculating the correction value through Equation 1 thus improves encryption performance by reflecting a random number and, at the same time, reflects the total capacity of the stored files, the disk allocation size, and the number of opening data items, so that the null packets can be corrected using the inherent properties of the data.

Furthermore, the numerator reflects the sum of the total data capacity and the disk allocation size after the inverse hyperbolic sine function is applied. This reflects the importance of big data and, rather than increasing the weight linearly with the amount of data, allows a correction through the hyperbolic function that reflects a nonlinear data load.

For example, when c = 1331.2, d = 1245.4, o = 10000, and n = 5, the correction value can be calculated as follows.

[Worked calculation: formula image not reproduced in the source text]

The pattern correction unit (150) corrects the data pattern according to the calculated comparison value. Preferably, the higher the comparison value, the more the data pattern is corrected (both the period and the number are changed), and the lower the comparison value, the more closely the data pattern is kept to the existing one. The data pattern may be corrected automatically, or the system administrator may correct it directly.

For example, when the calculated comparison value is 5 or less, the data pattern is not corrected; when it is between 5 and 15, only the period is changed; and when it is 15 or more, both the period and the number are changed, with the number of inserted void data items increased further as the comparison value grows.
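The threshold rule in this example can be sketched as follows. The thresholds 5 and 15 come from the description; how much the period changes and how fast the count grows are not specified, so the increments used here (period plus one, count plus one per full multiple of 15) are purely illustrative assumptions, as is the function name.

```python
def correct_pattern(pattern: dict, comparison_value: float) -> dict:
    """Return a corrected copy of a {'period', 'count'} data pattern."""
    corrected = dict(pattern)
    if comparison_value <= 5:
        return corrected                      # no correction
    corrected["period"] = pattern["period"] + 1   # assumed period adjustment
    if comparison_value >= 15:
        # Assumed rule: the count grows with the size of the comparison value.
        corrected["count"] = pattern["count"] + int(comparison_value // 15)
    return corrected


base = {"period": 8, "count": 2}
unchanged = correct_pattern(base, 3)
period_only = correct_pattern(base, 10)
both = correct_pattern(base, 45)
```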

As described so far, the configuration and operation of the distributed file system (IPFS)-based file distributed storage system according to the present invention have been presented in the above description and drawings. These are merely examples, however; the spirit of the present invention is not limited to the above description and drawings, and various changes and modifications are possible without departing from the technical spirit of the present invention.

1: main server 2: IPFS node
3: sharer terminal 4: requester terminal
100: encryption module 110: storage capacity identification unit
120: pattern generating unit 130: encryption performing unit
140: comparison value calculation unit 150: pattern correction unit
200: distributed storage module 300: blockchain management module
400: hash restoration module 500: file restoration module

Claims

A distributed file system (IPFS)-based file distributed storage system, comprising:
an encryption module that performs encryption by inserting void data into the data constituting an input file, the encryption module including a storage capacity identification unit that identifies the total capacity of the files stored through the IPFS network, a pattern generating unit that generates a data pattern of the void data by applying a random number to the total capacity, and an encryption performing unit that inserts the void data into the data constituting the file according to the data pattern;
a distributed storage module that divides the encrypted file into a plurality of data fragments and distributes and stores the divided data fragments in a plurality of IPFS nodes interconnected through the IPFS network; and
a blockchain management module that forms blocks storing transaction information including the download history and upload history of the data and the data pattern, forms a blockchain in which a plurality of the blocks are connected, creates a copy of the blockchain, and transmits the copy to the IPFS nodes,
wherein the encryption module includes a group setting unit that generates a plurality of groups by grouping the data constituting the file into opening data, which is the first data of a group; assigned data, which is subordinate to the opening data and is the last data of the group; and at least one supposition data, which is subordinate to both the opening data and the assigned data and is positioned between the opening data and the assigned data, and
wherein the encryption performing unit includes a function of inserting, together with each inserted void data, the position information of the opening data of each group and the position information of the assigned data.

The file distributed storage system of claim 1,
wherein the encryption module encrypts the input file through an encryption algorithm to generate a hash value, and
wherein the transaction information includes the hash value.

The file distributed storage system of claim 2,
wherein the encryption module encrypts a file input from a sharer terminal using a private key of the sharer terminal,
wherein the transaction information includes a public key corresponding to the private key, and
wherein the system further comprises:
a hash restoration module that restores the hash value stored in the transaction information using the private key; and
a file restoration module that restores the file from the data fragments distributed and stored in the IPFS nodes based on the hash value.

The file distributed storage system of claim 1,
wherein the encryption module includes a comparison value calculation unit that calculates a comparison value according to the capacity of the stored files, the disk allocation size, and the number of opening data, and
a pattern correction unit that corrects the data pattern according to the comparison value, and
wherein the encryption performing unit inserts the void data according to the corrected data pattern.

The file distributed storage system of claim 4,
wherein the comparison value calculation unit calculates the comparison value based on the following Equation 1.
Equation 1:

[Equation image not reproduced]

(where S is the correction value, c is the total capacity of the files stored using the distributed storage system, d is the disk allocation size of the files stored using the distributed storage system, o is the total number of opening data, and n is a random number generated by the linear congruential method)

(Deleted)