KR20210058613A

KR20210058613A - Locking method for parallel I/O of a single file in non-volatile memory file system and computing device implementing the same

Info

Publication number: KR20210058613A
Application number: KR1020200023715A
Authority: KR
Inventors: 박성용; 김준형; 김장웅; 김영재
Original assignee: 서강대학교산학협력단
Priority date: 2019-11-13
Filing date: 2020-02-26
Publication date: 2021-05-24

Abstract

The present invention relates to a method of operating a computing device operated by at least one processor. The method comprises the steps of: receiving an input/output (hereinafter 'I/O') request for an arbitrary file; constructing an I/O request node that includes the data pages covered by the I/O request, and inserting the I/O request node into an interval tree; querying the interval tree to check whether another I/O request node that includes a data page of the inserted I/O request node exists; and, when the I/O request node whose range overlaps that of the inserted I/O request node has completed and been deleted, acquiring a lock on the data pages covered by the I/O request and performing the I/O request. The interval tree includes a plurality of I/O request nodes, and each I/O request node has child nodes according to whether the data pages targeted by an I/O request are included. Accordingly, performance can be optimized and overhead can be reduced.

Description

Locking method for parallel reading/writing of a single file, and computing device implementing the same {LOCKING METHOD FOR PARALLEL I/O OF A SINGLE FILE IN NON-VOLATILE MEMORY FILE SYSTEM AND COMPUTING DEVICE IMPLEMENTING THE SAME}

The present invention relates to a locking technique for parallel reading/writing of a single file.

Recently, servers have come to have more than 100 cores, and each core generates many read/write (Input/Output, hereinafter 'I/O') operations, so the need for high-performance I/O in multi-core and many-core environments has emerged. Specifically, the I/O processing of high-performance databases and key-value applications is being designed with multi-core environments in mind.

Non-volatile memory (NVM) is mounted on the memory bus, is byte-addressable, and provides I/O latency similar to that of DRAM. In addition, whereas conventional block-based devices have a throughput of megabytes per second, non-volatile memory has a throughput of gigabytes per second.

Furthermore, a conventional block-device-based file system accesses the page cache first when accessing data in order to minimize disk access, which incurs overhead. A memory-based file system does not use the page cache and can therefore reduce that overhead.

Non-volatile-memory-based file systems exploit the byte addressability of memory to guarantee the consistency of a transaction with a small amount of writing, and can therefore reduce overhead.

However, they have the drawback that parallel I/O is impossible when multiple threads perform I/O operations on the same file. For a write operation, the file system layer uniformly locks the entire file, so all writes are serialized. For a read operation, performance is severely degraded by the file's single read counter variable.

A range locking technique proposed for the NOVA file system exists for parallel I/O on a single file, but it can incur a large overhead because all linked pages must be acquired in order to read or write a single page.

Therefore, a method is needed that enables parallel I/O on a single file in a non-volatile-memory-based file system by locking only a portion of the file rather than the entire file.

The problem to be solved is to provide a method that, in order to lock only the portion where I/O requests overlap rather than the entire file, determines on the basis of an interval tree structure whether overlapping write requests exist at data-page granularity, and locks only the overlapping portion.

Another problem to be solved is to provide a method that determines whether overlapping I/O requests exist by using segment-range-based locking, which tracks the usage status of data pages, and locks only the overlapping portion.

A method of operating a computing device operated by at least one processor according to one embodiment includes: receiving a read/write (hereinafter 'I/O') request for an arbitrary file; constructing an I/O request node including the data pages included in the I/O request, and inserting the I/O request node into an interval tree; querying the interval tree to check whether there is an I/O request node that includes a data page included in the I/O request node; and, when an I/O request node whose range overlaps that of the I/O request node completes its work and is deleted, acquiring a lock on the data pages included in the I/O request and performing the I/O request. The interval tree includes a plurality of I/O request nodes, and each I/O request node has child nodes according to whether the data pages targeted by an I/O request are included.

The checking may traverse the interval tree in order (in-order traversal), from the lower left to the lower right.

The I/O request node may further include an upper limit value, which is the largest value among the data pages included in its child nodes.

In the checking, if the upper limit value included in an arbitrary I/O request node is smaller than the upper limit value of the I/O request node, the child nodes of that arbitrary I/O request node need not be queried.

The checking may further include, in the I/O request node, a duplicate request count, which is the number of I/O request nodes that include a data page included in the I/O request node.

The performing may perform the I/O request when the duplicate request count becomes 0.

The interval tree may be stored in dynamic random access memory (DRAM).

A method of operating a computing device operated by at least one processor according to another embodiment includes: managing the in-use information of each segment constituting a file by representing it as a semaphore; receiving a read/write request for a specific segment of the file; checking the semaphore value corresponding to the specific segment; and acquiring a lock on the specific segment according to the semaphore value and performing the read/write request.

In the managing, if a write operation is being performed on the segment, a specific bit included in the semaphore value may be set to 1; if a read operation is being performed, 1 may be added to the semaphore value; and if neither a write nor a read operation is being performed, the semaphore value may be set to 0.

In the performing, if the semaphore value is 0, the specific bit may be set to 1 and a write lock on the segment acquired; and if the specific bit is 0, 1 may be added to the semaphore value and a read lock on the segment acquired.

The segment may be a collection of data pages included in the file.

A computing device according to one embodiment includes a memory and at least one processor that executes instructions of a program loaded into the memory. The program includes instructions written to execute the steps of: receiving a read/write (hereinafter 'I/O') request for an arbitrary file; checking whether a data page included in the I/O request is in use by an arbitrary thread; and, when the use by that thread is complete, acquiring a lock on the data page and performing the I/O request.

The checking may use an interval tree whose nodes are composed of the data page information included in the I/O request, or a segment that represents the usage status of the data pages included in the I/O request as an I/O semaphore value.

According to the present invention, the locking range can be reduced compared with the uniform method of locking the entire file, so application scalability can be ensured for I/O operations on a single file.

In addition, according to the present invention, the user can directly set the locking range according to the application's I/O pattern, which allows optimal performance and reduces overhead.

FIG. 1 is an explanatory diagram of a computing device that locks a portion of a file according to one embodiment.
FIG. 2 is an explanatory diagram showing a write operation of the NOVA file system according to one embodiment.
FIG. 3 is an explanatory diagram of a method of locking a portion of a file according to one embodiment.
FIG. 4 is a flowchart of a method of locking a portion of a file according to one embodiment.
FIG. 5 is an explanatory diagram of a method of locking a portion of a file according to another embodiment.
FIG. 6 is a flowchart of a method of locking a portion of a file according to another embodiment.
FIG. 7 is an explanatory diagram showing the result of evaluating the write throughput of a locking method according to one embodiment.
FIG. 8 is an explanatory diagram showing the result of evaluating the read throughput of a locking method according to one embodiment.
FIG. 9 is an explanatory diagram showing the result of evaluating the number of operations of a locking method according to one embodiment.
FIG. 10 is a hardware configuration diagram of a computing device according to one embodiment.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice them. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted in order to describe the present invention clearly, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as "... unit", "... device", and "module" used in the specification denote a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

FIG. 1 is an explanatory diagram of a computing device that locks a portion of a file according to one embodiment.

Referring to FIG. 1, the computing device 1000 includes a locking unit 100 and a writing unit 200. When a write request is received for a file including a plurality of data pages, the locking unit 100 locks the entire file or a portion of the file to prevent the file from being written multiple times, and the writing unit 200 performs the file write operation while the lock is held. A data page is a unit obtained by dividing data into blocks of a fixed size.

For the purposes of description they are named the locking unit 100 and the writing unit 200, but they are the computing device 1000 operated by at least one processor. The locking unit 100 and the writing unit 200 may be implemented in a single computing device 1000 or distributed across separate computing devices 1000. When distributed across separate computing devices 1000, the locking unit 100 and the writing unit 200 may communicate with each other through a communication interface. The computing device 1000 may be any device capable of executing a software program written to carry out the present invention, for example a server or a laptop computer.

The locking unit 100 holds a file exclusively in order to prevent the file from being written by multiple threads, and locks a portion of the file. The locking range is not limited to any one form; for example, it may be a data page included in the file, or a segment, which is a group of data pages.

For the locking unit 100 to lock a portion of a file, it must check whether overlapping write requests exist. To check whether there is a write request with an overlapping range, an interval tree or a segment scheme may be used. The interval tree and the segment scheme are described in detail with reference to FIGS. 3 to 6.

The writing unit 200 writes data to the file referenced through a file descriptor, and may, for example, use the write system call.

FIG. 2 is an explanatory diagram showing a write operation of the NOVA file system according to one embodiment.

FIG. 2 shows the structure and write process of the NOVA (NOn-Volatile memory Accelerated) file system, a hybrid-memory-based logging file system that uses both non-volatile memory (hereinafter 'NVM') and DRAM.

The NOVA file system includes NVM and DRAM. The index tree used to access a file resides in DRAM for fast access, and all other data structures reside in non-volatile memory.

When performing a write, the NOVA file system updates data in a copy-on-write (COW) manner for file system consistency, and logs the write operation.

In the NOVA file system, a log is allocated per file, so files can be accessed in parallel. The logs allocated to a file are stored in log pages in non-volatile memory, and all log pages are connected by a linked list.

The write operation of the NOVA file system is now described by way of example. Assume that the target file includes four data pages (Data 0 to Data 3) and that each page is 4 KB in size. Each data page has a number indicating its page offset.

Consider the case where a write system call of 5000 bytes starting at the 8000th byte is requested.

First, to prevent the file from being written by multiple threads at the same time, the entire file is locked before the write operation. Once the file is locked, other threads cannot write to it.

When the write thread acquires the lock, the free block manager of the NOVA file system allocates three data pages (data page 1', data page 2', data page 3') from non-volatile memory. After the data pages are allocated, the user data is copied from the user buffer, and then parts of data page 1 and data page 3 are copied to data page 1' and data page 3'.
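The arithmetic behind this example can be sketched as follows. This is only an illustration: the function names `touched_pages` and `preserved_bytes` are hypothetical, and the sketch assumes that "the 8000th byte" means a zero-based file offset of 8000 and that a 4 KB page is 4096 bytes.

```python
from __future__ import annotations

PAGE_SIZE = 4096  # assumed: a 4 KB data page is 4096 bytes


def touched_pages(offset: int, length: int, page_size: int = PAGE_SIZE) -> range:
    """Return the page numbers covered by a write of `length` bytes at `offset`."""
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return range(first, last + 1)


def preserved_bytes(offset: int, length: int, page_size: int = PAGE_SIZE) -> tuple[int, int]:
    """Return how many old bytes must be copied into the first and last new pages."""
    end = offset + length                               # one past the last written byte
    head = offset % page_size                           # old bytes before the write in the first page
    tail = (page_size - end % page_size) % page_size    # old bytes after the write in the last page
    return head, tail


pages = touched_pages(8000, 5000)          # pages 1, 2 and 3
head, tail = preserved_bytes(8000, 5000)   # 3904 bytes kept from page 1, 3384 from page 3
print(list(pages), head, tail)
```

Under these assumptions the write touches pages 1 to 3, which is why three new pages are allocated, and only the first and last of them need old data copied in, because page 2 is overwritten completely.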

Now that the new user data pages have been constructed, the NOVA file system creates a log entry and logs it so that it points to the starting address of the new data pages. The log entry includes information such as the location and size of the write operation. Once the log entry is created, the NOVA file system updates the tail pointer of the inode to the address of the new log entry.

Once the tail pointer is updated, the write operation becomes valid. If power is lost after the update, the NOVA file system can recover the file's index structure by scanning from the beginning of the log to the log entry pointed to by the tail. If power is lost before the update, the log entry is excluded from recovery, so the write operation is treated as never having happened and consistency is guaranteed.
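The role of the tail pointer as the commit point can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not NOVA's actual data layout: `LogEntry`, `InodeLog`, and their fields are hypothetical names, the log is a Python list rather than linked log pages in NVM, and the tail is modeled as a count of committed entries.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogEntry:
    first_page: int        # first file page covered by the write
    new_pages: List[str]   # labels of the freshly allocated data pages


@dataclass
class InodeLog:
    entries: List[LogEntry] = field(default_factory=list)
    tail: int = 0  # number of committed entries; stands in for the inode tail pointer

    def append(self, entry: LogEntry) -> None:
        """Write the entry past the tail; recovery does not see it yet."""
        self.entries.append(entry)

    def commit(self) -> None:
        """Advance the tail pointer; this is the step that makes the write valid."""
        self.tail = len(self.entries)

    def recover_index(self) -> Dict[int, str]:
        """Rebuild the page index by replaying entries from the log start up to the tail."""
        index: Dict[int, str] = {}
        for entry in self.entries[: self.tail]:
            for i, page in enumerate(entry.new_pages):
                index[entry.first_page + i] = page
        return index


log = InodeLog()
log.append(LogEntry(first_page=1, new_pages=["1'", "2'", "3'"]))
before_commit = log.recover_index()  # power lost before the tail update: write is invisible
log.commit()
after_commit = log.recover_index()   # power lost after the tail update: write is recovered
```

Replaying only up to the tail is what lets an entry that was written but never committed be ignored without any explicit undo step.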

Afterwards, the first, second, and third nodes of the file's index tree are updated to point to the new log entry. Only when the index tree has been updated is the write operation complete.

In this write process, the NOVA file system locks the entire file at the start of the write operation and holds the file exclusively. While the write operation is in progress, other threads can therefore neither write nor read the file. This specification accordingly proposes a method of locking not the entire file but some of the data pages constituting the file, or a range set by an administrator.

FIG. 3 is an explanatory diagram of a method of locking a portion of a file according to one embodiment, and FIG. 4 is a flowchart of a method of locking a portion of a file according to one embodiment.

Referring to FIG. 3, the computing device 1000 uses a data structure called an interval tree to find nodes whose write ranges overlap, and locks only the overlapping page range.

An interval tree has a structure similar to a binary search tree. Each node holds a specific integer interval, overlap between node intervals can be detected, and a node whose range falls within an interval may be inserted as a child of that node.

In this specification, each node of the interval tree includes the following three values: the data page range; the largest value among the data page ranges of its child nodes; and the number of nodes whose data page ranges overlap, that is, the duplicate request count.

The data page range is expressed by the start page number and end page number of the data pages constituting the file, and may be an arbitrary integer interval.

The computing device 1000 may traverse the nodes to determine whether the data page range of a given node overlaps with other nodes in the tree. The traversal method is not limited to any one; for example, in-order traversal may be used.

To determine the data page ranges of the nodes in the tree efficiently, each node stores the largest value among the data page ranges of its child nodes.

That is, part of the in-order traversal can be skipped by using the stored maximum. For example, if during in-order traversal the maximum stored in a visited node is smaller than the end value of the data page range of the requesting node, none of the visited node's child nodes overlaps the requesting node's data page range. The subtree of that node therefore need not be traversed.
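A minimal sketch of such an augmented tree is given below. It is an illustration rather than the patented implementation: the names `Node`, `insert`, and `count_overlaps` are hypothetical, the tree is an unbalanced binary search tree keyed by start page, and the stored maximum (`max_end`) covers the node itself as well as its descendants. The pruning test here is the conventional one for augmented interval trees, which skips a subtree when its stored maximum is smaller than the start page of the query range; treating that as the intended comparison is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    start: int                       # first page of the request
    end: int                         # last page of the request (inclusive)
    max_end: int                     # largest end page in this node's subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], start: int, end: int) -> Node:
    """Insert [start, end] keyed by start page, keeping max_end up to date."""
    if root is None:
        return Node(start, end, end)
    if start < root.start:
        root.left = insert(root.left, start, end)
    else:
        root.right = insert(root.right, start, end)
    root.max_end = max(root.max_end, end)
    return root


def count_overlaps(root: Optional[Node], start: int, end: int) -> int:
    """Count stored ranges sharing at least one page with [start, end], in order."""
    if root is None or root.max_end < start:
        return 0  # no range in this subtree reaches the query: skip the subtree
    count = count_overlaps(root.left, start, end)
    if root.start <= end and start <= root.end:
        count += 1
    if root.start <= end:  # every range to the right starts at or after root.start
        count += count_overlaps(root.right, start, end)
    return count


tree = None
for first, last in [(0, 1), (5, 6), (2, 3), (8, 9)]:
    tree = insert(tree, first, last)

overlaps_1_2 = count_overlaps(tree, 1, 2)  # overlaps (0, 1) and (2, 3)
overlaps_7_7 = count_overlaps(tree, 7, 7)  # overlaps nothing
```

The value returned by `count_overlaps` for a new request corresponds to the duplicate request count described above. A production version would normally use a self-balancing tree so that the search cost stays logarithmic.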

The duplicate request count is the number of nodes whose data page ranges overlap.

Meanwhile, an interval tree is defined per file, and the computing device 1000 initializes the interval tree when a new file is created. The computing device 1000 may also create the interval tree in DRAM for low latency.

The following describes a method of using the interval tree to find nodes with overlapping write ranges and lock only the overlapping page range.

Referring to FIG. 4, the computing device 1000 receives a write request (S101).

The computing device 1000 removes the lock taken by the NOVA file system (S102). As described with reference to FIG. 2, the NOVA file system by default locks the entire file when a write operation is requested, so this lock is released.

When a write request is received, the computing device 1000 checks the data page range of the request, creates a node, and constructs the interval tree (S103). As shown in FIG. 3, each node of the interval tree includes the data page range of its request. The node is inserted into the interval tree according to its range, and may further include the largest value among the data page ranges of the child nodes at the inserted position.

The computing device 1000 may check whether the node's data page range overlaps those of the nodes in the tree, and store the duplicate request count in the node. The maximum value stored in each node may be used for this. For example, if write requests overlapping the node are being performed by five threads, the node's duplicate request count is 5.

The computing device 1000 traverses the interval tree to check whether there is a node that includes data pages in a range overlapping the requested node (S104).

If there is no node with an overlapping range in step S105, the node's write request does not overlap with other nodes, so the computing device 1000 performs the node's write request (S106).

If there is a node with an overlapping range in step S105, the computing device 1000 determines that another thread is working on the range of the node's write request, and waits (S107).

The computing device 1000 checks whether the duplicate request count has become 0 (S108), and when it becomes 0, performs the write operation requested by the node (S109).

When the requested write is complete, the computing device 1000 deletes the requesting node from the interval tree and decrements by 1 the duplicate request count of the nodes whose ranges overlap that node (S110).
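Steps S103 to S110 can be sketched as a small per-file lock object. This is a simplified illustration under stated assumptions: `Request` and `PageRangeLock` are hypothetical names, a plain list stands in for the interval tree, and a condition variable is used for waiting, which is one possible choice and not something the description prescribes.

```python
import threading
from typing import List


class Request:
    def __init__(self, start: int, end: int) -> None:
        self.start, self.end = start, end  # inclusive page range of the request
        self.pending = 0                   # duplicate request count


class PageRangeLock:
    """Per-file lock that serializes only requests with overlapping page ranges."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._active: List[Request] = []   # stand-in for the interval tree

    def acquire(self, start: int, end: int) -> Request:
        req = Request(start, end)
        with self._cond:
            # S103-S104: register the request and count earlier overlapping ones.
            req.pending = sum(1 for r in self._active
                              if r.start <= end and start <= r.end)
            self._active.append(req)
            # S105-S108: wait until every earlier overlapping request has finished.
            while req.pending > 0:
                self._cond.wait()
        return req

    def release(self, req: Request) -> None:
        with self._cond:
            # S110: delete the node and decrement the count of overlapping nodes.
            self._active.remove(req)
            for r in self._active:
                if r.start <= req.end and req.start <= r.end:
                    r.pending -= 1
            self._cond.notify_all()


lock = PageRangeLock()
first = lock.acquire(0, 3)    # pages 0 to 3
second = lock.acquire(4, 7)   # disjoint pages, so this call does not wait
lock.release(first)
lock.release(second)
```

Because a request starts only after its count reaches 0, every overlapping request still registered when it finishes must have arrived later and counted it, so decrementing all overlapping requests on release keeps the counts consistent.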

Meanwhile, in the page-granularity locking method using the interval tree structure, the minimum unit of locking is restricted to the data pages constituting the file. The following therefore describes another method, in which the user or administrator directly sets the range to be locked.

FIG. 5 is an explanatory diagram of a method of locking a portion of a file according to another embodiment, and FIG. 6 is a flowchart of a method of locking a portion of a file according to another embodiment.

Referring to FIG. 5, the computing device 1000 may express, per segment, whether some data pages of a file are in use. A segment is a group of consecutive pages of a file. Each segment of a file includes a 32-bit variable, which may be called an I/O semaphore. A semaphore is an integer variable manipulated by two atomic functions, and is used in multiprogramming environments to restrict access by multiple processes to a shared resource.

The first bit of the 32-bit I/O semaphore can be used to express the write state of the data pages. For example, if the data pages of the segment are being written, the leftmost of the 32 bits may be set to 1, and otherwise to 0. The remaining 31 bits, excluding the leftmost bit, can be used as a counter of the threads reading the segment's data pages.
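The bit layout described above can be written out as a few helpers. The names `WRITE_BIT`, `READER_MASK`, `is_writing`, `reader_count`, and `segment_of` are hypothetical, and the mapping from page number to segment index assumes fixed-size segments of `pages_per_segment` pages, which the description leaves to the user or administrator.

```python
WRITE_BIT = 1 << 31           # leftmost bit of the 32-bit word: the segment is being written
READER_MASK = WRITE_BIT - 1   # remaining 31 bits: number of threads reading the segment


def is_writing(sem: int) -> bool:
    """True when the write bit of the I/O semaphore is set."""
    return bool(sem & WRITE_BIT)


def reader_count(sem: int) -> int:
    """Number of readers recorded in the low 31 bits."""
    return sem & READER_MASK


def segment_of(page: int, pages_per_segment: int) -> int:
    """Map a page number to its segment index (assumes fixed-size segments)."""
    return page // pages_per_segment


idle = 0
writing = idle | WRITE_BIT   # 0x80000000
three_readers = idle + 3     # write bit clear, reader counter equal to 3
```

Keeping both pieces of state in one word means a single comparison with 0 tells a writer that the segment has neither a writer nor any readers.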

도 6을 참고하면, 컴퓨팅 장치(1000)는 파일을 구성하는 데이터 페이지의 묶음을 세그먼트라는 단위로 설정하고, 각 세그먼트가 사용 중인 현황을 세그먼트 세마포어(Segment Semaphore)으로 표현한다(S201). 세그먼트 세마포어는 32비트일 수 있고, 도 5에서 설명한 바와 같다. Referring to FIG. 6, the computing device 1000 sets a bundle of data pages constituting a file in units of segments, and expresses the status of each segment being used as a segment semaphore (S201). The segment semaphore may be 32 bits, as described in FIG. 5.

The computing device 1000 checks whether the requested operation is a write or a read (S202).

If a write operation is requested, the computing device 1000 checks the value of the I/O semaphore corresponding to the target segment (S203).

If the value of the I/O semaphore is not 0 in step S204, the computing device 1000 waits until the entire semaphore value becomes 0 and then checks the semaphore value again (S203). A leftmost bit of 1 means that the segment is being written, so the device waits until that bit becomes 0; a nonzero value in the remaining bits means that the segment is being read, so the device likewise waits until the counter returns to 0.

If the value of the I/O semaphore is 0 in step S204, the computing device 1000 sets the leftmost bit to 1 and acquires the write lock on the segment (S205).

Once the write lock is acquired, the computing device 1000 performs the file write operation on the segment corresponding to that semaphore (S206).
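The write path of steps S203 to S205 can be sketched for a single segment with C11 atomics standing in for the kernel primitives. This is a minimal user-space sketch under stated assumptions, not the patented implementation; seg_try_write_lock and seg_write_lock are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_WRITER_BIT (UINT32_C(1) << 31)

/* S203-S205 for one segment: succeed only if the whole word is 0,
 * i.e. no writer and no readers, and set the leftmost bit in the same
 * atomic step. */
static inline bool seg_try_write_lock(_Atomic uint32_t *sem)
{
    assert(sem != NULL);
    uint32_t expected = 0;
    return atomic_compare_exchange_strong(sem, &expected, SEG_WRITER_BIT);
}

/* Repeat S203/S204 until the write lock is obtained. */
static inline void seg_write_lock(_Atomic uint32_t *sem)
{
    while (!seg_try_write_lock(sem))
        ;  /* busy-wait: the segment is still being read or written */
}
```

Checking the value and setting the bit in a single compare-and-exchange matters here: if the check and the update were separate steps, two writers could both observe 0 and both proceed.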

If a read operation is requested in step S202, the computing device 1000 checks the leftmost bit of the semaphore corresponding to the target segment (S207).

If the bit is 1 in step S208, a write operation is in progress, so the computing device 1000 checks the leftmost bit of the semaphore again (S207).

If the bit is 0 in step S208, the computing device 1000 increments the semaphore value by 1 and acquires the read lock on the segment (S209).

The computing device 1000 then performs the file read operation on the segment corresponding to that semaphore (S210).
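The read path of steps S207 to S209 can be sketched in the same style. The sketch assumes C11 atomics in user space and uses the hypothetical name seg_try_read_lock; it does not guard against the 31-bit reader counter overflowing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_WRITER_BIT (UINT32_C(1) << 31)

/* S207-S209 for one segment: if the leftmost bit is 0, add 1 to the
 * reader counter. Returns false if a writer currently holds the segment. */
static inline bool seg_try_read_lock(_Atomic uint32_t *sem)
{
    assert(sem != NULL);
    uint32_t old = atomic_load(sem);
    while ((old & SEG_WRITER_BIT) == 0) {
        /* Try old -> old + 1; on failure `old` is refreshed with the
         * current value and the writer bit is checked again. */
        if (atomic_compare_exchange_weak(sem, &old, old + 1))
            return true;
    }
    return false;  /* S208: a write is in progress */
}
```

A caller that needs the blocking behaviour of the flowchart would call this in a loop until it returns true. Several readers can succeed one after another, which is what allows concurrent reads of the same segment.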

Because specific bits of the I/O semaphore indicate whether a write or a read is in progress, the computing device 1000 can guarantee that neither reads nor writes are performed on a segment that is being written, and that only writes are blocked on a segment that is being read.

Pseudocode for the segment-based locking method, using the atomic bit operations provided by the kernel, is shown in Tables 1 and 2. Table 1 shows the write-locking algorithm and Table 2 the read-locking algorithm.

Table 1
 1  void nova_segment_write_lock(inode, start, end) {
 2      atomic_t *rwlock = inode.rwlock_semaphore;
 3      unsigned int wlock = 1U << 31;          /* leftmost bit: writer flag */
 4      for (cur = start; cur <= end; cur++) {
 5          while (true) {
 6              smp_mb_before_atomic();
 7              old_semaphore = atomic_cmpxchg(&rwlock[cur], 0, wlock);
 8              smp_mb_after_atomic();
 9              if (old_semaphore == 0)         /* segment was idle; write lock taken */
10                  break;
11          }
12      }
13      return;
14  }

Table 2
 1  void nova_segment_read_lock(inode, start, end) {
 2      atomic_t *rwlock = inode.rwlock_semaphore;
 3      unsigned int wlock = 1U << 31;          /* leftmost bit: writer flag */
 4      for (cur = start; cur <= end; ) {
 5          smp_mb_before_atomic();
 6          old_semaphore = atomic_add_unless(&rwlock[cur], 1, wlock);
 7          smp_mb_after_atomic();
 8          if (old_semaphore != 0)             /* read lock taken; go to the next segment */
 9              cur++;
10      }
11      return;
12  }

In line 7 of Table 1, the atomic_cmpxchg(&v, old, new) function reads v and, if it equals old, replaces it with new and stores it; the function returns the value of v before the replacement. In line 6 of Table 2, the atomic_add_unless(&v, i, u) function reads v and, if it differs from u, adds i and stores the result. If the value of v before the operation equals u, the function returns 0.
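The return-value conventions described above can be modelled in user-space C11 as follows. The functions model_cmpxchg and model_add_unless are hypothetical stand-ins written for this illustration; they are not the kernel implementations, and they use unsigned int rather than the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Model of atomic_cmpxchg(&v, old, new): store new_val only if *v == old,
 * and return the value *v held before the operation in either case. */
static inline unsigned int model_cmpxchg(atomic_uint *v, unsigned int old,
                                         unsigned int new_val)
{
    assert(v != NULL);
    unsigned int expected = old;
    /* On failure, `expected` is overwritten with the current value of *v. */
    atomic_compare_exchange_strong(v, &expected, new_val);
    return expected;
}

/* Model of atomic_add_unless(&v, i, u): add i to *v unless *v == u.
 * Returns nonzero if the addition happened and 0 if *v was equal to u. */
static inline int model_add_unless(atomic_uint *v, unsigned int i,
                                   unsigned int u)
{
    assert(v != NULL);
    unsigned int cur = atomic_load(v);
    while (cur != u) {
        if (atomic_compare_exchange_weak(v, &cur, cur + i))
            return 1;
    }
    return 0;
}
```

With old = 0 and new set to the writer flag, a return value of 0 from the first model means the caller has just taken the write lock. With i = 1 and u set to the writer flag, a nonzero return from the second means a reader has been counted.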

The atomic_cmpxchg function in Table 1 and the atomic_add_unless function in Table 2 provide atomicity. Atomicity means that a sequence of machine instructions cannot be interrupted while it executes. When several threads use atomic_cmpxchg on the same segment to set the leftmost bit of its semaphore to 1, the atomicity of the operation ensures that the function returns 0 for only one of them. Consequently, at most one thread can hold a segment exclusively at any time.

In line 7 of Table 1, the locking function loops, waiting until the semaphore corresponding to the segment is 0. A return value of 0 from atomic_cmpxchg means that the segment was not in use, so the leftmost bit has been set to 1 and the caller holds the segment exclusively.

Similarly, in line 6 of Table 2, the locking function loops, waiting until the leftmost bit of the semaphore corresponding to the segment is 0. A nonzero return value from atomic_add_unless means that no write is in progress on the segment, so 1 has been added to the semaphore and the segment can be read.
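Tables 1 and 2 cover only lock acquisition. Since the method is described as locking and unlocking by changing a bit value or a counter value, a release path might look like the following sketch. It is an assumption for illustration, written with C11 atomics in user space; seg_write_unlock and seg_read_unlock are hypothetical names that do not appear in the source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_WRITER_BIT (UINT32_C(1) << 31)

/* Release a write lock: clear the leftmost bit so the word returns to 0.
 * Release ordering makes the written data visible before the unlock. */
static inline void seg_write_unlock(_Atomic uint32_t *sem)
{
    assert(sem != NULL);
    atomic_fetch_and_explicit(sem, ~SEG_WRITER_BIT, memory_order_release);
}

/* Release a read lock: decrement the reader counter by 1. */
static inline void seg_read_unlock(_Atomic uint32_t *sem)
{
    assert(sem != NULL);
    atomic_fetch_sub_explicit(sem, 1, memory_order_release);
}
```

Once the last reader has decremented the counter, or the writer has cleared its bit, the word is 0 again and a waiting writer's compare-and-exchange can succeed.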

FIG. 7 shows the results of evaluating the write throughput of the locking method according to an embodiment, FIG. 8 shows the results of evaluating its read throughput, and FIG. 9 shows the results of evaluating its number of operations.

FIGS. 7 to 9 each show scalability measured while performing the respective operations. Scalability can be measured with fxmark, a microbenchmark for file-system scalability, and filebench-oltp, a database benchmark. For example, among the fxmark workloads, the DWOM workload, in which multiple threads write to different parts of the same file, and the DRBM workload, in which multiple threads read different parts of the same file, may be used. Specifically, the workloads shown in Tables 3 and 4 may be used.

Table 3
File size: 4 MB
Number of files: 1
Write I/O size: 4 KB
I/O pattern: each thread overwrites a different part of the same file
Number of threads: 1 to 120

In Table 3, the threads of the computing device 1000 each repeatedly perform 4 KB write I/O on a different part of a single 4 MB file.

Table 4
File size: 4 MB
Number of files: 1
Read I/O size: 4 KB
I/O pattern: each thread reads a different part of the same file
Number of threads: 1 to 120

In Table 4, the threads of the computing device 1000 each repeatedly perform 4 KB read I/O on a different part of a single 4 MB file.

Referring to FIG. 7, NOVA denotes the existing NOVA method, pNOVA-interval denotes the interval-tree locking method of the present invention, and pNOVA-segment denotes the segment-based locking method of the present invention.

In the DWOM workload, every thread writes to a different part of the file. With the existing NOVA method, the whole file is locked and only one thread can write at a time, so throughput does not increase as the number of threads grows.

Because there are no overlapping write requests, pNOVA-interval and pNOVA-segment allow all threads to write in parallel, and they scale up to 7 threads and 15 threads, respectively. However, pNOVA-interval may not outperform the other two methods because of the bottleneck caused by the interval tree. pNOVA-segment locks and unlocks simply by changing a counter value or a bit value, so it has no bottleneck point and performs better than pNOVA-interval.

Referring to FIG. 8, in the DRBM workload every thread reads a different part of the file. With the existing NOVA method, all threads try to increment a single read counter variable, so throughput rises and then falls as the number of threads increases.

With pNOVA-interval, the interval-tree bottleneck prevents all threads from reading in parallel. With pNOVA-segment, a read counter variable is defined for each segment, so this overhead does not occur.

Referring to FIG. 9, the performance of the locking methods can be evaluated with filebench-oltp, a database benchmarking tool. Using the settings in Table 5, the evaluation was run while increasing the number of I/O threads from 1 to 120.

Table 5
Benchmark: filebench-oltp
DB file size: 100 MB
Number of DB files: 1
Log file size: 100 MB
Number of log files: 1
Buffer pool size: N/A
Number of threads: 8

Compared with the existing NOVA method, the locking methods proposed in the present invention showed a high proportion of writes that could proceed in parallel, 69.04%, for filebench-oltp. In particular, the locking method using the block bitmap improved performance by up to 1.66 times over the existing NOVA method at 64 threads.

Because they allow parallel access to a single file, pNOVA-interval and pNOVA-segment performed better than the existing NOVA method. pNOVA-segment in particular performed somewhat better than pNOVA-interval because it has less overhead than the interval-tree approach.

FIG. 10 is a hardware configuration diagram of a computing device according to an embodiment.

Referring to FIG. 10, the locking unit 100 and the writing unit 200 execute, on the computing device 1000 operated by at least one processor, a program containing instructions written to carry out the operations of the present invention.

The hardware of the computing device 1000 may include at least one processor 310, a memory 320, a storage 330, and a communication interface 340, which may be connected through a bus. Other hardware, such as input and output devices, may also be included. The computing device 1000 may be equipped with various software, including an operating system capable of running the program.

The processor 310 controls the operation of the computing device 1000 and may be any of various types of processor that process the instructions contained in a program, for example a CPU (Central Processing Unit), an MPU (Micro Processor Unit), an MCU (Micro Controller Unit), or a GPU (Graphic Processing Unit). The memory 320 loads the program so that the instructions written to carry out the operations of the present invention are processed by the processor 310. The memory 320 may be, for example, ROM (read only memory) or RAM (random access memory). The storage 330 stores the data, programs, and other items required to carry out the operations of the present invention. The communication interface 340 may be a wired or wireless communication module.

The embodiments of the present invention described above are not implemented only through an apparatus and a method; they may also be implemented through a program that realizes functions corresponding to the configuration of the embodiments, or through a recording medium on which that program is recorded.

Although embodiments of the present invention have been described in detail above, the scope of the present invention is not limited to them. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention, as defined in the following claims, also fall within the scope of the present invention.

Claims

1. A method of operating a computing device operated by at least one processor, the method comprising:
receiving a read/write (hereinafter 'I/O') request for a file;
constructing an I/O request node that includes the data pages covered by the I/O request, and inserting the I/O request node into an interval tree;
searching the interval tree to determine whether there is an I/O request node that includes a data page included in the I/O request node; and
when an I/O request node whose range overlaps that of the I/O request node completes its operation and is deleted, acquiring a lock on the data pages covered by the I/O request and performing the I/O request,
wherein the interval tree includes a plurality of I/O request nodes, and each I/O request node has child nodes according to whether it includes a data page that is the target of an I/O request.

2. The method of claim 1, wherein the determining comprises performing an in-order traversal of the interval tree from the lower left toward the lower right.

3. The method of claim 1, wherein the I/O request node further includes an upper bound that is the largest value among the data pages included in its child nodes.

4. The method of claim 3, wherein the determining does not search the child nodes of a given I/O request node if the upper bound included in that node is smaller than the upper bound of the I/O request node.

5. The method of claim 1, wherein the determining further stores, in the I/O request node, an overlapping-request count, which is the number of I/O request nodes that include a data page included in the I/O request node.

6. The method of claim 5, wherein the performing comprises performing the I/O request when the overlapping-request count becomes 0.

7. The method of claim 1, wherein the interval tree is stored in DRAM (Dynamic Random Access Memory).

8. A method of operating a computing device operated by at least one processor, the method comprising:
managing in-use information for each segment that makes up a file by representing it as a semaphore;
receiving a read/write request for a specific segment of the file;
checking the semaphore value corresponding to the specific segment; and
acquiring a lock on the specific segment according to the semaphore value, and performing the read/write request.

9. The method of claim 8, wherein the managing comprises:
setting a specific bit included in the semaphore value to 1 if a write operation is being performed on the segment;
adding 1 to the semaphore value if a read operation is being performed; and
setting the semaphore value to 0 if neither a write nor a read operation is being performed.

10. The method of claim 9, wherein the performing comprises:
if the semaphore value is 0, setting the specific bit to 1 and acquiring a write lock on the segment; and
if the specific bit is 0, adding 1 to the semaphore value and acquiring a read lock on the segment.

11. The method of claim 8, wherein the segment is a collection of data pages included in the file.

12. A computing device comprising:
a memory; and
at least one processor that executes instructions of a program loaded into the memory,
wherein the program includes instructions written to execute:
receiving a read/write (hereinafter 'I/O') request for a file;
checking whether a data page covered by the I/O request is in use by another thread; and
when the use by that thread is complete, acquiring a lock on the data page and performing the I/O request.

13. The computing device of claim 12, wherein the checking uses either an interval tree whose nodes hold the data page information covered by the I/O request, or a segment in which the usage status of the data pages covered by the I/O request is represented as an I/O semaphore value.