KR100607932B1

KR100607932B1 - Coding method with use of error correction code and decoding method therefor

Info

Publication number: KR100607932B1
Application number: KR1019990029283A
Authority: KR
Inventors: 이윤우; 김영윤; 김병준; 정현권
Original assignee: 삼성전자주식회사
Priority date: 1999-07-20
Filing date: 1999-07-20
Publication date: 2006-08-03
Also published as: KR20010010406A

Abstract

The present invention relates to a data compression encoding/decoding method, and more particularly to an encoding method that appends an error correction code to source data and omits part of the source data by relying on the correcting capability of that code, and to a decoding method suited to it.

The encoding method according to the present invention encodes source data consisting of N symbols and comprises: appending an error correction code consisting of M symbols to the source data consisting of N symbols; a first encoding process that, when m identical symbols occur consecutively in the error correction code, encodes a header indicating the number of consecutive symbols, the value of the consecutive symbols, and N-m symbols of the source data; and a second encoding process that, when the error correction code contains no such run of consecutive symbols, encodes a header indicating that there are no consecutive symbols together with the N symbols of the source data.

The encoding method according to the present invention encodes only part of the error correction code appended to the source data and only part of the source data, and the degree of compression increases as the redundancy of the error correction code increases.

Description

Coding method with use of error correction code and decoding method therefor

FIG. 1 is a flowchart showing an encoding method according to the present invention.

FIG. 2 schematically illustrates the encoding method shown in FIG. 1.

FIG. 3 shows the format of the encoded data obtained by the encoding method shown in FIG. 1.

FIG. 4 is a flowchart showing a decoding method according to the present invention.

FIG. 5 schematically illustrates the decoding method shown in FIG. 4.

FIG. 6 shows the results of applying the encoding method shown in FIG. 1 to various files.

The present invention relates to a data compression encoding/decoding method, and more particularly to an encoding method that appends an error correction code to source data and omits part of the source data by relying on the correcting capability of that code, and to a decoding method suited to it.

Data compression methods fall into two broad classes: lossy coding and lossless coding. Lossy coding is mainly used for audio data, image data, and the like, while lossless coding is mainly used for program data, text data, and the like.

Lossless coding exploits the redundancy of the data (run-length coding), the statistical distribution of the data (variable-length coding), or mathematical properties (arithmetic coding).

However, program data, text data, and the like contain little redundancy, so the compression efficiency that can be achieved is not very high.

Accordingly, an object of the present invention is to provide an improved encoding method that raises compression efficiency in lossless coding.

Another object of the present invention is to provide a decoding method suited to the above encoding method.

To achieve the above object, the encoding method according to the present invention encodes source data consisting of N symbols and comprises: appending an error correction code consisting of M symbols to the source data consisting of N symbols; a first encoding process that, when m identical symbols occur consecutively in the error correction code, encodes a header indicating the number of consecutive symbols, the value of the consecutive symbols, and N-m symbols of the source data; and a second encoding process that, when the error correction code contains no such run of consecutive symbols, encodes a header indicating that there are no consecutive symbols together with the N symbols of the source data.

With the encoding method according to the present invention, when m identical symbols occur consecutively in the error correction code, the source data can be encoded with (m-2) symbols omitted.

Therefore, when m is 2 or more, the source data is effectively compressed.

To achieve the other object, the decoding method according to the present invention comprises: a header detection process of detecting a header in the encoded data; a header value determination process of determining from the detected header value whether consecutive symbols are present; a first decoding process that, when consecutive symbols are present, obtains the consecutive symbols having the representative value, determines the head position of the consecutive symbols, and performs error correction according to the determined head position to restore the source data, including the symbols omitted during encoding; and a second decoding process that, when no consecutive symbols are present, restores the source data from the data other than the header.

The configuration and operation of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 is a flowchart showing an encoding method according to the present invention. The method shown in FIG. 1 includes an error correction code appending process (S102), a process of determining whether consecutive symbols are present (S104), a first encoding process (S106), and a second encoding process (S108). The encoding method of the present invention operates on units of source data having N symbols. That is, for a data file, the symbols are divided into units of N, and encoding is performed on each unit.
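The division into N-symbol units can be sketched as follows. This is a minimal illustration that assumes 4-bit symbols and N = 7, as in the example of FIG. 2; the function name `split_into_blocks` and the zero-padding of the final unit are assumptions, since the text does not say how a partial last unit is handled.

```python
def split_into_blocks(data: bytes, n: int = 7):
    """Split bytes into 4-bit symbols and group them into units of n symbols."""
    symbols = []
    for byte in data:
        symbols.append(byte >> 4)      # high nibble first
        symbols.append(byte & 0x0F)    # then low nibble
    # Zero-pad the tail so every unit has exactly n symbols (assumed policy).
    while len(symbols) % n:
        symbols.append(0)
    return [symbols[i:i + n] for i in range(0, len(symbols), n)]

blocks = split_into_blocks(b"\x12\x34\x56\x78")
print(blocks)  # [[1, 2, 3, 4, 5, 6, 7], [8, 0, 0, 0, 0, 0, 0]]
```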

In the error correction code appending process (S102), an error correction code consisting of M symbols is appended to the source data consisting of N symbols. A Reed-Solomon code is suitable as this error correction code.

Correction by an error correction code is divided into error correction and erasure correction. The error correction capability is the largest integer not exceeding M/2, and the erasure correction capability is M.

Here, error correction refers to correction in which both the position and the value of an error must be found, whereas erasure correction refers to correction in which the position is already known and only the value must be found. Thus, when the correction capability of the error correction code is M, it can correct M erasures or M/2 errors (rounded down).
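The two capabilities are the extreme cases of the usual bound for a code with M check symbols, 2 × errors + erasures ≤ M. A minimal sketch of that bound; the function name `correctable` is illustrative and does not come from the patent.

```python
def correctable(errors: int, erasures: int, m_check: int) -> bool:
    """True if a maximum-distance-separable code with m_check check symbols
    (Reed-Solomon is one such code) can correct this mix of errors and erasures."""
    return 2 * errors + erasures <= m_check

print(correctable(0, 7, 7))  # True: M erasures
print(correctable(3, 0, 7))  # True: floor(M/2) errors
print(correctable(4, 0, 7))  # False: one error too many
```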

In the determination process (S104), it is determined whether the error correction code contains m consecutive identical symbols. If it does, the unit is encoded through the first encoding process (S106); otherwise it is encoded through the second encoding process (S108).

The error correction code can fundamentally correct M erasures occurring in the source data and in the error correction code itself, and a single symbol of the error correction code is enough to restore one symbol of the source data together with the error correction code itself. This is because one symbol whose position and value are known, out of the M symbols of the error correction code, allows the remaining (M-1) symbols of the error correction code and one symbol of the source data to be restored. In other words, the original source data can be restored even if only one known symbol of the M-symbol error correction code and the N source symbols minus one are encoded. The position of the source symbol omitted during encoding must be fixed by prior agreement, for example as the last symbol of the source data.

More generally, if m symbols are known, the (M-m) symbols of the error correction code other than the known ones and m symbols of the source data can be restored. In other words, the original source data can be restored even if only the m known symbols of the error correction code and the source data minus m symbols are encoded. The positions of the source symbols omitted during encoding must be fixed by prior agreement, for example as the last m symbols of the source data.

The first encoding process (S106) applies when the appended error correction code contains m or more consecutive identical symbols. It encodes the m known symbols of the error correction code and the source data that remains after m symbols are excluded. The encoding result consists of a header indicating the number of consecutive symbols, the representative value of the consecutive symbols, and the source data with its last m symbols removed.

The second encoding process (S108) applies when the error correction code appended in the error correction code appending process (S102) contains fewer than m consecutive identical symbols. All of the source data is encoded without omission. The encoding result consists of a header indicating that there are no consecutive symbols, followed by the original source data.
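The branch between the two encoding processes can be sketched as below. This is only an illustration under stated assumptions: the check symbols are taken as already computed and passed in as `parity`, the omitted source symbols are the last m of the unit, and the names `find_run` and `encode_block` are hypothetical. The head position of the run is deliberately not written to the output.

```python
def find_run(parity, m):
    """Return (start, value) of the first run of at least m equal symbols, or None."""
    start = 0
    for i in range(1, len(parity) + 1):
        # A run ends at the end of the list or where the symbol value changes.
        if i == len(parity) or parity[i] != parity[start]:
            if i - start >= m:
                return start, parity[start]
            start = i
    return None

def encode_block(source, parity, m):
    """Return (run_length, representative, kept_source) for one unit.

    run_length == 0 marks the case with no run, where nothing is omitted.
    """
    run = find_run(parity, m)
    if run is None:
        return 0, None, list(source)                   # second encoding process (S108)
    _, value = run
    return m, value, list(source[:len(source) - m])    # first encoding process (S106)

source = [1, 2, 3, 4, 5, 6, 7]
print(encode_block(source, [0xA, 0xA, 0xA, 1, 2, 3, 4], 3))  # (3, 10, [1, 2, 3, 4])
print(encode_block(source, [1, 2, 3, 4, 5, 6, 7], 2))        # (0, None, [1, 2, 3, 4, 5, 6, 7])
```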

In the present invention, the value of m used in the determination process (S104) is chosen according to coding efficiency. In the encoding method of the present invention, two symbols are basically added: a header indicating the number of consecutive symbols and the representative value of those symbols. Consequently, coding efficiency is degraded when two or fewer source symbols are omitted. It is therefore preferable that m be 2 or more.

When there is no run of m or more consecutive symbols, one symbol is still added for the header, which lowers coding efficiency. However, if units with m or more consecutive symbols are statistically more frequent than units with fewer than m, a net compression effect can be achieved.

In the present invention, owing to the nature of the error correction code, the identical symbols need not be consecutive. Even if symbols of equal value are scattered through the error correction code rather than adjacent, a corresponding number of source symbols can be omitted. In that case, however, additional codes are needed to mark the position of each selected symbol individually, the header grows, or the decoder becomes more complex. It is therefore preferable that m or more identical symbols be consecutive.

FIG. 2 schematically illustrates the encoding method according to the present invention shown in FIG. 1. FIG. 2(a) shows source data consisting of seven symbols (s1 to s7) of 4 bits each, and FIG. 2(b) shows the source data with an error correction code of seven symbols (p1 to p7) appended.

Since the error correction code has seven symbols, its erasure correction capability is 7 and its error correction capability is 3 (the integer part of 7/2). The correction capability of the error correction code is preferably equal to the number of symbols in the source data. Increasing the number of symbols in the error correction code does not make longer runs more likely (experiments show that runs of two or three are the most probable), so coding efficiency is unlikely to improve while decoding becomes more complex. Conversely, reducing the number of symbols in the error correction code simplifies decoding but lowers the redundancy and therefore the coding efficiency.

FIG. 2(c) shows a case in which the consecutive symbols p1 to p3, among the seven symbols (p1 to p7) of the error correction code, have the same symbol value (hexadecimal "A"). If the positions and values of the consecutive symbols (p1 to p3) are known at decoding time, the remaining symbols (p4 to p7) of the error correction code and three of the seven source symbols (s1 to s7) can be corrected. In other words, the original source data can be restored even if only the three known symbols of the error correction code and the source data minus three symbols are encoded.

The positions of the symbols omitted during encoding must be fixed by prior agreement, for example as the last three symbols of the source data. In the encoding method of the present invention, the rule is that the symbols adjacent to the run of consecutive symbols are omitted, up to the erasure correction capability, so that the decoder can identify the positions of the omitted symbols.

FIG. 2(d) shows the positions of the symbols that are removed. The seven symbols (s5 to s7, p4 to p7) surrounding the consecutive symbols (p1 to p3) of FIG. 2(c) can all be restored from the consecutive symbols (p1 to p3). They can therefore be omitted during encoding and still be restored exactly during decoding.
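The recovery of s5 to s7 and p4 to p7 from s1 to s4 plus the run p1 to p3 can be sketched with a small erasure decoder. Everything about the code construction here is an assumption made for illustration, since the patent does not specify it: a systematic Reed-Solomon code in its evaluation form over GF(16) with primitive polynomial x^4 + x + 1, locators 1 to 7 for s1 to s7 and 8 to 14 for p1 to p7, and the head position of the run taken as known (p1).

```python
# GF(16) log/antilog tables for x^4 + x + 1 (an illustrative choice of field).
EXP, LOG = [0] * 30, [0] * 16
value = 1
for i in range(15):
    EXP[i] = EXP[i + 15] = value
    LOG[value] = i
    value <<= 1
    if value & 0x10:
        value ^= 0x13

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def inv(a):
    return EXP[15 - LOG[a]]  # a must be nonzero

def interpolate(known, x):
    """Evaluate at x the unique polynomial of degree < len(known) through known."""
    total = 0
    for xj, yj in known.items():
        term = yj
        for xk in known:
            if xk != xj:
                # Lagrange factor (x - xk) / (xj - xk); subtraction is XOR in GF(16).
                term = mul(term, mul(x ^ xk, inv(xj ^ xk)))
        total ^= term
    return total

def complete(known):
    """Rebuild all 14 symbols (s1..s7, p1..p7) from any 7 known {locator: value}."""
    return [interpolate(known, x) for x in range(1, 15)]

# Construct a unit whose check symbols start with A, A, A.
word = complete({1: 0x3, 2: 0x1, 3: 0x4, 4: 0x1, 8: 0xA, 9: 0xA, 10: 0xA})
source = word[:7]

# Encoder side: check symbols computed from the seven source symbols alone.
parity = complete({i + 1: s for i, s in enumerate(source)})[7:]

# Decoder side: only s1..s4 and the run p1..p3 are available.
known = {i + 1: source[i] for i in range(4)}
known.update({8 + i: parity[i] for i in range(3)})
restored = complete(known)
print(restored[:7] == source, restored[7:] == parity)  # True True
```

Because any seven symbols of such a code determine the other seven, the same routine serves as encoder and erasure decoder in this sketch.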

FIG. 2(e) shows the encoded result. In FIG. 2(e), the header indicates the number of consecutive symbols, the representative value indicates the value of the consecutive symbols, and the remaining part contains the source data not omitted in FIG. 2(d).

FIGS. 3(a) to 3(d) show the format of the encoded data produced by the encoding method shown in FIG. 1. As shown in FIG. 2(e), the encoded result comprises a header (202), a representative value (204), and a source data part (206). The header (202) indicates the number of consecutive symbols, the representative value (204) indicates the value of the consecutive symbols, and the source data part (206) contains the source data not omitted in FIG. 2(d). Note that the header (202) carries no information about the head position of the consecutive symbols. As described later, the head position is found by decoders provided in a number equal to the possible positions of the run.

The number of bits in the header (202) could simply be made equal to the number of bits in a symbol (4 bits in the example of FIG. 2). In the present invention, however, only three cases are assumed (fewer than two identical symbols, two, and three), and the header (202) is made 1 bit or 2 bits long, which further raises coding efficiency. This is because the probability that four or more identical symbol values occur consecutively in the error correction code is small, so the number of consecutive symbols never needs more than 2 bits.

FIG. 3(b) shows the format when the header is 1 bit ("0"), indicating two identical symbols. Since there are two identical symbols, only s1 to s5 of the seven source symbols (s1 to s7) are encoded.

For example, the source data shown in FIG. 2(a) (4 bits × 7 symbols = 28 bits) is encoded in 25 bits (1-bit header, 4-bit representative value, 20 bits of source data), so the compression ratio (number of encoded bits / number of source bits) is 25/28 = 0.8928.

FIG. 3(c) shows the format when the header is 2 bits ("10"), indicating three identical symbols. Since there are three identical symbols, only s1 to s4 of the seven source symbols (s1 to s7) are encoded.

For example, the source data shown in FIG. 2(a) (4 bits × 7 symbols = 28 bits) is encoded in 22 bits (2-bit header, 4-bit representative value, 16 bits of source data), so the compression ratio is 22/28 = 0.7857.

FIG. 3(d) shows the format when the header is 2 bits ("11"), indicating fewer than two identical symbols. Since there are fewer than two identical symbols, the error correction code is not encoded and only the source data is encoded.

For example, the source data shown in FIG. 2(a) (4 bits × 7 symbols = 28 bits) is encoded in 30 bits, so the compression ratio is 30/28 = 1.0714.

FIGS. 3(b) and 3(c) are cases in which encoding removes 3 bits or 6 bits, whereas FIG. 3(d) is a case in which encoding adds 2 bits. Therefore, compression results when units with a run of two symbols are at least 2/3 as frequent as units with no run, or when units with a run of three symbols are at least 1/3 as frequent as units with no run.

The compression method of the present invention makes it possible to compress source data that known compression methods can hardly compress any further, such as image data files (files with the .BMP extension), text files (word processor data files), program files (executable files and the like), and already compressed files (mpg, jpg, and the like), by giving that data redundancy. The omitted data is restored through the properties of the error correction code.

FIG. 4 is a flowchart showing a decoding method according to the present invention. The method shown in FIG. 4 includes a header detection process (S402), a header value determination process (S404), a first decoding process (S406), and a second decoding process (S408).

The header detection process (S402) detects, in the encoded data, a header as shown in FIGS. 3(b) to 3(d), and uses the header value to separate the header, the representative value, and the encoded source data.

The header value determination process (S404) determines from the header value whether the first decoding process (S406) or the second decoding process (S408) is to be performed.

When the first bit of the header is "0", the only possibility is the header shown in FIG. 3(b), so the decoder recognizes that two symbols are consecutive, that the next 4 bits are the representative value, and that the following 20 bits are the encoded source data. This case is decoded through the first decoding process (S406).

When the first bit of the header is "1", the case is that of FIG. 3(c) or FIG. 3(d), and the two are distinguished by the second bit of the header. If the second bit is "0", the case is that of FIG. 3(c): the decoder recognizes that three symbols are consecutive, that the next 4 bits are the representative value, and that the following 16 bits are the encoded source data. This case is also decoded through the first decoding process (S406).

If the second bit of the header is "1", the case is that of FIG. 3(d): the decoder recognizes that there are no consecutive symbols and that the next 28 bits are the encoded source data. This case is decoded through the second decoding process (S408).
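The header rules above amount to a small prefix code ("0", "10", "11"). A minimal sketch of splitting one encoded unit according to those rules; representing the bit stream as a Python string of '0' and '1' characters and the name `parse_block` are choices made only for this illustration.

```python
def parse_block(bits: str):
    """Split one encoded unit into (run_length, representative, payload, rest).

    run_length is 2 or 3 for the first decoding process and 0 when there is
    no run; representative is None in that last case.
    """
    if bits[0] == "0":                       # FIG. 3(b): run of two
        header_len, run, payload_len = 1, 2, 20
    elif bits[1] == "0":                     # FIG. 3(c): run of three
        header_len, run, payload_len = 2, 3, 16
    else:                                    # FIG. 3(d): no run
        header_len, run, payload_len = 2, 0, 28
    pos = header_len
    rep = None
    if run:
        rep = int(bits[pos:pos + 4], 2)      # 4-bit representative value
        pos += 4
    payload = bits[pos:pos + payload_len]
    return run, rep, payload, bits[pos + payload_len:]

run, rep, payload, rest = parse_block("10" + "1010" + "0001" * 4)
print(run, rep, len(payload), repr(rest))  # 3 10 16 ''
```

The returned `rest` lets a caller loop over a stream of consecutive units.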

The first decoding process (S406) is the decoding process for the case in which two or three identical symbols are consecutive. It obtains two or three symbols having the representative value, determines the head position of the consecutive symbols, and performs error correction according to the determined head position to restore the source symbols omitted during encoding.

The head position of the consecutive symbols is determined by a plurality of decoders, as follows. When the error correction code consists of seven symbols (p1 to p7), there are only five head positions at which three consecutive symbols can occur, namely p1 to p5. For example, in the case shown in FIG. 3(c), error correction is possible only when the three consecutive symbols are taken to start at p1, and is not possible when they are taken to start at p2 to p5.

Therefore, as shown in FIG. 5, the representative value and the encoded source data are fed to decoders whose head positions are p1 to p5 respectively and which each assume three consecutive symbols of the representative value at that head position. Selecting the one decoder in which error correction is performed correctly yields the decoded source data from its output. In the case shown in FIG. 3(c), for example, only the decoder whose head position is p1 performs correct error correction.

Similarly, when the error correction code consists of seven symbols (p1 to p7), there are only six head positions at which two consecutive symbols can occur, namely p1 to p6.

Therefore, the representative value and the encoded source data are fed to decoders whose head positions are p1 to p6 respectively and which each assume two consecutive symbols of the representative value at that head position. Selecting the one decoder in which error correction is performed correctly yields the decoded source data from its output.

The second decoding process (S408) is the decoding process for the case in which there are no identical consecutive symbols. It restores the source data from the 28 bits that follow the header.

FIG. 5 schematically illustrates the decoding process shown in FIG. 4. FIG. 5(a) shows the encoded form of the source data of FIG. 2(a) and is identical to what is shown in FIG. 2(e).

FIG. 5(b) shows the source data and error correction code rebuilt from what is shown in FIG. 5(a). In FIG. 5(b), the darkly shaded part (p1 to p3) is the known run of consecutive symbols, and the lightly shaded part (s5 to s7, p4 to p7) is the part extended from the consecutive symbols. The lightly shaded part (s5 to s7, p4 to p7) is first overwritten with the representative value and then restored from the known consecutive symbols.

FIG. 5(c) shows the result restored by performing error correction. Because the symbols (p1 to p3) shown in FIG. 5(b) are genuine symbols with the symbol value "A", the symbols extended from them (s5 to s7, p4 to p7) can be restored.

FIG. 5(d) shows the source data separated from the result shown in FIG. 5(c), and it can be seen to be identical to what is shown in FIG. 2(a).

FIG. 6 shows the compression ratios measured when the compression method of the present invention was applied to various files. The target files and the compression ratios obtained were: an MPEG data file (0.8693), a PowerPoint file (0.8823), a word processor file (0.8091), an executable file (0.8885), a compressed file with the zip extension (0.8911), a compressed file with the JPG extension (0.8852), an Excel file (0.7883), and a text file with the TXT extension (0.8464).

In each target file, cases in which two or more symbols are consecutive occur more often than cases in which no symbols are consecutive, so a compression effect is obtained. In the MPEG file shown in FIG. 7, the number of cases with no consecutive symbols is 17,511, the number with two consecutive symbols is 305,660, and the number with three consecutive symbols is 128,469. In the cases with no consecutive symbols, encoding increases 491,428 bits to 526,530 bits. In the cases with two consecutive symbols, however, 8,558,480 bits are reduced to 7,641,500 bits, and in the cases with three consecutive symbols, 3,597,132 bits are reduced to 2,826,318 bits. As a result, the 12,647,040 data bits before compression become 10,994,348 bits after compression, giving a compression ratio of 0.8693.
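The MPEG figures above can be checked with a few lines of Python. The sketch below uses only the bit counts and case counts stated in the text; the variable names are illustrative. It also divides the bit totals by the case counts for the two-symbol and three-symbol cases, which shows that each 28-bit block shrinks to 25 and 22 bits respectively. The no-run case is left out of the per-block check because its bit totals divide evenly (28 bits growing to 30) only for 17,551 blocks, not the 17,511 stated.

```python
# (bits before encoding, bits after encoding) for each case of the MPEG file
cases = {
    "no run":   (491_428, 526_530),
    "run of 2": (8_558_480, 7_641_500),
    "run of 3": (3_597_132, 2_826_318),
}

before = sum(b for b, _ in cases.values())
after = sum(a for _, a in cases.values())
ratio = after / before
print(before, after, round(ratio, 4))   # 12647040 10994348 0.8693

# Bits per block, from the stated case counts
counts = {"run of 2": 305_660, "run of 3": 128_469}
per_block = {k: (cases[k][0] // n, cases[k][1] // n) for k, n in counts.items()}
print(per_block)   # {'run of 2': (28, 25), 'run of 3': (28, 22)}
```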

For a storage device that stores 10 Gbits, this means that applying the encoding method of the present invention saves about 13% of the storage space (1.3 Gbits).

As shown in FIG. 6, the compression method of the present invention also achieves a compression effect on general compressed files, program executable files, compressed image files, and the like.

In the present invention, the symbols omitted during encoding are restored exactly by means of the restored error correction code, but they cannot be restored exactly if an error occurs in the encoded data. When data is stored in a data storage device, however, a separate error correction code is added before storage. Because the data compressed according to the present invention is therefore stored together with a separate, strong error correction code, recoverability is ensured.

It should be noted that, in the encoding method of the present application, the error correction code is added not for error correction but for data compression. Adding the error correction code transforms the source data so that it becomes more repetitive and can therefore be compressed, and the transformed data is restored to the original data using the error correction capability of the error correction code. In other words, the encoding method of the present application adds part of the error correction code and, in exchange, omits part of the source data. The more source data is omitted, the higher the compression.
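The exchange described above, keeping part of the error correction code while omitting part of the source data, can be sketched as follows. This is a minimal illustration of the framing only, under several assumptions: `parity_fn` is a hypothetical stand-in for the error correction encoder (for example a Reed Solomon encoder) and is not implemented; the header is assumed to carry the run length m, with 0 meaning that no run was found; and the omitted source symbols are assumed to be the last m, that is, those adjacent to the error correction code.

```python
from typing import Callable, List, Optional

def find_run(parity: List[int], m: int) -> Optional[int]:
    """Return the value of the first run of at least m identical symbols (m >= 2), else None."""
    run = 1
    for i in range(1, len(parity)):
        run = run + 1 if parity[i] == parity[i - 1] else 1
        if run >= m:
            return parity[i]
    return None

def frame_block(source: List[int],
                parity_fn: Callable[[List[int]], List[int]],
                m: int = 2) -> List[int]:
    """Encode one block of N source symbols into header, [run value], source symbols."""
    parity = parity_fn(source)             # the M error correction symbols
    value = find_run(parity, m)
    if value is None:
        return [0] + list(source)          # no run: header plus all N source symbols
    # run found: header, run value, and the N-m source symbols that are kept
    return [m, value] + list(source[:len(source) - m])
```

For a block of 7 source symbols whose parity contains a run of three equal symbols, `frame_block(..., m=3)` returns 6 symbols (header, run value, and 4 source symbols) in place of the original 7; a block without a run grows by one header symbol.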

As described above, the encoding method according to the present invention encodes only part of the error correction code added to the source data and part of the source data, and the compression improves as the redundancy of the error correction code increases.

Claims

A method of encoding source data consisting of N symbols, the method comprising:

adding an error correction code consisting of M symbols to the source data consisting of N symbols;

a first encoding process of encoding, when m identical symbols are consecutive in the error correction code, a header indicating the number of consecutive symbols, the value of the consecutive symbols, and N-m symbols of the source data; and

a second encoding process of encoding, when there is no consecutive representative value in the error correction code, a header indicating that there are no consecutive symbols and the N symbols of the source data.

The encoding method of claim 1, wherein the error correction code is a Reed Solomon code.

The encoding method of claim 1, wherein N and M are equal.

The encoding method of claim 3, wherein m is 2.

The encoding method of claim 1, wherein m is 2.

The encoding method of claim 1, wherein the symbols of the source data omitted in the first encoding process are those adjacent to the error correction code.

The encoding method of claim 1, wherein the encoded data resulting from the first encoding process has a header indicating the number of consecutive symbols, the value of the consecutive symbols, and the symbols that remain when m symbols are excluded from the M symbols constituting the source data.

The encoding method of claim 1, wherein the encoded data resulting from the second encoding process has a header indicating that there are fewer than m consecutive symbols and the M symbols constituting the source data.

A method of restoring source data from data encoded by an encoding method that comprises adding an error correction code consisting of M symbols to source data consisting of N symbols, encoding, when m identical symbols are consecutive in the error correction code, a header indicating the number of consecutive symbols, the value of the consecutive symbols, and N-m symbols of the source data, and encoding, when there is no consecutive representative value in the error correction code, a header indicating that there are no consecutive symbols and the N symbols of the source data, the restoring method comprising:

a header detection process of detecting the header in the encoded data;

a header value determination process of determining, from the value of the detected header, whether consecutive symbols are present;

a first decoding process of, when consecutive symbols are present, obtaining the consecutive symbols having the representative value, determining the head position of the consecutive symbols, and performing error correction according to the determined head position to restore the source data including the symbols omitted during encoding; and

a second decoding process of, when no consecutive symbols are present, restoring the source data from the data excluding the header.

The decoding method of claim 9, wherein, in the first decoding process,

the representative value and the encoded source data are input to a plurality of decoders, each holding the consecutive symbols starting from a different possible head position, only the decoder in which error correction is performed correctly is selected from among these decoders, and the decoded source data is obtained from its output.

An encoding method comprising adding an error correction code to source data, and extracting and encoding only part of the error correction code and part of the source data.

The encoding method of claim 11, wherein the error correction code is a Reed Solomon code.

The encoding method of claim 11, wherein the part of the error correction code consists of symbols repeated m or more times among the symbols constituting the error correction code, and the part of the source data is the source data with m symbols omitted.

The encoding method of claim 13, wherein the resulting encoded data has a header indicating the number of consecutive symbols, the value of the consecutive symbols, and the symbols that remain when m symbols are excluded from the M symbols constituting the source data.

The encoding method of claim 14, wherein, when there are no m consecutive symbols in the error correction code, the resulting encoded data does not include the value of consecutive symbols.