JP6425709B2

JP6425709B2 - Data decompression using a pre-expanded dictionary during decompression

Info

Publication number: JP6425709B2
Application number: JP2016509219A
Authority: JP
Inventors: コージン、ダニエル; リン、メイチー、マギー; マレ、アーサー; マカリスター、ティモシー、イー; スリンガー、ニゲル、ジー; トブラー、ジョン、ビー; ジュウ、ウエン、ジエ
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 2013-08-27
Filing date: 2014-07-25
Publication date: 2018-11-21
Anticipated expiration: 2034-07-25
Also published as: US8902087B1; WO2015029329A1; JP2016533046A

Description

Embodiments of the present invention relate to compression and decompression and, more particularly, to using pre-expansion of a compression dictionary and self-decompressing data elements to improve decompression performance.

Modern software applications make extensive use of compression to reduce the size of data stored on disk or in memory, or transmitted over a network. Compression techniques may be lossless or lossy. Lossy techniques can be used where losing some data relative to the original source data is acceptable. For example, a photograph presented by a low-resolution system (e.g., a mobile phone display) does not necessarily need all of the detail provided by a higher-resolution source photograph (e.g., a high-resolution family photograph). The mobile phone essentially displays the substance of the higher-resolution family photograph, but with the reduced processing and storage requirements made possible by using a lower-resolution image.

Lossless techniques for compressing data can be used where loss of data would be harmful (e.g., loss of a digit of a bank account, loss of a social security number, or loss of data for a mission-critical system such as an emergency response system). The use of compression and decompression techniques is generally a trade-off between CPU processing resources and disk space or data transmission resources, on both the compression side and the decompression side. In other words, high compression efficiency typically requires more processing resources and less storage space, while low compression efficiency typically requires fewer processing resources and more storage or transmission resources. Traditional approaches to increasing efficiency have focused on the compression algorithm.

An object of the present invention is to provide a method, a system, and a computer program product for decompressing compressed data with improved decompression performance.

According to one embodiment of the present invention, a computer-implemented method for decompressing compressed data is provided. A first decompression dictionary is analyzed. The first decompression dictionary includes a plurality of chains, each having uncompressed data portions distributed non-contiguously within the first decompression dictionary based on an addressing scheme, where the uncompressed data portions of each chain form a corresponding uncompressed version of compressed data. A second decompression dictionary is generated by combining the uncompressed data portions of each chain in the first decompression dictionary to form the uncompressed versions of the compressed data, and instructions for decompressing the compressed data are inserted into the second decompression dictionary. The compressed data is decompressed by applying the compressed data to the second decompression dictionary. Embodiments of the present invention further include a system and a computer program product for decompressing compressed data in substantially the same manner as the method described above.

Generally, like reference numerals in the various figures are used to designate like components.

FIG. 1 is a diagrammatic illustration of an example computing environment for use with an embodiment of the present invention.

FIG. 2 is a procedural flowchart illustrating a manner of generating a second decompression dictionary from an initial decompression dictionary of a compression scheme according to an embodiment of the present invention.

FIG. 3 is a procedural flowchart illustrating a manner of decompressing compressed data using the second decompression dictionary according to an embodiment of the present invention.

Although many data compression techniques exist along with their corresponding decompression schemes, the techniques described herein are particularly useful for symbol or token substitution lossless compression techniques such as the Ziv-Lempel algorithm. The Ziv-Lempel algorithm achieves efficiency by replacing a series of symbols with a single code, albeit with additional encoding. For example, the letter "A" can be replaced by the number 1, "B" by the number 2, and so on through the 26 letters of the English alphabet, but this is still a one-to-one symbol substitution.

However, in any given language, certain series of symbols are common. For example, the word "the" comprises three symbols, or a single token, and is used extensively in English. In this case, the entire word "the" can be represented by a single symbol such as the number "1", which requires only a single bit (or a single byte if stored as a character), thereby reducing the three-symbol word to a single symbol with a 3-to-1 (3:1) substitution efficiency. Bitwise codes can be used to obtain high efficiency (e.g., the minimum number of bits needed to encode a token). In this example, the number "1" is the substitution code for the token "the". Furthermore, when grammatical structure is taken into account, the word "the" can be identified as being surrounded by at least two spaces whenever it is not part of another word, such as "anthem", in which the symbols "the" are wholly contained (e.g., an"the"m). Thus, the word "the" as found in English usually takes the form of "the" surrounded by two spaces (i.e., the word " the "), for a total of five symbols, and can still be replaced by a single code such as the number "1", thereby reducing the five symbols in this token to a single code with a 5:1 substitution efficiency. Further efficiencies can be achieved when phrases are repeated in the source document.

The original tokens and their corresponding substitution codes are stored in a dictionary. In this regard, a simplified view of compression by simple substitution is that, during decompression, the code "1" is "looked up" and replaced with the token "the". In actual operation, however, the input data stream is usually processed sequentially (e.g., consider compressing a large reference dictionary), and each new token is added as it is encountered. A compression dictionary is therefore not merely a simple lexicon; it is a complex encoding of the input tokens from the uncompressed data, together with an addressing of the tokens within the dictionary, that ultimately allows the original data to be recovered without loss. This complexity will become apparent through the example presented below of a decompression technique for Ziv-Lempel compression.
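The way a dictionary grows as new tokens are encountered can be illustrated with a minimal LZ78-style sketch in Python. This is a textbook simplification and not the encoding used by the embodiments; the function names `lz78_compress` and `lz78_decompress` and the pair-based output format are assumptions made purely for illustration.

```python
def lz78_compress(text):
    """Return (prefix_index, next_char) pairs; index 0 means 'no prefix'."""
    dictionary = {}   # phrase -> 1-based index, grown while the input is read
    output = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch                      # keep extending a known phrase
        else:
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[phrase + ch] = len(dictionary) + 1   # add the new token
            phrase = ""
    if phrase:                                # trailing phrase already known
        output.append((dictionary[phrase], ""))
    return output


def lz78_decompress(pairs):
    """Rebuild the text by replaying the same dictionary growth."""
    phrases = [""]                            # index 0 is the empty prefix
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)
```

Each output pair refers back to an earlier dictionary entry, so recovering a phrase requires that earlier entry to be available, which hints at why a decompression dictionary ends up as a set of linked entries rather than a flat word list.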

For ease of description, the techniques described herein are explained with respect to a widely used base algorithm such as the Ziv-Lempel compression scheme. Many of the techniques described herein can be carried over to other compression and decompression techniques (e.g., context tree weighting or byte pair encoding). In some academic circles, Ziv-Lempel is also referred to as Lempel-Ziv (LZ). As used herein, Ziv-Lempel, Lempel-Ziv, and LZ all refer to the Ziv-Lempel algorithm or its variants and equivalents.

Since its inception, Ziv-Lempel has advanced through refinements that tailor the algorithm to particular applications (e.g., by adding Huffman coding techniques), forming a family of LZ algorithms. For example, the LZ77 and LZ78 algorithms are used for different applications. LZ77 is efficient when used with .ZIP file compression and the Stacker disk compression algorithm. In another variation, LZ78 is efficient when used for data transmission over a network using the V.42 modem standard, or for image compression using Graphics Interchange Format (GIF) compression. In this way, the Ziv-Lempel algorithm has evolved over time.

Traditional compression techniques are generally oriented more toward compression efficiency than toward decompression efficiency. Embodiments of the present invention improve decompression efficiency. An example environment for use with embodiments of the present invention is shown in FIG. 1. Specifically, the environment includes one or more server systems 10 and one or more client or end-user systems 14. Server system 10 and client system 14 may be remote from each other and communicate over a network 12. The network may be implemented by any number of any suitable communication media (e.g., wide area network (WAN), local area network (LAN), Internet, intranet, etc.). Alternatively, server system 10 and client system 14 may be local to each other and communicate via any suitable local communication medium (e.g., local area network (LAN), hardwire, wireless link, intranet, etc.).

Server system 10 and client system 14 may be implemented by any conventional or other computer system, preferably equipped with a display or monitor (not shown), a base (e.g., including at least one processor 15, one or more memories 35, and/or internal or external network interfaces or communication devices 25 (e.g., modem, network card, etc.)), optional input devices (e.g., a keyboard, mouse, or other input device), and any commercially available and custom software (e.g., server/communication software, string decompression recording module, instruction encoding module, browser/interface software, etc.).

Client system 14 enables a user to compress data sets (e.g., files, documents, pictures, etc.) to server system 10, or to retrieve compressed data sets from server system 10. As described below, the server system includes a decompression module 16 to decompress compressed data (e.g., files, documents, pictures, etc.) that has already been compressed (e.g., by the Ziv-Lempel algorithm), and an instruction encoding module 20 to embed instructions for self-decompression once the compressed data set is retrieved (e.g., by a data processing system or a user).

Database system 18 may store various forms of compression-based information (e.g., compressed data and databases, decompression dictionaries, etc.). The database system may be implemented by any conventional or other database or storage unit, may be local to or remote from server system 10 and client system 14, and may communicate via any suitable communication medium (e.g., local area network (LAN), wide area network (WAN), Internet, hardwire, wireless link, intranet, etc.).

By way of a brief summary of embodiments of the present invention, a data set compressed in a traditional manner is pre-expanded by the techniques described herein. In the examples described herein, compressed data is expanded to a fixed length, to its original length, or to some other convenient length that aids decompression efficiency. With respect to Ziv-Lempel compression, decompression is accomplished by expanding a dictionary, which is essentially a "translation" dictionary defined during compression that converts smaller reference codes back to their original corresponding tokens (e.g., the code number "1" is converted to "the", as described above). The expanded corresponding tokens are concatenated in a specified order to form the original source data, thereby reproducing the original data that was compressed.

Alternatively, one or more client systems 14 may perform decompression when operating as a stand-alone unit. In a stand-alone mode of operation, the client system stores or has access to the data (e.g., compressed data and databases, decompression dictionaries, etc.), and includes decompression module 16 and instruction encoding module 20 to insert instructions for storage and decompression. In essence, the LZ dictionary is partially expanded to obtain a second dictionary. Use of the second decompression dictionary during decompression ultimately results in recovery of the original data set.

A graphical user interface (e.g., GUI, etc.) or other interface (e.g., command line prompt, menu screen, etc.) of client system 14 may solicit information from a corresponding user pertaining to the desired data and analysis, and may provide reports including analysis results (e.g., compression efficiency, pre-expansion efficiency, CPU usage during final decompression, etc.).

Modules 16 and 20 may include one or more modules or units to perform the various functions of the embodiments of the present invention described below. The various modules (e.g., decompression module 16, instruction encoding module 20, etc.) may be implemented by any combination of any number of software and/or hardware modules or units, and may reside within memory 35 of the server and/or client systems for execution by processor 15. In this regard, as described below, decompression module 16 performs decompression, and instruction encoding module 20 inserts into the second dictionary the self-decompression instructions that produce the resulting decompressed data.

The manner in which decompression module 16 and instruction encoding module 20 perform pre-expansion and decompression (e.g., via server system 10 and/or client system 14) according to an embodiment of the present invention is illustrated in FIGS. 2 and 3. FIG. 2 shows an example pre-expansion process, while FIG. 3 shows the decompression process.

Specifically, referring to FIG. 2, Ziv-Lempel compression is performed at step 200, producing a compressed data set and an associated decompression dictionary. The resulting compressed data set may include an array of codes (which replace the original data) and the associated decompression dictionary, which maps to the array of codes or provides a combined token and dictionary array. The decompression dictionary contains expanded, decompressed versions of the compressed tokens of data segments from the original data, where the decompressed versions of the tokens are chained together by an addressing scheme to form a decompressed data segment, as described below. Compression 200 may be performed using traditional Ziv-Lempel techniques and/or modified techniques that gain further efficiency from the techniques described herein. Decompression of the compressed data set begins at step 210 (e.g., performed by decompression module 16), which generates a second, modified decompression dictionary. Both the original decompression dictionary and the second modified decompression dictionary may be stored for decompression, according to the capabilities of the terminal device or according to system rules or requirements.

For each token of a data segment compressed during compression of the data set, the token is processed at step 220 to produce decompressed data. An example Ziv-Lempel decompression is presented in detail below. In the example, a compressed entry is encoded in 12 bits (e.g., as three 4-bit hexadecimal nibbles), which are used to derive the corresponding 8-byte (64-bit) expansion for the compressed token. The expanded data segment (comprising a plurality of tokens) contains 29 bytes or characters and is sometimes referred to as a string (e.g., a 29-character expanded string). The memory locations used for decompression in this example start at memory address 0149_64393014x, where "x" denotes hexadecimal (base 16) notation in which each digit is 4 bits (e.g., 0-Fx), and memory locations are expressed in terms of byte (8-bit) quantities. By way of example, the decompression dictionary is loaded into memory at address 00000048_B6C88000x. A portion of example compressed data is shown below, with the starting address on the left and the contents of four subsequent memory locations (e.g., each containing 32 bits or 4 bytes) to the right of each address.

The first six bytes starting at address 0149_64393014x are a header, which in this case has the value 04002001CB01x and is shown above with a dotted underline. The next 12 bits after the header, 41Ex, are the 12-bit code for the first token (solid underline) and are essentially a pointer into the decompression dictionary. With 8-byte expansions, the memory offset into the decompression dictionary is the code multiplied by 8 (i.e., 41Ex × 8, or 20F0x). This value is added to the decompression dictionary starting location, 48_B6C88000x (leading zeros omitted), giving 20F0x + 48B60C8000x = 48B60CA0F0x. Example decompression dictionary data at this location is shown below, with the starting address on the left and four memory stores to the right of each address.
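The offset arithmetic above can be checked with a short Python sketch. The helper name `entry_address` and the constant `ENTRY_SIZE` are hypothetical, and the base address is taken as the operand written in the worked sum (48B60C8000x); treat it as an illustrative value only.

```python
ENTRY_SIZE = 8  # bytes per dictionary entry (the 8-byte expansion in the example)


def entry_address(code, dictionary_base):
    """Address of the dictionary entry selected by a 12-bit code."""
    if not 0 <= code <= 0xFFF:
        raise ValueError("code must fit in 12 bits")
    return dictionary_base + code * ENTRY_SIZE


BASE = 0x48B60C8000                   # base operand as written in the worked sum
first_entry = entry_address(0x41E, BASE)
print(f"{0x41E * ENTRY_SIZE:X}x")     # offset of the first token's entry
print(f"{first_entry:X}x")            # absolute address of that entry
```

Because every entry has the same size, the 12-bit code acts as an array index, and a lookup costs one multiply and one add.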

An 8-byte token expansion is present at this address and is underlined (the data in the latter two memory locations is not relevant to this example). The token expansion starts with the hexadecimal digit "A", followed by the 12-bit expansion address 419x, followed by 5 bytes (40 bits) of uncompressed data, 0030003000x. The token in this example therefore contains not only uncompressed data but also metadata (i.e., the 4-bit instruction "A" and the expansion address). So far, the uncompressed data in the expanded data segment is the value of the token, or 0030003000x. By convention, because the first four bits of the token expansion are the value A rather than zero, the data segment expansion is considered incomplete, and expansion of the data segment continues using the 12-bit expansion address 419x. The next memory offset into the decompression dictionary is 419x × 8, or 20C8x. This value is added to the decompression dictionary starting location, 48_B6C88000x, giving 20C8x + 48B60C8000x = 48B60CA0C8x. The decompression dictionary data at that memory location may be, for example, as shown below.

An 8-byte token expansion for the next token in the data segment is present at this address. The token expansion starts with the hexadecimal digit "A", followed by the 12-bit expansion address 414x, followed by 5 bytes of uncompressed data for this token, 3400310030x. This data is concatenated with the uncompressed data in the expanded data segment. So far, the uncompressed data in the expanded data segment is 34003100300030003000x. Because the first four bits of this token expansion are not zero, expansion continues, and the 12-bit expansion address 414x is used to generate a new address, 48B60CA0A0x, in the manner described above. The decompression dictionary data at that memory location may be, for example, as shown below.

An 8-byte token expansion for the next token is present at this address. The token expansion starts with the hexadecimal digit "A", followed by the 12-bit expansion address 414x, followed by 5 bytes of uncompressed data, 0500006000x. This data is concatenated with the uncompressed data in the expanded data segment. So far, the uncompressed data in the expanded data segment is 050000600034003100300030003000x. The process continues for subsequent tokens until the first four bits of a subsequent 8-byte token expansion are zero. By way of example, the fourth through sixth expansion iterations yield the following tokens.

In this example, the sixth iteration yields an 8-byte token expansion whose first four bits are zero (underlined), indicating that expansion for this data segment is complete. The next digit is not an address but the value 4 (underlined), which indicates that 4 bytes of data remain. In this example, the four bytes are 00120018x (underlined). The final concatenated, or expanded, value for this data segment is:

41Ex = 001200183E00400042002C003C00005000600034003100300030003000x

The expansion for the data segment is stored in the second modified decompression dictionary and can be retrieved based on the memory location in the initial token expansion. For the example above, 41Ex, the original memory location from the first iteration, can be used as the data entry point for the expanded data segment in the second modified decompression dictionary. In other words, the second modified decompression dictionary basically operates as a lookup table for retrieving expanded data segments. The process continues at step 230 until all data segments (and tokens) in the decompression dictionary have been expanded and stored in the second modified decompression dictionary.

The data segment above has a known length of 29 bytes. To achieve compression efficiency, each data segment or token may be used more than once when decompression is performed. To decompress the data segments, the expanded data segments in the second decompression dictionary can be concatenated in the same manner as described above.
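The benefit of expanding every chain once, ahead of time, can be sketched in Python with a simplified model. The first dictionary is modeled here as a mapping from a code to a hypothetical `(next_code, data)` pair rather than as raw 8-byte entries, and the names `expand`, `build_second_dictionary`, and `decompress` are assumptions made for illustration.

```python
def expand(first, code):
    """Walk one chain; data from later links goes in front of earlier data."""
    parts = []
    while code is not None:
        next_code, data = first[code]
        parts.append(data)
        code = next_code
    return b"".join(reversed(parts))


def build_second_dictionary(first):
    """Pre-expand every chain once so later lookups need no chain walking."""
    return {code: expand(first, code) for code in first}


def decompress(codes, second):
    """Decompression becomes one lookup per code plus concatenation."""
    return b"".join(second[code] for code in codes)


# Toy first dictionary: code -> (next code in the chain or None, data portion).
first = {
    1: (None, b"th"),
    2: (1, b"e "),
    3: (2, b"cat "),
    4: (None, b"sat"),
}
second = build_second_dictionary(first)
print(decompress([2, 3, 4], second))
```

A code that appears many times in the compressed data is walked only once, when the second dictionary is built; every later occurrence is a direct lookup.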

To provide the decompressed data segments, one or more instructions may be generated and stored in the second modified decompression dictionary at step 235 (e.g., by instruction encoding module 20). In one example, a move (or copy) instruction may be generated and stored at step 235. A move instruction moves data from one memory location to another. For example, decompressed data segments in the decompression dictionary are concatenated by moving them together into a common storage area.

Example move instructions are described below in terms of assembly, or assembler, language (e.g., 360 assembly or x86 assembly). Assembly language can be tied to the underlying processor architecture and represents processor instructions using pseudo-English-like code. For speed of execution, many computer compilers and processing subroutines are written in assembly language. Accordingly, the logic illustrated by the code examples below can be adapted to the processing platform. In one example, the assembly language instruction Move Character (MVC) is placed or stored, at step 235, at a location (in memory or on disk) adjacent to the location of each decompressed data segment in the second modified decompression dictionary. Depending on the computing environment, other instruction codes may be used to copy or move data (e.g., MVI, MHHVI, MOV, etc.). Example MVC instructions are shown below.

The first example MVC instruction moves data (e.g., characters) from a source storage location to a target storage location. The source and target memory locations can be defined within a computer program. The second example MVC instruction moves 40 data elements (e.g., characters) from the source storage location to the target storage location. The MVC instruction provides for moving up to a maximum of 256 data elements (e.g., characters), which allows data segments to be concatenated to ultimately produce the decompressed data set. In this example, TARGET may be the common storage location described above, and SOURCE may be a decompressed data segment in the second modified decompression dictionary (e.g., generated by decompression module 16).

In this example, at step 235 the memory location or relative memory location of each MVC instruction placed adjacent to a decompressed data segment in the second modified decompression dictionary is also stored in a separate instruction jump table. The 29-byte (29-character) data segment from the example above can be moved by the corresponding MVC instruction (MVC TARGET(29),SOURCE). The instruction jump table allows the compressed data to be decompressed on the first iteration.

To further clarify, the first token is processed iteratively to produce a first expanded data segment. The first expanded data segment (e.g., the 29-byte string) and an MVC instruction (or a representation thereof) are stored in the second modified decompression dictionary. The first token and a memory offset of zero (i.e., zero for the first entry) are stored in the jump table. The memory offset is then advanced to a current offset that accounts for the length of the string and the MVC instruction as stored in the second modified decompression dictionary. The next (second) token is processed iteratively to produce a second expanded string, and the second token and the current memory offset are stored as the next entry in the jump table. The memory offset is advanced again to account for the length of the second string and its MVC instruction, and the process continues until token processing is complete.
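The offset bookkeeping described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the name `build_jump_table`, the constant `MOVE_SIZE` (taken here as 6 bytes per stored move instruction), and the choice to place the instruction placeholder before its segment are all hypothetical and not taken from the patent.

```python
from typing import Dict, Tuple

MOVE_SIZE = 6  # assumed size in bytes of one stored move instruction


def build_jump_table(expanded: Dict[int, bytes]) -> Tuple[bytes, Dict[int, int]]:
    """Lay out (move instruction, segment) pairs and record each token's offset."""
    dictionary = bytearray()
    jump_table: Dict[int, int] = {}
    offset = 0  # the first entry is stored at offset zero
    for token, segment in expanded.items():
        jump_table[token] = offset
        dictionary += bytes(MOVE_SIZE)  # placeholder for the move instruction
        dictionary += segment           # the pre-expanded string
        # Advance past the instruction and the string just stored.
        offset += MOVE_SIZE + len(segment)
    return bytes(dictionary), jump_table


dictionary, jump_table = build_jump_table({0x41E: b"a" * 29, 0x41F: b"b" * 10})
```

With a 29-byte first segment, the second token lands at offset 35 (6 + 29), so each stored offset already accounts for the lengths of all earlier entries.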

Accordingly, the jump table holds a list of tokens and the corresponding pointers or memory address offsets into the second modified decompression dictionary, where each pointer inherently accounts for the expanded string lengths (i.e., through the advancing offset). The source address for an MVC instruction is therefore available from the jump table, while the target address is supplied by the controlling software, which provides the designated target address for the MVC instruction. Processing the entries in the jump table enables efficient concatenation of a series of expanded strings. An exemplary assembly language sample for decompression with the second modified decompression dictionary is shown below and described in connection with FIG. 3. The code sample includes a jump table generated as described above.

At step 240 it is determined whether all data segments, or a given number of data segments, have been expanded. If not all data segments have been expanded, the process repeats from step 220. Otherwise, the process continues at 245 and proceeds to FIG. 3. At this point, pre-expansion of the data segments in the compressed data set is complete. Decompression of the data set can be performed at a later time, when the original source data is retrieved.

The manner in which the second modified decompression dictionary is used for decompression is described in connection with FIG. 3. The decompression process can continue from FIG. 2 at 250, or can be performed when the original source data is requested (e.g., by decompression module 16). Decompression starts at step 255. At step 260, the initial token of a data segment is obtained from the compressed data. The token may include a pointer into the second modified decompression dictionary (e.g., 41Ex described above). At step 270, the corresponding expansion instruction is executed. Continuing the jump table example described above in connection with FIG. 2, the corresponding (first) MVC instruction is located in the jump table and executed. An assembly code segment having a jump table and an exemplary modified decompression dictionary is listed below.

The following code processes all 12-bit coded entries for the data in a column or record of a pre-expanded database table. The code sample uses 31-bit addressing, but 48-bit addressing, 64-bit addressing, or any other suitable existing or future addressing scheme may be used with logic similar to the 31-bit code presented here. The upper portion of the code sample contains exemplary code that concatenates the pre-expanded data segments. The lower portion begins with the dictionary entries, or "DICTENTS", which contain the jump table and the pre-expanded dictionary entries. The DICTENTS code and data can be the result of a dictionary that was previously pre-expanded and stored in the second modified decompression dictionary (e.g., as a result of the operations described in connection with FIG. 2). The dictionary entries, comprising DICTENTS-labeled code (e.g., the ENT labels) and assembly language code (e.g., the MOV labels), are loaded into memory before the upper code segment is executed or instantiated. The expansion code and the dictionary entries can reside in different areas of memory, and are listed below.
* R0 contains the count of 12-bit dictionary entries to process
* R1 points to the next compressed column byte (source)
* R3 points to the target (expanded) data
* R9 points to the generated dictionary information ("DICTENTS")
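To make the notion of 12-bit coded entries concrete, the following Python sketch pulls a given count of 12-bit tokens out of a byte string. The packing is an assumption made for illustration (tokens stored back to back, most significant bits first, so that two tokens occupy three bytes); the text does not specify the layout, and `unpack_12bit` is a hypothetical name.

```python
from typing import List


def unpack_12bit(data: bytes, count: int) -> List[int]:
    """Extract `count` 12-bit tokens, assuming big-endian back-to-back packing."""
    tokens = []
    bit_pos = 0
    for _ in range(count):
        byte_idx, bit_in_byte = divmod(bit_pos, 8)
        # Each token spans two adjacent bytes under this packing.
        pair = (data[byte_idx] << 8) | data[byte_idx + 1]
        if bit_in_byte == 0:
            tokens.append(pair >> 4)      # token starts on a byte boundary
        else:
            tokens.append(pair & 0x0FFF)  # token starts at the low half-byte
        bit_pos += 12
    return tokens


tokens = unpack_12bit(bytes.fromhex("41EFFE"), 2)  # 0x41E followed by 0xFFE
```

Here `count` plays the role the comments above give to R0 and `data` the role of the source pointed to by R1.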

The DICTENTS code and data form a jump table with entry ("ENT") labels enumerated from 000x to FFFx (i.e., ENT000 to ENTFFF). The ENT values from 000x to FFFx represent all possible values (permutations) provided by a 12-bit expansion. Labels are used in assembly language code as a pseudo-English construct that lets the programmer (and the assembler) define one or more lines of code associated with the label. Each ENT label has a corresponding move ("MOV") label. When an ENT entry in the jump table is processed, the corresponding MOV code is executed. For example, label ENTFFE points to label MOVFFE. MOVFFE in turn has instructions (including an MVC instruction) that move 102 bytes of pre-expanded (uncompressed) data from the first 12-bit expansion to the target address pointed to by register R3 (the target address can be supplied by the calling software). The MVC instruction is followed by a Load Address (LA) instruction, which advances the target address in register R3 by 102. The next entry, ENTFFF, points to MOVFFF, which has instructions that move 103 bytes of data from the second 12-bit expansion to the target address pointed to by R3, which has already been advanced by 102. The length of each pre-expanded data segment can vary between different modified expansion dictionaries.
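The ENT/MOV pairing can be mimicked in Python by mapping each token to a small routine that performs the move and returns the advanced target position. This is only a sketch: `make_mov`, the dictionary `ent`, and the integer `r3` are hypothetical stand-ins for the MOV code, the jump table, and register R3, and the filler bytes are invented test data.

```python
from typing import Callable, Dict


def make_mov(segment: bytes) -> Callable[[bytearray, int], int]:
    """Build a routine standing in for one MOVxxx block."""
    def mov(target: bytearray, r3: int) -> int:
        target[r3:r3 + len(segment)] = segment  # the move step
        return r3 + len(segment)                # the advance of the target address
    return mov


# Two jump-table entries with the 102- and 103-byte lengths used in the text.
ent: Dict[int, Callable[[bytearray, int], int]] = {
    0xFFE: make_mov(b"x" * 102),
    0xFFF: make_mov(b"y" * 103),
}

target = bytearray(205)
r3 = 0
for token in (0xFFE, 0xFFF):
    r3 = ent[token](target, r3)  # r3 is 102 after the first move, 205 after the second
```

Because each routine already knows its segment length, the caller never has to compute lengths during decompression; it only supplies the starting target position.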

After the assembly language instructions associated with a MOV label have executed, the program returns to the jump table for the next instruction. Instead of an MVC instruction, a pointer into the second modified decompression dictionary may be stored in the jump table. In that case the second modified decompression dictionary contains the MVC instruction, and after the MVC instruction completes, the program returns to the jump table for the next pointer. At step 275 it is determined whether all data segments have been processed. In one example, the last entry in the jump table contains a return instruction that returns to the main calling (decompression) program. If not all data segments have been processed, the process returns to step 260; otherwise, the process ends at step 280.

Several variations in generating and storing the instructions (e.g., by instruction coding module 20 at step 235) and in retrieving and executing them (e.g., by decompression module 16 at step 270) are now described. In a first variation, instead of storing the MVC instructions in the jump table, the MVC instruction and the length of the given decompressed string are stored next to each decompressed data segment in the second modified decompression dictionary (e.g., by instruction coding module 20 at step 235). Each decompressed string is then padded with spare bytes to obtain fixed-length decompressed data segments. The padding bytes are not part of the expanded string; they are used only to reach the designated fixed length and need not be filled with any particular information. Each entry in the second modified decompression dictionary can therefore include an MVC instruction, an expanded data segment, and padding (e.g., to reach a fixed length of 256 bytes).

This first variation has the advantage that fixed-length addressing can be used, which simplifies the jump logic. For example, during decompression, the 12-bit value being decompressed can be used to index directly into the decompression dictionary by multiplying it by the size of a padded entry. Once the program has branched to that location in the decompression dictionary, the data segment length is obtained and the decompressed data segment is moved for concatenation (e.g., by decompression module 16 at step 270).
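The direct indexing of this first variation can be sketched in Python as below. The entry layout is an assumption made for illustration: each 256-byte entry holds a one-byte length, then the segment, then zero padding (so segments are limited to 255 bytes here). The names `ENTRY_SIZE`, `pack_entries`, and `lookup` are hypothetical, and the stored move instruction is omitted for brevity.

```python
from typing import List

ENTRY_SIZE = 256  # fixed size of one padded dictionary entry


def pack_entries(segments: List[bytes]) -> bytes:
    """Store each segment as [length byte][segment][padding]; list index = token."""
    table = bytearray()
    for segment in segments:
        if len(segment) > ENTRY_SIZE - 1:
            raise ValueError("segment does not fit in one padded entry")
        entry = bytes([len(segment)]) + segment
        table += entry.ljust(ENTRY_SIZE, b"\x00")  # padding content is arbitrary
    return bytes(table)


def lookup(table: bytes, token: int) -> bytes:
    """Index directly by token: no search and no per-token offset table."""
    base = token * ENTRY_SIZE
    length = table[base]
    return table[base + 1:base + 1 + length]


table = pack_entries([b"AB", b"", b"HELLO"])
segment = lookup(table, 2)  # entry 2 starts at byte 512
```

The trade-off is space: every entry costs the full fixed size regardless of how short its segment is, in exchange for address arithmetic that is a single multiplication.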

In a second variation, the MVC instructions are stored in an instruction array instead of a jump table. Once each decompressed data segment is built, an MVC instruction is inserted into the instruction array (e.g., by instruction coding module 20 at step 235). When this technique is used, only MVC instructions are stored in the instruction array, and the main (calling) application branches into the MVC array. The MVC instructions in the instruction array and the data segments can be stored in separate storage sections or partitions of the second decompression dictionary. Because the instruction array and the data segments are stored separately (i.e., the instruction array is not intermixed with the data segments), the MVC instruction array can be loaded into cache memory (e.g., an instruction cache) and the data segments can be loaded into separate memory (e.g., RAM, a data cache, etc.) when loaded into memory. The instruction array technique can thus take advantage of cache structures (e.g., instruction cache, data cache, etc.). During decompression, the MVC instructions in the instruction array are executed iteratively to move the data segments for concatenation (e.g., by decompression module 16 at step 270).
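The separation of instructions from data in this second variation can be illustrated with a short Python sketch. Each "instruction" is modeled as a (source offset, length) pair kept in its own list, while the segments are packed into a separate byte string. The names `build_instruction_array` and `run` are hypothetical, and Python itself offers no control over instruction or data caches, so the sketch shows only the layout, not the cache benefit.

```python
from typing import List, Tuple


def build_instruction_array(segments: List[bytes]) -> Tuple[List[Tuple[int, int]], bytes]:
    """Keep the move descriptors and the segment bytes in two separate stores."""
    instructions: List[Tuple[int, int]] = []
    data = bytearray()
    for segment in segments:
        instructions.append((len(data), len(segment)))  # where to copy from, how much
        data += segment
    return instructions, bytes(data)


def run(tokens: List[int], instructions: List[Tuple[int, int]], data: bytes) -> bytes:
    """Execute one move per token, concatenating the segments into the output."""
    out = bytearray()
    for token in tokens:
        source, length = instructions[token]
        out += data[source:source + length]
    return bytes(out)


instructions, data = build_instruction_array([b"HELLO", b" ", b"WORLD"])
result = run([0, 1, 2, 1, 0], instructions, data)
```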

In a third variation, continuation logic is stored in the instruction array together with the MVC instructions (e.g., by instruction coding module 20 at step 235). During decompression, the MVC instructions in the array are executed for concatenation and the continuation instruction is checked (e.g., by decompression module 16 at step 270). If the continuation logic determines that further MVC instructions remain, it executes the next MVC instruction; otherwise, decompression is complete. With the examples above, decompression can be performed in a single call and is self-executing, which enables self-decompressing data segments. In other words, the algorithm combines all data segments before returning to the main calling application.
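One way to picture the continuation logic is a Python sketch in which every entry of the instruction array carries a flag saying whether another move follows. The (source, length, more) tuple layout and the name `self_decompress` are assumptions made for illustration; the text does not specify how the continuation logic is encoded.

```python
from typing import List, Tuple


def self_decompress(program: List[Tuple[int, int, bool]], data: bytes) -> bytes:
    """Run moves until an entry's continuation flag says none remain."""
    out = bytearray()
    index = 0
    while True:
        source, length, more = program[index]
        out += data[source:source + length]  # the move for this entry
        if not more:                         # continuation check
            return bytes(out)                # single return to the caller
        index += 1


data = b"HELLO WORLD"
program = [(6, 5, True), (5, 1, True), (0, 5, False)]
result = self_decompress(program, data)
```

The caller makes one call and gets the fully concatenated result back, which corresponds to the single-call behavior described above.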

In summary, in embodiments of the present invention a series of move instructions is executed iteratively until all pre-expanded data segments are concatenated, which yields the original data set that was compressed (e.g., by Ziv-Lempel compression). In the example above, several iterations were needed just to obtain the first 29-byte data segment. With embodiments of the present invention, however, those iterations need not be performed during the decompression described in connection with FIG. 3 (e.g., as performed by decompression module 16). A comparative evaluation of traditional LZ decompression and an embodiment of the present invention showed reductions of 43% in elapsed time and 81% in CPU processing overhead.

It will be appreciated that the embodiments described above and illustrated in the drawings represent only a few of the many ways of implementing embodiments for data decompression that use pre-expanded, self-decompressing strings to increase processing efficiency at step 270 during decompression.

The environment of embodiments of the present invention may include any number of computers or other processing systems (e.g., client or end-user systems 14, server systems 10, etc.) and databases or other repositories arranged in any desired fashion, and embodiments of the present invention may be applied to any desired type of computing environment (e.g., cloud computing, client-server, network computing, mainframe, stand-alone systems, etc.). The computers or other processing systems used by embodiments of the present invention may be implemented by any number of personal or other types of computers or processing systems (e.g., desktop, laptop, PDA, mobile devices, etc.), and may include any commercially available operating system and any combination of commercially available and custom software (e.g., browser software, communications software, server software, string decompression module 16, instruction coding module 20, etc.). These systems may include any type of monitor and input device (e.g., keyboard, mouse, voice recognition, etc.) to enter and/or view information.

It is to be understood that the software (e.g., string decompression module 16, instruction coding module 20, etc.) of embodiments of the present invention may be implemented in any desired computer language and could be developed by one of ordinary skill in the computer arts based on the functional descriptions contained in this specification and the flowcharts illustrated in the drawings. Further, any reference herein to software performing various functions generally refers to computer systems or processors performing those functions under software control. The computer systems of embodiments of the present invention may alternatively be implemented by any type of hardware and/or other processing circuitry.

The various functions of the computers or other processing systems may be distributed in any manner among any number of software and/or hardware modules or units, processing or computer systems and/or circuitry, where the computers or processing systems may be located locally or remotely from one another and may communicate via any suitable communication medium (e.g., LAN, WAN, intranet, Internet, hardwire, modem connection, wireless, etc.). For example, the functions of embodiments of the present invention may be distributed in any manner among the various end-user/client and server systems and/or any other intermediary processing devices. The software and/or algorithms described above and illustrated in the flowcharts may be modified in any manner that accomplishes the functions described herein. In addition, the functions in the flowcharts or description may be performed in any order that accomplishes a desired operation.

The software (e.g., string decompression module 16, instruction coding module 20, etc.) of embodiments of the present invention may be made available on a non-transitory computer-readable or computer-usable medium (e.g., magnetic or optical media, magneto-optical media, floppy diskettes, CD-ROM, DVD, memory devices, etc.) of a stationary or portable program product apparatus or device, for use with stand-alone systems or with systems connected by a network or other communication medium.

The communication network may be implemented by any number of any type of communication network (e.g., LAN, WAN, Internet, intranet, VPN, etc.). The computers or other processing systems of embodiments of the present invention may include any conventional or other communication devices to communicate over the network via any conventional or other protocols. The computers or other processing systems may use any type of connection (e.g., wired, wireless, etc.) for access to the network. Local communication media may be implemented by any suitable communication media (e.g., local area network (LAN), hardwire, wireless link, intranet, etc.).

The system may employ any number of any conventional or other databases, data stores, or storage structures (e.g., files, databases, data structures, data or other repositories, etc.) to store information (e.g., expansion dictionaries, compressed data, etc.). The database system may likewise be implemented by any number of any conventional or other databases, data stores, or storage structures (e.g., files, databases, data structures, data or other repositories, etc.) to store such information. The database system may be included within or coupled to the server and/or client systems. The database systems and/or storage structures may be remote from or local to the computers or other processing systems and may store any desired data (e.g., expansion dictionaries, compressed data, etc.).

Embodiments of the present invention may employ any number of any type of user interface (e.g., graphical user interface (GUI), command line, prompt, etc.) to obtain or provide information (e.g., pre-expansion instructions, data retrieval, reports, etc.), where the interface may include any information arranged in any fashion. The interface may include any number of any types of input or actuation mechanisms (e.g., buttons, icons, fields, boxes, links, etc.) disposed at any locations to enter/display information and initiate desired actions via any suitable input devices (e.g., mouse, keyboard, etc.). The interface screens may include any suitable actuators (e.g., links, tabs, etc.) to navigate between the screens in any fashion.

A report may include any information arranged in any fashion and may be configurable based on rules or other criteria to provide desired information to a user (e.g., compression efficiency, pre-expansion efficiency, CPU usage during full decompression, etc.).

Embodiments of the present invention are not limited to the specific tasks or algorithms described above, but may be used for pre-expansion of compressed data. The data may be compressed by any lossless compression technique, including but not limited to techniques based on Ziv-Lempel.

Any number and type of decompression (or expansion) dictionaries may be used and stored. An expansion dictionary may include any type of pre-expansion in which the underlying data is fully or partially decompressed in order to improve processing efficiency during decompression.

The instructions stored in the pre-expanded dictionary may take any form that allows concatenation of the data being expanded (e.g., move, copy, MVC instructions, etc.). A concatenation instruction may be used together with the pre-expanded string length to obtain the correct string length. A pre-expanded string may include padding to reach any desired length. The concatenation instructions, string lengths, and so on may be stored in the decompression dictionary or as a separate jump table or instruction table. The concatenation instructions may include continuation logic or other instructions that allow continued execution of instructions or any desired code branching or jumping, and may allow data expansion with a single function call or with multiple calls in order to obtain desired processing characteristics (e.g., dictionary expansion time, CPU utilization, etc.).

Compressed data entries may be of any desired length (e.g., 8 bits, 12 bits, etc.) and may include any form of addressing for expanding compressed and pre-expanded data. Expansion tokens may be of any desired length (e.g., 8 bytes, 16 bytes, etc.) and may include any desired metadata (e.g., continuation instructions, stop instructions, expansion addresses, etc.). The expanded data set may include any desired instructions that assist the decompression process (e.g., move, copy, continuation instructions, etc.). Any form of cache structure may be used or exploited to facilitate pre-expansion, expansion, and instruction storage (e.g., jump tables, instruction caches, temporary storage, instruction operands, etc.).

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises", "comprising", "includes", "including", "has", "have", "having", "with" and the like, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.

The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and hardware aspects, all of which may generally be referred to herein as a "circuit", "module" or "system". Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer-readable media having computer-readable program code embodied thereon.

Any combination of one or more computer-readable media may be used. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer-readable signal medium may be any computer-readable medium that is not a computer-readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the last scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, systems, and/or computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other devices to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in a block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.

10: Server system
14: Client system

Claims

A computer-implemented method for decompressing compressed data, the method comprising:
analyzing a first decompression dictionary, wherein the first decompression dictionary includes a plurality of chains, each chain having uncompressed data portions distributed non-contiguously within the first decompression dictionary based on an addressing scheme, and wherein the uncompressed data portions of each chain form a corresponding uncompressed version of compressed data;
generating a second decompression dictionary by combining the uncompressed data portions of each of the chains in the first decompression dictionary to form the uncompressed version of the compressed data, and inserting instructions for decompressing compressed data into the second decompression dictionary; and
decompressing the compressed data by applying the compressed data to the second decompression dictionary.

The computer-implemented method of claim 1, wherein the compressed data has been compressed according to a Ziv-Lempel compression scheme.

The computer-implemented method of claim 1, wherein the instructions in the second decompression dictionary include an instruction to decompress a single token of the compressed data in a single move.

The computer-implemented method of claim 1, wherein the instructions in the second decompression dictionary include an instruction to decompress all tokens of the compressed data in a single call.

The computer-implemented method of claim 1, wherein the instructions in the second decompression dictionary are independent of computer architecture.

The computer-implemented method of claim 1, wherein the instructions and the uncompressed versions reside together in the second decompression dictionary and utilize a cache structure.

The computer-implemented method of claim 1, wherein the instructions in the second decompression dictionary include one or more of: a move instruction that concatenates the uncompressed version of the compressed data, a length associated with the uncompressed version of the compressed data, and a continuation instruction.

A system comprising a computer system that includes at least one processor configured to:
analyze a first decompression dictionary, wherein the first decompression dictionary includes a plurality of chains, each chain having uncompressed data portions distributed non-contiguously within the first decompression dictionary based on an addressing scheme, and wherein the uncompressed data portions of each chain form a corresponding uncompressed version of compressed data;
generate a second decompression dictionary by combining the uncompressed data portions of each of the chains in the first decompression dictionary to form the uncompressed version of the compressed data, and insert instructions for decompressing compressed data into the second decompression dictionary; and
decompress the compressed data by applying the compressed data to the second decompression dictionary.

A computer program that causes a computer to execute each step of the method according to any one of claims 1 to 7.

A computer-readable storage medium on which the computer program according to claim 9 is recorded.
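For illustration only, the method of claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation described in the specification. The entry layouts (`ChainEntry`, `ExpandedEntry`), the `"MOVE"` opcode string, and the sample dictionary `FIRST` are all hypothetical. In particular, the claims recite instructions inserted into the second dictionary, whereas the sketch merely models such an instruction as a tagged record holding an opcode, a length, and the pre-expanded data.

```python
from typing import Dict, List, NamedTuple, Optional


class ChainEntry(NamedTuple):
    """Hypothetical entry of the first (chained) dictionary."""
    fragment: bytes        # one uncompressed data portion
    prev: Optional[int]    # index of the preceding entry in the chain, if any


class ExpandedEntry(NamedTuple):
    """Hypothetical entry of the second (pre-expanded) dictionary."""
    op: str                # stand-in for an embedded instruction
    length: int            # length of the uncompressed version
    data: bytes            # fully combined uncompressed version


def follow_chain(first: Dict[int, ChainEntry], token: int) -> bytes:
    """Walk one chain and combine its non-contiguous fragments."""
    parts: List[bytes] = []
    seen = set()
    index: Optional[int] = token
    while index is not None:
        if index in seen:
            raise ValueError(f"cycle in chain for token {token}")
        seen.add(index)
        entry = first[index]
        parts.append(entry.fragment)
        index = entry.prev
    # Fragments were collected from the end of the chain back to its start.
    return b"".join(reversed(parts))


def pre_expand(first: Dict[int, ChainEntry]) -> Dict[int, ExpandedEntry]:
    """Build the second dictionary once, before any data is decompressed."""
    second: Dict[int, ExpandedEntry] = {}
    for token in first:
        data = follow_chain(first, token)
        second[token] = ExpandedEntry("MOVE", len(data), data)
    return second


def decompress(tokens: List[int], second: Dict[int, ExpandedEntry]) -> bytes:
    """Apply compressed tokens to the second dictionary: one copy per token."""
    out = bytearray()
    for token in tokens:
        entry = second[token]
        if entry.op != "MOVE":
            raise ValueError(f"unsupported op {entry.op!r}")
        out += entry.data[:entry.length]
    return bytes(out)


# Sample chained dictionary: the chain for token 3 is spread over entries 0, 7, 3.
FIRST: Dict[int, ChainEntry] = {
    0: ChainEntry(b"com", None),
    7: ChainEntry(b"press", 0),
    3: ChainEntry(b"ion", 7),
    5: ChainEntry(b" ", None),
}

if __name__ == "__main__":
    second_dictionary = pre_expand(FIRST)
    print(decompress([3, 5, 7], second_dictionary))
```

In this sketch the chain walk happens once per dictionary entry in `pre_expand`, so `decompress` performs a single lookup and copy per token instead of following links for every token.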