JPH08166966A

JPH08166966A - Dictionary retrieval device, database device, character recognizing device, speech recognition device and sentence correction device

Info

Publication number: JPH08166966A
Application number: JP6311727A
Authority: JP
Inventors: Katsuki Minamino; 活樹南野
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1994-12-15
Filing date: 1994-12-15
Publication date: 1996-06-25

Abstract

PURPOSE: To retrieve, from a dictionary, a symbol string similar to an ambiguous input symbol string. CONSTITUTION: A vector conversion unit 2 converts the input symbol string into an input vector whose elements are the counts of the respective symbols it contains. A vector conversion unit 4 converts each symbol string registered in the dictionary 3 into a standard vector whose elements are the counts of the respective symbols it contains. A similarity calculation unit 5 then calculates the similarity between the input vector and each standard vector, and a symbol string similar to the input symbol string is retrieved from the dictionary 3 on the basis of that similarity.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a dictionary search device that searches a dictionary composed of a set of symbol strings for entries similar to an input symbol string, and to a database device, a character recognition device, a speech recognition device, and a sentence correction device.

[0002]

[Prior Art] For example, when a dictionary of English words, each composed of a string of symbols such as letters of the alphabet, is searched for a word that matches a given alphabet string (symbol string), the search is performed by collating the alphabet string with each word registered in the dictionary, character by character, starting from the first character.

[0003]

[Problems to Be Solved by the Invention] With the search method described above, a word that matches the alphabet string is obtained when the string has been input with the correct spelling. If the input alphabet string contains an error, however, a matching word usually cannot be obtained.

[0004] For example, suppose a dictionary consisting of the three words book, bottle, and butter is given and the alphabet string bttle is input. Because no word matching that string is registered in the dictionary, no word corresponding to the input alphabet string is obtained. Specifically, the input alphabet string bttle is collated in order from its first character. The first character, b, matches the first character of every word registered in the dictionary, but the second character, t, matches the second character of none of them. In such a case the input is therefore judged to contain an error, for example, and an error is reported.
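The failure described in this paragraph can be reproduced with a minimal sketch of exact-match lookup. The function name `exact_match` and the list `WORDS` are illustrative assumptions and do not appear in the patent; the loop only mirrors the character-by-character collation attributed to the prior art.

```python
def exact_match(query, dictionary):
    """Return the word identical to query, or None (hypothetical helper)."""
    for word in dictionary:
        # Collate character by character, starting from the first character.
        if len(word) == len(query) and all(q == w for q, w in zip(query, word)):
            return word
    return None


WORDS = ["book", "bottle", "butter"]
print(exact_match("bottle", WORDS))  # bottle
print(exact_match("bttle", WORDS))   # None
```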

[0005] Even when the input contains such an error, however, it is preferable to obtain a similar word (for example, the most similar word) as the search result rather than to report an error. This problem of searching with an ambiguous input symbol string arises frequently not only in searches of electronic dictionaries but also in, for example, the pattern matching performed in speech recognition.

[0006] The present invention has been made in view of this situation, and makes it possible to retrieve a symbol string similar to an ambiguous input symbol string.

[0007]

[Means for Solving the Problems] The dictionary search device according to claim 1 searches a dictionary composed of a set of predetermined symbol strings for entries similar to an input symbol string, and comprises: input symbol string conversion means (for example, the vector conversion unit 2 shown in FIG. 1) for converting the input symbol string into an input vector, which is a vector whose elements are the counts of the respective symbols contained in the string; dictionary symbol string conversion means (for example, the vector conversion unit 4 shown in FIG. 1) for converting each predetermined symbol string registered in the dictionary into a standard vector, which is a vector whose elements are the counts of the respective symbols contained in the string; calculation means (for example, the similarity calculation unit 5 shown in FIG. 1) for calculating a similarity representing the degree to which the input vector and a standard vector resemble each other; and similarity comparison means (for example, the similarity comparison unit 6 shown in FIG. 1) for comparing the similarities calculated by the calculation means with one another. The device is characterized in that a predetermined symbol string similar to the input symbol string is retrieved from the dictionary on the basis of the comparison result of the similarity comparison means.

[0008] The device may further comprise symbol string comparison means (for example, the similarity comparison unit 7 shown in FIG. 1) for converting a predetermined symbol string and the input symbol string into comparison symbol strings composed only of the symbols that both contain in the same number, and for comparing those comparison symbol strings. In that case, when there are a plurality of predetermined symbol strings with the same similarity, a predetermined symbol string similar to the input symbol string can be retrieved from the dictionary on the basis of the comparison result of the symbol string comparison means.

[0009] Furthermore, this dictionary search device can retrieve a predetermined number of predetermined symbol strings from the dictionary, in order from the highest similarity.

[0010] Instead of providing the dictionary symbol string conversion means, the dictionary may be configured to include the standard vectors.

[0011] Furthermore, the input symbol string may be compared with each predetermined symbol string registered in the dictionary. If a predetermined symbol string that matches the input symbol string exists, that symbol string is taken as the search result; if none exists, a predetermined symbol string similar to the input symbol string is retrieved from the dictionary on the basis of the comparison result of the similarity comparison means.

[0012] The database device according to claim 6 comprises the dictionary search device according to any one of claims 1 to 5 and a database (for example, the database 24 shown in FIG. 3) that stores information associated with the predetermined symbol strings registered in the dictionary, and is characterized by outputting the information associated with the predetermined symbol string obtained as a result of the search by the dictionary search device.

[0013] The character recognition device according to claim 7 comprises the dictionary search device according to any one of claims 1 to 5, input means (for example, the handwritten character input unit 31 shown in FIG. 4) for inputting handwritten characters, and handwritten character conversion means (for example, the character determination unit 32 shown in FIG. 4) for converting the handwritten characters input to the input means into the input symbol string, and is characterized by recognizing the handwritten characters in accordance with the predetermined symbol string obtained as a result of the search by the dictionary search device.

[0014] The speech recognition device according to claim 8 comprises the dictionary search device according to any one of claims 1 to 5, input means (for example, the voice input unit 41 shown in FIG. 5) for inputting speech, and speech conversion means (for example, the symbol string conversion unit 42 shown in FIG. 5) for converting the speech input to the input means into the input symbol string, and is characterized by recognizing the speech in accordance with the predetermined symbol string obtained as a result of the search by the dictionary search device.

[0015] The sentence correction device according to claim 9 corrects spelling errors of words in a sentence, and comprises the dictionary search device according to claim 5, supply means (for example, the control unit 52 shown in FIG. 6) for supplying a word to the dictionary search device as the input symbol string, and correction means (for example, the control unit 52 shown in FIG. 6) for correcting, when no predetermined symbol string matching the input symbol string exists, the word corresponding to the input symbol string to the word corresponding to the symbol string obtained as a result of the search by the dictionary search device.

[0016]

[Operation] In the dictionary search device according to claim 1, the input symbol string is converted into an input vector whose elements are the counts of the respective symbols it contains, and each predetermined symbol string registered in the dictionary is converted into a standard vector whose elements are the counts of the respective symbols it contains. The similarity between the input vector and each standard vector is then calculated, and the calculated similarities are compared with one another. A symbol string similar to the input symbol string is then retrieved from the dictionary on the basis of the comparison result. Consequently, even if the input symbol string is ambiguous or erroneous, the symbol string to which it is considered to correspond can be obtained.

[0017] In the database device according to claim 6, the information associated with the predetermined symbol string obtained as a result of the search by the dictionary search device is output. Consequently, even for an ambiguous or erroneous input symbol string, the information associated with the entry to which that string is considered to correspond can be obtained.

[0018] In the character recognition device according to claim 7, handwritten characters are recognized in accordance with the predetermined symbol string obtained as a result of the search by the dictionary search device. The recognition rate can therefore be improved.

[0019] In the speech recognition device according to claim 8, speech is recognized in accordance with the predetermined symbol string obtained as a result of the search by the dictionary search device. The speech recognition rate can therefore be improved.

[0020] In the sentence correction device according to claim 9, when no predetermined symbol string matching the input symbol string exists, the word corresponding to the input symbol string is corrected to the word corresponding to the predetermined symbol string obtained as a result of the search by the dictionary search device. A misspelled word can therefore be corrected to a correctly spelled word.

[0021]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings. As preparation, the principle of the present invention is explained first.

[0022] First, let the set of symbols from which symbol strings are formed be denoted $C = \{c_1, c_2, \ldots, c_n\}$, where $n$ is the number of distinct symbols. For example, if the symbols are the letters of the alphabet and uppercase and lowercase are not distinguished, the symbol set is $C = \{a, b, c, d, \ldots, x, y, z\}$ and the number of distinct symbols is $n = 26$. If hyphens, apostrophes, and the like are also to be treated as symbols, the elements of the symbol set and the number $n$ are increased accordingly. In any case, $n$ is finite.

[0023] As another example, for Japanese written in Hepburn romanization, the symbol set is $C = \{\mathrm{a}, \mathrm{i}, \mathrm{u}, \mathrm{e}, \mathrm{o}, \mathrm{ka}, \mathrm{ki}, \mathrm{ku}, \mathrm{ke}, \mathrm{ko}, \mathrm{sa}, \mathrm{shi}, \mathrm{su}, \mathrm{se}, \mathrm{so}, \ldots\}$. This way of writing the symbols is described in, for example, 「日本語教育辞典」 (Dictionary of Japanese Language Education), Taishukan Shoten (大修館書店), pp. 506 to 508.

[0024] A symbol string formed by combining and arranging symbols from such a finite symbol set $C$ is called a word. If $C$ is the alphabet as described above, for example, book is a word.

[0025] In the present invention, a set of such words arranged in list form is used as the dictionary. If the symbol set $C$ is the alphabet, for example, a list of words such as book, bottle, and butter is a dictionary. The dictionary $W$ given for the symbol set $C$ is denoted $W = \{w_1, w_2, \ldots, w_m\}$, where $w_1, w_2, \ldots$ are words composed of symbols in $C$ and $m$ is the number of words registered in the dictionary $W$.

[0026] Now consider, given an input symbol string $x$, searching the dictionary $W$ for a word $w_i$ similar to $x$. Let $d(x, w_i)$ denote the similarity, that is, the degree to which the input symbol string $x$ and the word $w_i$ resemble each other, and suppose that a smaller $d(x, w_i)$ means the two are more similar. The problem of finding the word $w_i$ most similar to $x$ is then equivalent to finding

$$\min\{d(x, w_i)\} \quad (1 \le i \le m) \tag{1}$$

where $\min\{\,\}$ denotes the minimum of the values in the braces.

[0027] If instead a larger $d(x, w_i)$ is taken to mean that $x$ and $w_i$ are more similar, the search problem becomes that of solving $\max\{d(x, w_i)\}$ $(1 \le i \le m)$, where $\max\{\,\}$ denotes the maximum of the values in the braces. In what follows, the statement that $x$ and $w_i$ are similar is expressed by saying that the similarity is high.

[0028] In solving such a problem, the way the similarity $d(x, w_i)$ is defined is important. Here, a vector (standard vector) $v(w_i)$ is first defined for each word $w_i$, whose elements are the counts of the respective symbols contained in the word. Specifically, if the symbol set $C$ is the 26-letter alphabet described above, a 26-dimensional standard vector is defined for the word $w_i$ as follows. For example, the word book consists of one b, one k, and two o's, so its standard vector is

$$v(\mathrm{book}) = [0\ 1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 1\ 0\ 0\ 0\ 2\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0]^T$$

where $T$ denotes transposition.

[0029] The elements of the standard vector $v(\mathrm{book})$ (and likewise of the input vector described below) represent, from left to right, the number of a's, the number of b's, the number of c's, and so on up to the number of z's. The standard vector of a word given by a set $C$ of $n$ kinds of symbols is therefore an $n$-dimensional vector.
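The conversion from a word to its symbol-count vector can be sketched as follows. This is a minimal illustration that assumes the 26 lowercase letters as the symbol set; `ALPHABET` and `to_vector` are hypothetical names, not taken from the patent.

```python
import string

ALPHABET = string.ascii_lowercase  # assumed symbol set C with n = 26


def to_vector(symbols, alphabet=ALPHABET):
    """Count each symbol of the set; element i is the count of alphabet[i]."""
    position = {c: i for i, c in enumerate(alphabet)}
    vector = [0] * len(alphabet)
    for s in symbols:
        vector[position[s]] += 1  # raises KeyError for symbols outside the set
    return vector


v_book = to_vector("book")
print(len(v_book))  # 26
print(v_book[ALPHABET.index("b")], v_book[ALPHABET.index("k")], v_book[ALPHABET.index("o")])  # 1 1 2
```

To admit hyphens or apostrophes under this sketch, a longer `alphabet` string would be passed in, which increases the vector dimension accordingly.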

[0030] Similarly, a vector (input vector) $v(x)$ is defined for the input symbol string $x$, whose elements are the counts of the respective symbols contained in the string. For example, if the input symbol string is bttle, that is, bottle with the o missing, it consists of one b, one e, one l, and two t's, so the input vector is

$$v(\mathrm{bttle}) = [0\ 1\ 0\ 0\ 1\ 0\ 0\ 0\ 0\ 0\ 0\ 1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 2\ 0\ 0\ 0\ 0\ 0\ 0]^T$$

[0031] With the input vector and the standard vector defined in this way, the similarity $d(x, w_i)$ between $x$ and $w_i$ can be defined, for example, as the norm of the difference between the input vector and the standard vector:

$$d(x, w_i) = [v(x) - v(w_i)]^T [v(x) - v(w_i)] \tag{2}$$

[0032] The value $d(x, w_i)$ defined by the above equation is a non-negative real number, and the smaller it is, the more similar $x$ and $w_i$ are.
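Equation (2) is the squared Euclidean length of the difference between the two count vectors. A minimal sketch, assuming `collections.Counter` as a sparse stand-in for the $n$-dimensional vectors (the function name `similarity` is an illustrative choice):

```python
from collections import Counter


def similarity(x, w):
    """d(x, w) of equation (2): squared length of v(x) - v(w)."""
    cx, cw = Counter(x), Counter(w)
    # Symbols absent from both strings contribute zero and can be skipped.
    return sum((cx[c] - cw[c]) ** 2 for c in set(cx) | set(cw))


print(similarity("bttle", "bottle"))   # 1
print(similarity("bottle", "bottle"))  # 0
```

Because only symbol counts enter the calculation, two strings that are rearrangements of each other always give the value 0 under this definition.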

[0033] When such a similarity $d(x, w_i)$ is used, a symbol string (word) similar to the input symbol string $x$ can be retrieved from the dictionary $W$ by solving expression (1). For example, suppose the symbol string bttle is input against a dictionary consisting of the words book, bottle, and butter. Obtaining the input vector and the standard vectors as described above and calculating the similarity between the input symbol string and each of the three words in accordance with equation (2) gives

$$d(\mathrm{bttle}, \mathrm{book}) = 11, \quad d(\mathrm{bttle}, \mathrm{bottle}) = 1, \quad d(\mathrm{bttle}, \mathrm{butter}) = 3.$$

The word most similar to the input symbol string bttle is therefore bottle, which has the smallest similarity value. When this principle is used, it is possible to output only the word with the smallest similarity value as the search result, as described above, but it is also possible, for example, to output a predetermined number of words in ascending order of the similarity value.
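The worked example can be checked with a short sketch. The names `similarity`, `search`, and `W` are assumptions made for illustration; `k` plays the role of the "predetermined number" of words to return.

```python
from collections import Counter


def similarity(x, w):
    """d(x, w) of equation (2), computed on symbol counts."""
    cx, cw = Counter(x), Counter(w)
    return sum((cx[c] - cw[c]) ** 2 for c in set(cx) | set(cw))


def search(x, dictionary, k=1):
    """Return the k words with the smallest d(x, w), most similar first."""
    return sorted(dictionary, key=lambda w: similarity(x, w))[:k]


W = ["book", "bottle", "butter"]
print({w: similarity("bttle", w) for w in W})  # {'book': 11, 'bottle': 1, 'butter': 3}
print(search("bttle", W))       # ['bottle']
print(search("bttle", W, k=2))  # ['bottle', 'butter']
```

Python's `sorted` is stable, so in this sketch words with equal values simply keep their dictionary order.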

[0034] FIG. 1 shows the configuration of an embodiment of a dictionary search device that searches a dictionary 3 on the basis of the above principle.

[0035] The dictionary 3 stores a set of words given by a certain symbol set $C$. When a symbol string $x$ composed of symbols in $C$ is input to the input unit 1, the input unit 1 outputs $x$ to the vector conversion unit 2.

[0036] The vector conversion unit 2 converts the input symbol string $x$ into the input vector $v(x)$ described above and outputs it to the similarity calculation unit 5. When the input vector $v(x)$ has been supplied to the similarity calculation unit 5, the vector conversion unit 4 reads out the words $w_i$ registered in the dictionary 3 one after another, converts each into the standard vector $v(w_i)$ described above, and supplies it to the similarity calculation unit 5.

[0037] The similarity calculation unit 5 calculates the similarity $d(x, w_i)$ $(1 \le i \le m)$ from the input vector $v(x)$ and the standard vectors $v(w_i)$ in accordance with equation (2), and outputs the results to the similarity comparison unit 6.

[0038] The similarity comparison unit 6 compares the similarities $d(x, w_i)$ supplied from the similarity calculation unit 5 with one another; the word that gives the smallest value is retrieved from the dictionary 3 and supplied to the search result output unit 8.

[0039] The search result output unit 8 outputs the word supplied from the similarity comparison unit 6 (for example, by displaying it or by outputting it as speech). The similarity comparison unit 6 may instead retrieve from the dictionary 3 a predetermined number of words in ascending order of the similarity value and supply them to the search result output unit 8. In that case, the search result output unit 8 outputs the predetermined number of words in order of their resemblance to the input symbol string $x$.

[0040] In the dictionary search device described above, the similarity to the input symbol string $x$ may be calculated for all the words $w_i$ $(1 \le i \le m)$ registered in the dictionary 3 before the similarity comparison unit 6 compares the similarities. Alternatively, each time the similarity calculation unit 5 calculates the similarity for one word, the similarity comparison unit 6 may compare that similarity with those already calculated.
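The two orderings of calculation and comparison can be contrasted in a small sketch. Both functions are hypothetical illustrations and return the same word; the second keeps only a running best value instead of storing every similarity.

```python
from collections import Counter


def similarity(x, w):
    """d(x, w) of equation (2), computed on symbol counts."""
    cx, cw = Counter(x), Counter(w)
    return sum((cx[c] - cw[c]) ** 2 for c in set(cx) | set(cw))


def search_batch(x, dictionary):
    scores = [similarity(x, w) for w in dictionary]  # calculate everything first
    return dictionary[scores.index(min(scores))]     # then compare


def search_incremental(x, dictionary):
    best_word, best_score = None, None
    for w in dictionary:
        score = similarity(x, w)
        # Compare each new value with the best one seen so far.
        if best_score is None or score < best_score:
            best_word, best_score = w, score
    return best_word


W = ["book", "bottle", "butter"]
print(search_batch("bttle", W), search_incremental("bttle", W))  # bottle bottle
```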

[0041] The similarity calculation in the similarity calculation unit 5 may be performed with a single processor, but it may also be performed in parallel with, for example, a plurality of processors.
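Because each $d(x, w_i)$ depends only on $x$ and one word, the calculations are independent and can be distributed. The sketch below uses a thread pool from the Python standard library purely to show the structure; this is an assumption for illustration, and in standard CPython a process pool or dedicated hardware would be needed for an actual speed-up on this kind of computation.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def similarity(x, w):
    """d(x, w) of equation (2), computed on symbol counts."""
    cx, cw = Counter(x), Counter(w)
    return sum((cx[c] - cw[c]) ** 2 for c in set(cx) | set(cw))


def parallel_similarities(x, dictionary, workers=4):
    """Compute d(x, w) for every word, one task per word."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the order of the input words.
        return list(pool.map(lambda w: similarity(x, w), dictionary))


print(parallel_similarities("bttle", ["book", "bottle", "butter"]))  # [11, 1, 3]
```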

[0042] The number of words for which the similarity is calculated may also be limited by organizing the standard vectors corresponding to the words registered in the dictionary 3 into a tree structure.

[0043] Furthermore, the dictionary 3 may register not only the words but also the standard vectors obtained by converting them. In that case the vector conversion unit 4 is no longer needed, so the device can be made smaller and the processing faster.
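Storing the standard vectors alongside the words amounts to doing the conversion once, when the dictionary is built. A minimal sketch, in which `build_dictionary`, `search`, and `ENTRIES` are hypothetical names and `Counter` again stands in for the count vector:

```python
from collections import Counter


def build_dictionary(words):
    """Store each word together with its precomputed symbol-count vector."""
    return [(w, Counter(w)) for w in words]


def search(x, entries):
    cx = Counter(x)  # only the input needs converting at query time

    def d(cw):
        return sum((cx[c] - cw[c]) ** 2 for c in set(cx) | set(cw))

    return min(entries, key=lambda entry: d(entry[1]))[0]


ENTRIES = build_dictionary(["book", "bottle", "butter"])
print(search("bttle", ENTRIES))  # bottle
```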

[0044] The input unit 1 may also be made to compare an input symbol string $x$, when one is input, with each of the words registered in the dictionary 3. If a word that matches $x$ exists, that word is output immediately from the search result output unit 8 as the search result; only if no matching word exists is the search for words similar to $x$ performed as described above. The comparison between the input symbol string $x$ and the words registered in the dictionary 3 may proceed in order from the first character of each, or may use some other method.
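The exact-match shortcut can be sketched as a two-stage lookup. The function `lookup` and its returned flag are illustrative assumptions; Python's `in` operator stands in for whichever exact comparison method is used.

```python
from collections import Counter


def similarity(x, w):
    """d(x, w) of equation (2), computed on symbol counts."""
    cx, cw = Counter(x), Counter(w)
    return sum((cx[c] - cw[c]) ** 2 for c in set(cx) | set(cw))


def lookup(x, dictionary):
    """Return (word, exact): an exact hit if present, else the most similar word."""
    if x in dictionary:  # exact comparison comes first
        return x, True
    return min(dictionary, key=lambda w: similarity(x, w)), False


W = ["book", "bottle", "butter"]
print(lookup("butter", W))  # ('butter', True)
print(lookup("bttle", W))   # ('bottle', False)
```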

[0045] As described above, the dictionary search device of FIG. 1 can retrieve the word corresponding to an input symbol string even when the string is ambiguous or contains errors, for example when extra symbols have been inserted into it or symbols that should be present are missing.

[0046] Even when it is desired that the search result output unit 8 output only the single word with the highest similarity, there may be a plurality of words that share the minimum similarity value. Likewise, even when it is desired that words be output in order of similarity, a plurality of words may have the same similarity. Therefore, when the comparison in the similarity comparison unit 6 finds a plurality of words with the same similarity, the similarity comparison unit 7 in FIG. 1 compares those words further and ranks them according to which is more similar to the input symbol string $x$.

[0047] FIG. 2 shows a detailed configuration example of the similarity comparison unit 7. As shown in the figure, the similarity comparison unit 7 comprises a vector element comparison unit 11, a symbol removal unit 12, a symbol match count calculation unit 13, and a symbol match count comparison unit 14. It receives the input symbol string $x$ and its input vector $v(x)$, together with every word $w_j$ that had the same similarity and its standard vector $v(w_j)$ (where $1 \le j \le m$).

[0048] The element comparison unit 11 detects the symbols that the input symbol string $x$ and the word $w_j$ both contain in the same number. It does so by comparing the elements of the input vector $v(x)$ with the elements of the standard vector $v(w_j)$: the symbols corresponding to elements whose values match are detected as the symbols contained in the same number in both the input symbol string and the word.

[0049] These symbols (hereinafter called same symbols where appropriate) are supplied from the element comparison unit 11 to the symbol removal unit 12 together with the input symbol string $x$ and the word $w_j$.

[0050] The symbol removal unit 12 deletes every symbol other than the same symbols from both the input symbol string $x$ and the word $w_j$. The input symbol string $x$ and the word $w_j$ are thereby converted into symbol strings composed only of the symbols that both contain in the same number.

[0051] Hereinafter, the symbol strings obtained by converting the input symbol string $x$ and the word $w_j$ in this way are called the converted input symbol string and the converted word, respectively.

[0052] Because the converted input symbol string and the converted word are composed of the same symbols used the same number of times, their lengths (numbers of symbols) are also the same.
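The detection of same symbols and the removal step can be sketched together. The function names `same_symbols` and `convert` are hypothetical; the sketch restricts itself to symbols that actually occur, since elements that are zero in both vectors correspond to no symbol in either string.

```python
from collections import Counter


def same_symbols(x, w):
    """Symbols contained in x and in w the same (non-zero) number of times."""
    cx, cw = Counter(x), Counter(w)
    return {c for c in cx if cx[c] == cw[c]}


def convert(x, w):
    """Delete every symbol that is not a same symbol from both strings."""
    keep = same_symbols(x, w)
    return ("".join(c for c in x if c in keep),
            "".join(c for c in w if c in keep))


print(convert("bttle", "bottle"))  # ('bttle', 'bttle')
print(convert("bttle", "butter"))  # ('btte', 'btte')
```

In both calls the two converted strings have equal length, as the paragraph states.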

[0053] The converted input symbol string and the converted word obtained in this way by the symbol removal unit 12 are output to the calculation unit 13. The calculation unit 13 compares the converted input symbol string with the converted word and counts the symbols that are identical and also occupy the same position. This count (hereinafter called the symbol match count where appropriate) is output to the comparison unit 14.

[0054] When the comparison unit 14 has received the symbol match counts for all the words having the same similarity, it compares them and assigns higher similarities in descending order of the count. Thereafter, as in the similarity comparison unit 6, the words are supplied to the search result output unit 8 according to the similarities thus obtained.

[0055] Suppose, for example, that the dictionary 3 stores at least the two words bore and robe, and that bor is input as the input symbol string x. The similarities calculated for bore and robe according to equation (2) then have the same value. Comparing the elements of the input vector corresponding to bor with those of the standard vector corresponding to the word bore: bor contains one each of b, o, and r, and bore contains one each of b, o, r, and e, so the symbols corresponding to the elements with equal values are b, o, and r. When the input symbol string bor and the word bore are converted so as to consist only of the symbols b, o, and r, both become bor. In the converted symbol strings, therefore, three symbols (b, o, and r) are identical and in the same position. For the input vector and the standard vector corresponding to the input symbol string bor and the word robe, on the other hand, the symbols corresponding to the elements with equal values are again b, o, and r. When the input symbol string bor and the word robe are converted so as to consist only of these symbols, they become bor and rob, respectively. In these symbol strings, only one symbol, o, is identical and in the same position. Consequently, the word bore is judged to have a higher similarity to the input symbol string bor than the word robe.
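The tie-breaking steps described for units 11 to 13 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical and do not appear in the specification, and symbols are assumed to be single characters.

```python
from collections import Counter

def same_symbols(x, w):
    """Element comparison (unit 11): symbols that occur the same
    nonzero number of times in both strings."""
    cx, cw = Counter(x), Counter(w)
    return {s for s in cx if cx[s] == cw[s]}

def keep_only(s, symbols):
    """Symbol removal (unit 12): delete every symbol not in `symbols`."""
    return "".join(ch for ch in s if ch in symbols)

def symbol_match_count(x, w):
    """Calculation (unit 13): count positions at which the converted
    input symbol string and the converted word hold the same symbol."""
    same = same_symbols(x, w)
    converted_x = keep_only(x, same)
    converted_w = keep_only(w, same)
    return sum(a == b for a, b in zip(converted_x, converted_w))

print(symbol_match_count("bor", "bore"))  # 3
print(symbol_match_count("bor", "robe"))  # 1
```

With the words from the example above, bore receives the larger symbol match count and would therefore be ranked ahead of robe.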

[0056] Similarly, for the input symbol string buttle, the word bottle and the word butter have the same similarity. The symbols contained in the same number in both the input symbol string buttle and the word bottle are b, e, l, and t. When the input symbol string buttle and the word bottle are converted so as to consist only of these symbols, both become bttle, so the number of symbols that are identical and in the same position (order) is 5. On the other hand, for the input symbol string bottle and the word butter, the symbols contained in the same number in both are b, e, and t; when the input symbol string bottle and the word butter are converted so as to consist only of these symbols, both become btte, so the number of symbols that are identical and in the same position is 4. Consequently, the word bottle, which has the larger symbol match count, is judged to be more similar to the input symbol string buttle than butter.

[0057] By having the similarity comparison unit 7 perform the processing described above, the similarities of words to the input symbol string x can be distinguished more precisely.

[0058] In the configuration shown in FIG. 2, the calculation unit 13 obtains the symbol match counts for all the words having the same similarity, and the comparison unit 14 then compares those counts. Alternatively, each time the calculation unit 13 calculates the symbol match count for a word, the comparison unit 14 may compare that count with the symbol match counts already calculated.
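The alternative in which each symbol match count is compared as soon as it is calculated amounts to keeping a running best. A minimal sketch, assuming the counts arrive one at a time as (word, count) pairs; the function name and the pair format are illustrative assumptions, not part of the specification:

```python
def best_by_running_comparison(scored_words):
    """Compare each newly calculated symbol match count with the best
    count seen so far, instead of collecting all counts first."""
    best_word, best_count = None, -1
    for word, count in scored_words:
        if count > best_count:  # ties keep the earlier word
            best_word, best_count = word, count
    return best_word, best_count

# Counts taken from the bor / bore / robe example.
print(best_by_running_comparison([("robe", 1), ("bore", 3)]))
```

This variant needs to store only one candidate at a time, at the cost of producing just the top-ranked word rather than a full ordering.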

[0059] The dictionary search device of FIG. 1 can also be configured without the similarity comparison unit 7.

[0060] Next, FIG. 3 shows the configuration of an embodiment of the database device of the present invention. The input unit 21 is composed of, for example, a keyboard and, when operated by an operator, outputs an input symbol string x corresponding to that operation to the dictionary search device 22. The dictionary search device 22 corresponds to the portion enclosed by the dotted line in FIG. 1. The database 24 stores information associated with the words registered in the dictionary 3 of the dictionary search device 22. For example, when English words are registered in the dictionary 3, the database 24 stores, as information associated with each English word, the meaning of the word, constructions using the word, and the like. The information stored in the database 24 is read out in accordance with the search result of the dictionary search device 22 and supplied to the output unit 23 via the dictionary search device 22. The output unit 23 is composed of, for example, a display and displays the information supplied from the database 24 via the dictionary search device 22.

[0061] Next, its operation will be described. When the input unit 21 is operated and a symbol string x is thereby input, the input symbol string x is supplied to the dictionary search device 22. The dictionary search device 22 searches the dictionary 3, as described above, for the word most similar to the input symbol string x. This word is supplied to the database 24, from which the information associated with the word is read out. The information read from the database 24 is supplied via the dictionary search device 22 to the output unit 23 and output from there.

[0062] If the information output from the output unit 23 is not the desired information, the user can operate the input unit 21 so that the information of the next candidate is output. When the input unit 21 is operated in this way, the dictionary search device 22 outputs the word with the next highest similarity to the database 24. The database 24 retrieves the information associated with that word and supplies it to the output unit 23 via the dictionary search device 22. In this case, therefore, the information associated with the word with the next highest similarity is output.
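The lookup with next-candidate operation can be sketched as a generator that yields words in descending order of similarity together with their associated information. The sketch below makes several assumptions that are not in the text: cosine similarity of the symbol-count vectors stands in for the similarity measure, the database is a plain mapping from word to information, non-empty strings are assumed, and the entries are invented for illustration.

```python
import math
from collections import Counter

def similarity(x, w):
    """Assumed measure: cosine similarity of symbol-count vectors."""
    vx, vw = Counter(x), Counter(w)
    dot = sum(vx[s] * vw[s] for s in vx)
    norm_x = math.sqrt(sum(v * v for v in vx.values()))
    norm_w = math.sqrt(sum(v * v for v in vw.values()))
    return dot / (norm_x * norm_w)

def candidates(x, database):
    """Yield (word, info) pairs, most similar word first; each further
    next() call corresponds to a 'next candidate' operation."""
    ranked = sorted(database, key=lambda w: similarity(x, w), reverse=True)
    for word in ranked:
        yield word, database[word]

# Hypothetical database contents.
database = {
    "bottle": "a container for liquids",
    "butter": "a dairy product",
    "apple": "a fruit",
}
lookup = candidates("botle", database)
print(next(lookup))  # first candidate
print(next(lookup))  # next candidate, requested by the user
```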

[0063] The input unit 21 is not limited to a keyboard and may be, for example, a pointing device such as a mouse or a joystick. In that case, by displaying a cursor together with symbol strings on the output unit 23, the user can select and input a symbol string by operating the pointing device to move the cursor.

[0064] Likewise, the output unit 23 is not limited to a display. When the information supplied from the database 24 is, for example, speech, music, or sound effects, the output unit 23 can be a speaker or the like.

[0065] Furthermore, as described above, when English words are registered in the dictionary 3 and the meanings of those words, constructions using them, and the like are stored in the database 24, the database device of FIG. 3 functions as an electronic English dictionary, so to speak. If information on another language is stored in the dictionary 3 and the database 24, the database device functions as an electronic dictionary for that language.

[0066] Next, FIG. 4 shows the configuration of an embodiment of the character recognition device of the present invention. Parts corresponding to those in FIG. 3 are given the same reference numerals. In this character recognition device, handwritten characters are recognized, for example, word by word, converted into, for example, printed type, and displayed.

[0067] The handwritten character input unit 31 is composed of, for example, an input pen and a tablet. When handwritten characters are drawn on the tablet with the input pen, the character trajectory is output to the character discrimination unit 32. The character discrimination unit 32 converts the character trajectory into a symbol string and outputs it to the dictionary search device 22. That is, the character discrimination unit 32 identifies each handwritten character and converts it into the corresponding symbol. When it receives a signal indicating the end of character input from the handwritten character input unit 31, it supplies the symbols converted so far to the dictionary search device 22 as the input symbol string. The symbol string corresponding to the handwritten word is thus input to the dictionary search device 22.

[0068] In this case, the dictionary 3 (FIG. 1) of the dictionary search device 22 stores the words that are expected to be input by handwriting. The dictionary search device 22 searches the dictionary 3 for the word corresponding to the symbol string most similar to the input symbol string supplied from the character discrimination unit 32, and supplies it to the output unit 23. The output unit 23 displays the word supplied from the dictionary search device 22 on its screen. In this way, the output unit 23 displays the printed type corresponding to the handwritten word.

[0069] As described above, this character recognition device obtains the word most similar to the handwritten word. Even if the handwritten input is erroneous or ambiguous, the correct word for that input can therefore be obtained.

[0070] Such a character recognition device can be applied, for example, to devices that recognize handwritten input word by word, such as word processors and electronic organizers.

[0071] In the case described above, the word most similar to the input symbol string is output. Alternatively, a plurality of words may be displayed in descending order of similarity so that the user can select the correct one from among them.

[0072] Furthermore, although handwritten input is processed word by word in the case described above, handwritten input in other units, such as phrases, can also be processed.

[0073] Next, FIG. 5 shows the configuration of an embodiment of the speech recognition device of the present invention. Parts corresponding to those in FIG. 3 are given the same reference numerals. The voice input unit 41 is composed of, for example, a microphone and an A/D converter. There, the input speech is converted into an electrical speech signal, which is A/D converted and output to the symbol string conversion unit 42. The symbol string conversion unit 42 performs acoustic analysis on the speech signal and thereby extracts feature parameters of the speech.

[0074] The symbol string conversion unit 42 performs acoustic analysis such as linear prediction analysis, analysis by a filter bank made up of a group of band-pass filters, power calculation, and calculation of the number of zero crossings.
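Two of the analyses named here, power calculation and zero-crossing counting, are simple enough to sketch directly. The sketch assumes a frame is a non-empty list of signed sample values, that power means the mean squared amplitude, and that a zero crossing is a sign change between adjacent samples with zero treated as non-negative. These conventions and the function names are assumptions for illustration and are not specified in the text.

```python
def frame_power(samples):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in samples) / len(samples)

def zero_crossings(samples):
    """Number of sign changes between adjacent samples
    (zero is treated as non-negative)."""
    return sum(
        1
        for a, b in zip(samples, samples[1:])
        if (a >= 0) != (b >= 0)
    )

frame = [1.0, -1.0, 1.0, -1.0]
print(frame_power(frame))     # 1.0
print(zero_crossings(frame))  # 3
```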

[0075] Furthermore, in the symbol string conversion unit 42, the feature parameters obtained by the acoustic analysis are converted, for example, into the symbol with the highest likelihood (probability of occurrence) based on a probability model such as that of the HMM (Hidden Markov Models) method. Alternatively, the symbol string conversion unit 42 vector-quantizes the feature parameters, for example, and converts them into predetermined codes (symbols).
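The vector quantization alternative can be illustrated with a nearest-codeword lookup: each feature vector is replaced by the symbol of the closest entry in a codebook. The codebook, the feature values, the use of squared Euclidean distance, and the function names below are all hypothetical choices made for this sketch; the text does not specify them.

```python
def quantize(feature, codebook):
    """Return the symbol whose codeword is nearest to `feature`
    in squared Euclidean distance."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(codebook, key=lambda sym: dist2(feature, codebook[sym]))

def to_symbol_string(features, codebook):
    """Convert a sequence of feature vectors into a symbol string."""
    return "".join(quantize(f, codebook) for f in features)

# Hypothetical two-dimensional codebook: symbol -> codeword.
codebook = {"a": (0.0, 0.0), "b": (1.0, 1.0), "c": (0.0, 2.0)}
features = [(0.1, 0.2), (0.9, 1.2), (0.2, 1.9)]
print(to_symbol_string(features, codebook))  # abc
```

The resulting symbol string plays the role of the input symbol string that is passed on for dictionary search.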

[0076] The symbol string corresponding to the speech, obtained in this way by the symbol string conversion unit 42, is supplied to the dictionary search device 22 as the input symbol string. Here, speech is input to the voice input unit 41 in units of, for example, words. The dictionary search device 22 is therefore supplied with symbol strings corresponding to word-unit speech.

[0077] In the symbol string conversion unit 42, symbols that represent features of speech can be used, such as letters of the alphabet, romaji, phonetic symbols, phonological symbols (symbols representing phonological units), and phoneme-unit symbols (symbols representing phonemes). In this case, the dictionary 3 of the dictionary search device 22 stores the standard vocabulary to be recognized. Moreover, the vector conversion unit 4 first converts the vocabulary registered in the dictionary 3 into symbol strings corresponding to the symbol strings output by the symbol string conversion unit 42, and then converts them into standard vectors.

[0078] The dictionary search device 22 searches the dictionary 3 for the word (vocabulary item) corresponding to the symbol string most similar to the input symbol string supplied from the symbol string conversion unit 42, and supplies it to the output unit 23. The output unit 23 displays the word supplied from the dictionary search device 22 as the recognition result for the input speech.

[0079] Because this speech recognition device searches for a symbol string similar to the input symbol string, a correct recognition result can be obtained even for ambiguous speech input.

[0080] Such a speech recognition device can be applied, for example, to cases where a system is operated by voice.

[0081] In the speech recognition device described above, the symbol string most similar to the input symbol string may be retrieved and the corresponding word output as the recognition result. Alternatively, the words corresponding to a predetermined number of symbol strings may be output in descending order of similarity so that the user can select the correct recognition result from among them. Speech can also be input in units other than words.

[0082] Next, FIG. 6 shows the configuration of an embodiment of the sentence correction device of the present invention. Parts corresponding to those in FIG. 3 are given the same reference numerals. This sentence correction device detects errors in text data and corrects them. Here, English text composed of letters of the alphabet is assumed to be input as the text data, and words are assumed to be separated from one another by spaces, commas, periods, or the like.

[0083] The text data to be corrected is input to the input unit 51. The text data input to the input unit 51 is supplied via the control unit 52 to the text data storage unit 53 and stored there. When all of the text data has been stored in the text data storage unit 53, the control unit 52 reads the text data from the text data storage unit 53 sequentially, from the beginning, in predetermined units, and supplies it to the output unit 23 for display. The output unit 23 thus displays the text data sequentially, from the beginning, one predetermined unit at a time.

[0084] Furthermore, after outputting a predetermined unit of text data to the output unit 23, the control unit 52 extracts the words that make up that text data one after another. The words are extracted by detecting the spaces, commas, periods, and the like in the text.
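Word extraction by delimiter detection can be sketched with a regular-expression split. The sketch assumes that spaces, commas, and periods are the only delimiters (the text leaves the full set open), and the function name and sample sentence are illustrative.

```python
import re

def extract_words(text):
    """Split text into words at runs of spaces, commas, and periods,
    dropping the empty strings that appear at the edges."""
    return [w for w in re.split(r"[ ,.]+", text) if w]

print(extract_words("I have a bttle, of water."))
# ['I', 'have', 'a', 'bttle', 'of', 'water']
```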

[0085] The control unit 52 supplies each extracted word to the dictionary search device 22. English words are assumed to be registered in the dictionary 3 of the dictionary search device 22. In the dictionary search device 22 (FIG. 1), the input unit 1 first searches the dictionary 3 for the word supplied from the control unit 52. If a word matching the supplied word is found in the dictionary 3, the word contains no error, and the dictionary search device 22 notifies the control unit 52 to that effect. If no matching word can be found in the dictionary 3, the blocks from the vector conversion unit 2 onward search the dictionary 3 for words similar to it, and a predetermined number of words, in descending order of similarity, are supplied to the control unit 52 as the search result.
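The two-stage check described here, exact match first and similarity search only on a miss, can be sketched as follows. As assumptions for illustration: cosine similarity of the symbol-count vectors stands in for the similarity measure, `None` signals that the word contains no error, and the function names, the parameter `n`, and the small dictionary are hypothetical.

```python
import math
from collections import Counter

def similarity(x, w):
    """Assumed measure: cosine similarity of symbol-count vectors."""
    vx, vw = Counter(x), Counter(w)
    dot = sum(vx[s] * vw[s] for s in vx)
    norm_x = math.sqrt(sum(v * v for v in vx.values()))
    norm_w = math.sqrt(sum(v * v for v in vw.values()))
    return dot / (norm_x * norm_w)

def check_word(word, dictionary, n=3):
    """Return None if the word is registered; otherwise return up to
    n dictionary words in descending order of similarity."""
    if word in dictionary:
        return None  # exact match: no error
    ranked = sorted(dictionary, key=lambda w: similarity(word, w), reverse=True)
    return ranked[:n]

dictionary = ["bottle", "butter", "apple"]
print(check_word("bottle", dictionary))      # None
print(check_word("bttle", dictionary, n=2))  # ['bottle', 'butter']
```

Doing the exact-match test first avoids the vector comparison entirely for correctly spelled words, which are expected to be the common case.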

[0086] After displaying a predetermined unit of text data on the output unit 23, when the control unit 52 is notified by the dictionary search device 22 that none of the words in the displayed text contained an error, it reads the next predetermined unit of text data from the text data storage unit 53 and displays it on the output unit 23. When the control unit 52 receives a plurality of words from the dictionary search device 22 as a search result, that is, when a word is misspelled and words similar to it have been supplied, it warns that the word is incorrect and displays the similar words supplied from the dictionary search device 22 as correction candidates.

[0087] Specifically, on the output unit 23, part of the text data is displayed on the display unit 61, as shown for example in FIG. 7. If a word in it contains an error, that word is displayed in reverse video or blinking, which informs the user of the incorrect word. In FIG. 7, bttle is displayed as the incorrect word.

[0088] In this case, in addition, a plurality of words are displayed at the top of the screen of the output unit 23 in descending order of similarity to bttle.

[0089] After this display is made, the device waits for input. When the input unit 51 is operated and the correct word is selected from the words displayed at the top of the screen of the output unit 23, the incorrect word displayed on the display unit 61 is corrected to the selected word.

[0090] That is, the input unit 51 includes, for example, a cursor key 71, a confirm key 72, numeric keys 73, and a cursor key 74, as shown in FIG. 8. The user first operates the cursor key 71 to move the cursor 62 (FIG. 7) displayed on the screen to the position of the correct word. After moving the cursor 62 to the position of the correct word, the user operates the confirm key 72, whereupon the incorrect word (bttle in the case shown in FIG. 7) is corrected to that word.

[0091] In this embodiment, the words supplied from the dictionary search device 22 are displayed on the output unit 23 together with numbers, as shown in FIG. 7, so the incorrect word can also be corrected by pressing the button among the numeric keys 73 (FIG. 8) whose number corresponds to the correct word.

[0092] The cursor key 74 is operated to display the previous or the next word containing an error, relative to the current display state.

[0093] After the confirm key 72 has been operated and the incorrect word has been corrected to the correct word, the portion containing the next incorrect word is displayed. If the predetermined unit of text data displayed on the display unit 61 contains no incorrect word, the next unit of text data is displayed.

[0094] In this way, the text data stored in the text data storage unit 53 is displayed on the output unit 23 sequentially, so to speak, and the incorrect words in it are corrected one after another.

[0095] The input unit 51 may also be configured to include a keyboard, so that the correct word can be entered by operating the keyboard.

[0096] As described above, this sentence correction device makes it easy to correct incorrect words.

【００９７】[0097]

[Effects of the Invention] As described above, according to the dictionary search device of the present invention, a symbol string similar to an input symbol string in which, for example, extra symbols have been inserted or necessary symbols are missing can be retrieved from the dictionary.

[0098] According to the database device of the present invention, the information associated with the correct symbol string can be obtained even when an ambiguous input symbol string is input.

[0099] Furthermore, according to the character recognition device of the present invention, even when an ambiguous handwritten character string is input, the character string corresponding to it (for example, a word) can be obtained.

[0100] According to the speech recognition device of the present invention, an accurate recognition result can be obtained even for ambiguous speech input.

[0101] Furthermore, according to the sentence correction device of the present invention, errors in text can be corrected easily.

【図面の簡単な説明】[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the dictionary search device of the present invention.

FIG. 2 is a block diagram showing a detailed configuration example of the similarity comparison unit 7 of FIG. 1.

FIG. 3 is a block diagram showing the configuration of an embodiment of the database device of the present invention.

FIG. 4 is a block diagram showing the configuration of an embodiment of the character recognition device of the present invention.

FIG. 5 is a block diagram showing the configuration of an embodiment of the speech recognition device of the present invention.

FIG. 6 is a block diagram showing the configuration of an embodiment of the sentence correction device of the present invention.

FIG. 7 is a diagram showing an example of the display screen of the output unit 23 of FIG. 6.

FIG. 8 is a diagram showing a configuration example of the input unit 51 of FIG. 6.

【符号の説明】[Explanation of symbols]

1 Input unit
2 Vector conversion unit
3 Dictionary
4 Vector conversion unit
5 Similarity calculation unit
6, 7 Similarity comparison units
8 Search result output unit
11 Vector element comparison unit
12 Symbol removal unit
13 Symbol match count calculation unit
14 Symbol match count comparison unit
21 Input unit
22 Dictionary search device
23 Output unit
24 Database
31 Handwritten character input unit
32 Character discrimination unit
41 Voice input unit
42 Symbol string conversion unit
51 Input unit
52 Control unit
53 Text data storage unit
61 Display unit
62 Cursor
71 Cursor key
72 Confirm key
73 Numeric keys
74 Cursor key

フロントページの続き (51)Int.Cl.⁶ 識別記号庁内整理番号ＦＩ技術表示箇所Ｇ１０Ｌ 3/00 ５５１Ｂ 9194−5ＬＧ０６Ｆ 15/403 ３５０Ｃ Continuation of the front page (51) Int.Cl. ⁶ Identification number Office reference number FI Technical display location G10L 3/00 551 B 9194-5L G06F 15/403 350 C

Claims

【特許請求の範囲】[Claims]

1. A dictionary search device for searching a dictionary composed of a set of predetermined symbol strings for one similar to an input symbol string, comprising: input symbol string conversion means for converting the input symbol string into an input vector, which is a vector whose element magnitudes are the numbers of the respective symbols contained in the input symbol string; dictionary symbol string conversion means for converting each of the predetermined symbol strings registered in the dictionary into a standard vector, which is a vector whose element magnitudes are the numbers of the respective symbols contained in that symbol string; calculation means for calculating a similarity representing the degree to which the input vector and a standard vector are similar; and similarity comparison means for comparing the similarities calculated by the calculation means with one another, wherein the predetermined symbol string similar to the input symbol string is retrieved from the dictionary based on the comparison result of the similarity comparison means.

[Claim 2] The dictionary search device according to claim 1, further comprising symbol string comparison means for converting the predetermined symbol string and the input symbol string into comparison symbol strings consisting only of the symbols that are contained in both strings in the same number, and for comparing those comparison symbol strings, wherein, when a plurality of the predetermined symbol strings have the same similarity, the predetermined symbol string similar to the input symbol string is retrieved from the dictionary on the basis of the comparison result of the symbol string comparison means.
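Note: the tie-breaking step of claim 2 might be sketched as follows. The claim states only that the comparison symbol strings are compared; counting position-wise matches is an assumption made here for illustration, and all function names are hypothetical.

```python
from collections import Counter


def comparison_strings(candidate: str, input_string: str) -> tuple[str, str]:
    """Keep only the symbols that occur the same number of times in both strings."""
    c_counts, i_counts = Counter(candidate), Counter(input_string)
    shared = {s for s in c_counts if c_counts[s] == i_counts[s]}
    reduced_candidate = "".join(ch for ch in candidate if ch in shared)
    reduced_input = "".join(ch for ch in input_string if ch in shared)
    return reduced_candidate, reduced_input


def positional_matches(a: str, b: str) -> int:
    """Assumed comparison: number of positions at which the two strings agree."""
    return sum(x == y for x, y in zip(a, b))


def break_tie(input_string: str, tied_candidates: list[str]) -> str:
    """Among candidates with equal similarity, prefer the best positional agreement."""
    def score(candidate: str) -> int:
        return positional_matches(*comparison_strings(candidate, input_string))

    return max(tied_candidates, key=score)
```

Anagrams such as `"abdc"` and `"dcba"` have identical count vectors, so a vector similarity cannot separate them; `break_tie("abcd", ["dcba", "abdc"])` picks `"abdc"` because two of its symbols are already in the input's positions.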

[Claim 3] The dictionary search device according to claim 1 or 2, wherein a predetermined number of the predetermined symbol strings are retrieved from the dictionary in descending order of the similarity.
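Note: retrieving a fixed number of candidates in descending order of similarity, as in claim 3, could look like the sketch below. Cosine similarity over symbol counts is again an assumed measure, and `similarity` and `top_n_similar` are hypothetical names.

```python
import heapq
from collections import Counter
from math import sqrt


def similarity(a: str, b: str) -> float:
    """Assumed measure: cosine similarity between the symbol-count vectors."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[s] * vb[s] for s in va.keys() & vb.keys())
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def top_n_similar(input_string: str, dictionary: list[str], n: int) -> list[str]:
    """Return the n dictionary entries with the highest similarity, best first."""
    return heapq.nlargest(n, dictionary, key=lambda word: similarity(input_string, word))
```

Returning several candidates rather than one lets a later stage, or the user, choose among near matches.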

[Claim 4] The dictionary search device according to any one of claims 1 to 3, wherein, instead of including the dictionary symbol string conversion means, the dictionary is configured to include the standard vectors.
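Note: claim 4 amounts to storing the standard vectors in the dictionary so that they need not be recomputed for each query. A minimal sketch, assuming cosine similarity as the measure and a hypothetical `VectorDictionary` class:

```python
from collections import Counter
from math import sqrt


class VectorDictionary:
    """Dictionary that stores each entry together with its precomputed count vector."""

    def __init__(self, words):
        self.entries = []
        for word in words:
            vector = Counter(word)
            norm = sqrt(sum(c * c for c in vector.values()))
            # The standard vector and its norm are computed once, at build time.
            self.entries.append((word, vector, norm))

    def search(self, input_string: str) -> str:
        """Only the input string is converted at query time."""
        query = Counter(input_string)
        query_norm = sqrt(sum(c * c for c in query.values()))

        def score(entry) -> float:
            _, vector, norm = entry
            if not norm or not query_norm:
                return 0.0
            dot = sum(query[s] * vector[s] for s in query.keys() & vector.keys())
            return dot / (query_norm * norm)

        return max(self.entries, key=score)[0]
```

The trade-off is extra storage per entry in exchange for skipping the per-query conversion of every registered string.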

[Claim 5] The dictionary search device according to any one of claims 1 to 4, wherein the input symbol string is compared with each of the predetermined symbol strings registered in the dictionary; when a predetermined symbol string that matches the input symbol string exists, that symbol string is taken as the search result; and when no predetermined symbol string that matches the input symbol string exists, the predetermined symbol string similar to the input symbol string is retrieved from the dictionary on the basis of the comparison result of the similarity comparison means.
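Note: the exact-match-first behaviour of claim 5 can be sketched as below. The fallback measure (cosine similarity over symbol counts) is an assumption, and `lookup` with its boolean flag is a hypothetical interface chosen for illustration.

```python
from collections import Counter
from math import sqrt


def similarity(a: str, b: str) -> float:
    """Assumed measure: cosine similarity between the symbol-count vectors."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[s] * vb[s] for s in va.keys() & vb.keys())
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def lookup(input_string: str, dictionary: list[str]) -> tuple[str, bool]:
    """Return (entry, exact), where exact is True if the input itself is registered."""
    if input_string in dictionary:
        # An exact match is the search result; no similarity is computed.
        return input_string, True
    # Otherwise fall back to the most similar registered symbol string.
    best = max(dictionary, key=lambda word: similarity(input_string, word))
    return best, False
```

The flag tells a caller, such as a spelling corrector, whether the returned entry is the input itself or a suggested replacement.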

[Claim 6] A database device comprising: the dictionary search device according to any one of claims 1 to 5; and a database storing information associated with the predetermined symbol strings registered in the dictionary, wherein the information associated with the predetermined symbol string obtained as a result of the search by the dictionary search device is output.

[Claim 7] A character recognition device comprising: the dictionary search device according to any one of claims 1 to 5; input means for inputting handwritten characters; and handwritten character conversion means for converting the handwritten characters input to the input means into the input symbol string, wherein the handwritten characters are recognized in accordance with the predetermined symbol string obtained as a result of the search by the dictionary search device.

[Claim 8] A speech recognition device comprising: the dictionary search device according to any one of claims 1 to 5; input means for inputting speech; and speech conversion means for converting the speech input to the input means into the input symbol string, wherein the speech is recognized in accordance with the predetermined symbol string obtained as a result of the search by the dictionary search device.

[Claim 9] A sentence correction device for correcting spelling errors of words in a sentence, comprising: the dictionary search device according to claim 5; supply means for supplying the word to the dictionary search device as the input symbol string; and correction means for correcting, when no predetermined symbol string that matches the input symbol string exists, the word corresponding to the input symbol string to the word corresponding to the symbol string obtained as a result of the search by the dictionary search device.