JP3792759B2

JP3792759B2 - Character recognition method and apparatus

Info

Publication number: JP3792759B2
Application number: JP23533495A
Authority: JP
Inventors: 一弘松林; 伸一砂川
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1995-09-13
Filing date: 1995-09-13
Publication date: 2006-07-05
Anticipated expiration: 2015-09-13
Also published as: JPH0981689A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像から抽出した文字を認識する文字認識方法とその装置に関する。
【０００２】
【従来技術】
従来、画像から抽出した文字を認識して対応する文字コードを出力する装置が知られている。図１３は、従来の文字認識手順を示すフローチャートであり、簡単に説明する。
【０００３】
まず、Ｓ１０００で画像データ入力を行い、Ｓ１００２でその画像データ記憶させる。Ｓ１００３では、画像データから文字パターンを含む領域を抽出する。Ｓ１００４では、文字認識辞書１００５を探索して、文字認識処理を行ない、Ｓ１０１０では認識結果を記憶する。
【０００４】
【発明が解決しようとする課題】
しかしながら、前記従来の文字認識方法では、充分な認識の確度が得られないという問題があった。
本発明の目的は、上記従来例に鑑みてなされたもので、より高い確信度で文字認識が可能な文字認識方法とその装置を提供することを目的とする。
【０００５】
【課題を解決するための手段】
上記目的を達成するため、本発明の文字認識方法は、以下の構成を備える。即ち、複数の異なる画像それぞれから複数の文字パターンを抽出して文字認識処理を行うことにより、当該抽出した文字パターンそれぞれに対応する複数の候補文字コードとその確信度を獲得する文字候補獲得工程と、前記複数の異なる画像の各々において、前記抽出した各文字パターンの位置に基づいて前記文字パターンをグループ化することにより、複数の文字グループを生成するグルーピング工程と、前記生成された文字グループそれぞれに含まれる各文字パターンに対応する複数の候補文字コードに基づいて、前記複数の異なる画像間で同一の文字グループを判定する文字グループ判定工程と、前記判定された同一の文字グループそれぞれに含まれる文字パターンに対応する候補文字コードの確信度を集計し、前記集計された確信度に基づき、前記同一の文字グループに含まれる文字パターンに対応する文字コードを決定する文字コード決定工程とを備える。
【０００６】
上記文字認識方法において、前記複数の異なる画像が第１の画像と第２の画像であるとき、前記文字グループ判定工程では、前記第１の画像において生成された第１の文字グループに含まれる各文字パターンの候補文字コードと前記第２の画像において生成された第２の文字グループに含まれる各文字パターンの候補文字コードとを比較し、対応する全ての文字パターンの候補文字コードの中に同一の候補文字コードがある場合、前記第１の文字グループと前記第２の文字グループとが同一の文字グループであると判定することを特徴とする。
【０００７】
上記文字認識方法において、前記複数の異なる画像が動画像における異なるシーンの画像であり、且つ、前記抽出される文字パターンが当該動画像における動かないテロップであるという条件が加えられた場合、前記文字グループ判定工程では、前記生成された複数の文字グループに含まれる文字パターンの、文字数と座標と大きさとに基づいて、前記複数の異なる画像間で同一の文字グループを判定することを特徴とする。
【０００８】
上記文字認識方法において、前記文字候補獲得工程では、前記抽出された文字パターンから所定の特徴抽出を行って得られた特徴と、所定の辞書に格納されている各文字コードに対応する特徴とのマッチングに基づいて、前記候補文字コードとその確信度を獲得することを特徴とする。
【０００９】
上記目的を達成するため、本発明の文字認識装置は、以下の構成を備える。即ち、複数の異なる画像それぞれから複数の文字パターンを抽出して文字認識処理を行うことにより、当該抽出した文字パターンそれぞれに対応する複数の候補文字コードとその確信度を獲得する文字候補獲得手段と、前記複数の異なる画像の各々において、前記抽出した各文字パターンの位置に基づいて前記文字パターンをグループ化することにより、複数の文字グループを生成するグルーピング手段と、前記生成された複数の文字グループそれぞれに含まれる各文字パターンに対応する複数の候補文字コードに基づいて、前記複数の異なる画像間で同一の文字グループを判定する文字グループ判定手段と、前記判定された同一の文字グループそれぞれに含まれる文字パターンに対応する候補文字コードの確信度を集計し、前記集計された確信度に基づき、前記同一の文字グループに含まれる文字パターンに対応する文字コードを決定する文字コード決定手段とを備える。
【００１０】
上記目的を達成するため、本発明の装置は、以下の構成を備える。即ち、メモリに記憶されているコンピュータ実行可能なプログラムを読み出してＣＰＵにおいて実行することにより、画像内の文字パターンに対応する文字コードを決定する装置であって、前記プログラムは、複数の異なる画像それぞれから複数の文字パターンを抽出して文字認識処理を行うことにより、当該抽出した文字パターンそれぞれに対応する複数の候補文字コードとその確信度を獲得する文字候補獲得ステップと、前記異なる複数の画像の各々において、前記抽出した各文字パターンの位置に基づいて前記文字パターンをグループ化することにより、複数の文字グループを生成するグルーピングステップと、前記生成された複数の文字グループそれぞれに含まれる各文字パターンに対応する複数の候補文字コードに基づいて、前記複数の異なる画像間で同一の文字グループを判定する文字グループ判定ステップと、前記判定された同一の文字グループそれぞれに含まれる文字パターンに対応する候補文字コードの確信度を集計し、前記集計された確信度に基づき、前記同一の文字グループに含まれる文字パターンに対応する文字コードを決定する文字コード決定ステップとを、コンピュータに実行させるためのプログラムコードを含む。
【００１１】
【発明の実施の形態】
はじめに、本発明に係る一実施の形態のポイントを要約した後に、その詳細な説明に入る。
本発明に係る文字認識方法の一実施の形態では、複数の画像から文字パターンを抽出し、抽出された文字パターンの認識処理を行って、文字パターンに対応する文字コードの候補とその確信度（得点）を求める。また、各文字パターンの位置関係で、互いに近いもの同士を１つのグループとしてグルーピングする。そして、上述の複数の画像間で、そのグループ単位で、選ばれた文字コードの候補に基づいて、文字パターンの対応をとり、グループ単位でのマッチングをとる。
【００１２】
次に、マッチングが取られたグループに含まれる文字パターンについて、文字コード候補ごとの得点を集計する。そして、その得点集計結果に基づいて、最終的に、文字パターンに対する文字コードを確定する。
この様に、単独の画像単位で認識するのではなく、複数の画像間での文字グループのマッチングをとることに基づいて文字認識を行うことにより、より確度の高い文字認識が可能になった。
【００１３】
即ち、複数の画像間で対応する文字が異なった認識結果が得られた場合でも、上述の得点集計結果に基づいて、単独の画像内での認識結果を修正できるため、より信頼性の高い文字認識を可能とした。
次に、本実施の形態に係る文字認識装置の詳細な説明を行う。
（第１の実施の形態）
図１は、本実施の形態の文字認識装置のハードウェア構成図である。
【００１４】
２１は画像入力装置であり、ビデオカメラやイメージスキャナから画像を取り込んだり、ハードディスクやＣＤ−ＲＯＭから画像データを読み込んだり、通信機器から画像データを受信したりする。
２２は画像メモリであり、入力された画像データを一時的に記憶する。
２３はＣＰＵであり、後述する各文字認識処理のフローチャートに対応するプログラムを格納しているプログラム記憶メモリ２４から、プログラムを逐次読み出し、解釈し、実行する。
【００１５】
２５は辞書パターン記憶メモリであり、文字認識で用いる辞書パターンを記憶する。
２６はデータメモリであり、文字認識処理の途中結果や最終結果のデータを記憶したり、ＣＰＵによる文字認識処理の実行に必要な作業領域として使われる。２７は指示装置であり、キーボード、マウス、ペンなどを備え、本文字認識装置での各種動作の指示やデータ入力を行なう。
【００１６】
２８は表示装置であり、ＣＲＴやＬＣＤなどを備え、処理対象の画像や各種データを画面に表示する。
次に、図２は、図１のハードウェア上で処理される文字認識処理手順を示すフローチャートである。
尚、以下に示すフローチャートに対応するプログラムは、プログラム記憶メモリ２４に格納されており、ＣＰＵ２３によって実行される。
【００１７】
ステップＳ１では、画像入力装置２１から複数の静止画像または動画像を入力する。ステップＳ２では、入力された複数の画像データを画像メモリ２２に記憶する。ステップＳ３では、画像データから各文字パターンを含む領域を抽出する処理を行なう。
【００１８】
ステップＳ４では、辞書パターン記憶メモリ２５の辞書パターンを参照して、文字認識処理を行なう。これは、通常行われている所定の文字特徴に関して、辞書パターンの特徴と認識対象の文字の特徴との距離によって行えばよい。この距離の近い順に認識候補の文字コードが辞書から抽出され、各抽出された認識候補の文字コードにはその距離、即ち、確信度（例えば、距離の小さいものには、高い確信度、即ち、より大きな数字を対応させる）の情報がアタッチされる。
【００１９】
尚、後述する本ステップのさらに詳細な処理手順の一例を図１５に示す。
ステップＳ５では、ステップＳ４の認識結果の認識候補の文字コードと確信度をデータメモリ２６に格納する。また、各文字パターンの画像でのレイアウト位置の近く、また、同じサイズの文字パターンを同じグループとするグルーピングを行う。
【００２０】
ステップＳ６では、複数の入力画像間でグループ単位で、グループ内の文字コードが一致するものを探索する。
ステップＳ７では、ステップＳ６で一致が検出されたグループ単位で、そのグループに属する各文字パターンの認識候補文字コードの確信度の集計を行う。例えば、同じ認識候補文字コードごとの合計を取る。
【００２１】
ステップＳ８では、ステップＳ７での集計結果に基づいて、確信度の最も高い最適な候補を選択し、対応する文字パターンの認識結果として決定する。そして、その認識結果の文字を表示装置２８に出力する。
ステップＳ９では、決定した認識結果をデータメモリ２６に記憶する。
以上説明したように、単独の画像単位で認識するのではなく、複数の画像間での文字グループのマッチングをとることに基づいて文字認識を行うことにより、より確度の高い文字認識が可能になった。
（第２の実施の形態）
以上、第１の実施の形態の文字認識手順の概略を説明した。次に、第１の実施の形態での基本的処理コンセプトは、同じであるが、別の処理手順による実現方法をより詳細に説明する。
【００２２】
まず、図３のフローチャートを用いて、本発明の第２の実施の形態について説明する。
ステップＳ３１において、１枚分の画像データを画像メモリに入力する。例えば図４（ａ）のような地図の画像を入力する。
ステップＳ３２において、画像データからグループごとの文字群を抽出し、辞書２５を参照して、認識候補の文字コードとその確信度を得る。この結果を、データメモリ２６に格納する。
【００２３】
そして、例えば、その抽出され、認識された文字パターンに関して、図５のようなデータ構造を生成する。文字群のグループ分けは、各文字パターンが互いに近く配置されおり、サイズが同じものを集めことによっておこなう。
図５のデータ構造は、大きく３カテゴリに分類できる。第１のカテゴリは、５００、５０１に示すような、各グループごとに有するグループテーブルである。第２のカテゴリは、そのグループテーブル（５００、５０１）からアドレスポイントされ、各グループに含まれる各文字パターン（画像中の）のサイズに関する情報（ｘ，ｙ座標、幅、高さ、等）と文字パターンに対応する文字コードを格納している認識候補テーブル（５０３、５０４）へのアドレスポインタを有する文字パターン情報テーブル（５０２）である。最後の第３のカテゴリは、文字パターン情報テーブル（５０２）からアドレスポイントされ、各文字パターンに対応する認識結果である各認識候補の文字コードとその得点（即ち、確信度）を格納している認識候補テーブル（５０３、５０４）である。
【００２４】
本処理によって、図４（ａ）の画像に関して処理した場合において、生成されたデータ構造が図５であり、この内容に関して以下説明する。
図５を参照して、グループテーブル（５００）は、図４（ａ）の「宮沢町」に関するもので、他方、グループテーブル（５０１）は、図４（ａ）の「東区」に関するものである。グループテーブル（５００、５０１）の左欄は、テーブル自体内での、各文字パターンに関する情報が格納されている位置のアドレスポインタ（右欄）を格納しているアドレスを示す。
【００２５】
尚、各文字の位置や大きさの情報からこれらの５文字のうち“宮”、“沢”、“町”の３文字は“宮沢町”という語を構成するとみなし、１つの文字群としてグループ単位でグループテーブル（５００）で管理している。同様に“東”、“区”の２文字の“東区”という語を構成する１つのグループとして、グループテーブル（５０１）で管理している。
【００２６】
文字パターン情報テーブル（５０２）のアドレス２０００Ｈから順に、“宮”、“沢”、“東”、“区”“町”の各文字の「ｘ座標、ｙ座標、幅、高さ」、と文字認識候補の情報が格納されている位置を示すアドレスポインタがそれぞれ格納される。尚、“東”、“区”、“町”の各パターンに対しては、記述を簡単にするため、アドレスポインタの行き先を示す矢印を省略している。
【００２７】
認識候補テーブル（５０３）のアドレス３０００Ｈからは、“宮”という文字パターンの認識候補として“宮”、“官”、“宜”の文字コードと得点が格納される。同様に、認識候補テーブル（５０４）のアドレス３０１０Ｈからは、“沢”という文字パターンの認識候補として“沢”、“況”という文字コードと得点が格納される。
【００２８】
尚、図示していないが、“東”、“区”“町”の各文字についても、同様に候補の文字コードと得点が格納される。
次に、ステップＳ３３において、上のステップで処理した画像が、１枚目の入力画像であるかチェックし、１枚目であれば、ステップＳ３５へ進む。そして、ステップＳ３５では、全画像について処理を終了したかチェックし、まだ、残りの画像があれば、ステップＳ３１に戻り、次の画像の入力を行う。２枚目の入力画像の例を図４（ｂ）に示す。以下、この画像に関して、上述のステップＳ３２、ステップＳ３３、ステップＳ３４の処理を実行してゆく。
【００２９】
そして、ステップＳ３５の判定で、全画像について処理を終了すれば、ステップＳ３６へ進み、全画像の全文字グループについて認識候補の中から、集計の確信度の高いものを選択して、文字コードを確定し、その結果を表示装置に表示する。尚、ステップＳ３６の処理は、図９を参照して後述する。
上述のステップＳ３３で、処理した画像が１枚目でなければ、ステップＳ３４に進み、ステップＳ３４では、１枚目の画像と２枚目の画像で、同一のグループ化された語があるかどうか判定する。本ステップでの詳細な処理は、図７を参照して後述するが、結果として、１枚目の画像で“宮沢町”と認識された文字グループと、２枚目の画像で“宮沢町”と認識された文字グループは、同一であると判定される。他方、１枚目の画像で“東区”と認識された文字グループと２枚目の画像で“西区”と認識された文字グループは同一でないと判定される。
【００３０】
尚、図４（ａ），（ｂ）の２枚の画像を入力して認識する例を示したが、２枚以上であってもよいことは言うまでもない。
次に、図６のフローチャートを用いて、図３のステップＳ３２での画像データから文字群を抽出する詳細な処理手順を説明する。
ステップＳ４１において、画像から文字を含む領域を抽出する。これには文字と背景の輝度や色の違いによって抽出する方法や、周波数成分の特徴の違いによって抽出する方法などを用いればよい。
【００３１】
ステップＳ４２において、文字を１文字ずつのパターンに分離する。これには特定の色の画素を連結したパターンについて形や大きさが所定の条件を満たすかどうかで判定する。
ステップＳ４３において、分離されたパターンに対して、辞書パターン記憶メモリ２５に格納されている辞書パターンを検索することで文字認識を行ない、候補の文字コード及び得点（認識の確からしさ：確信度）を得る。また、文字のｘ座標、ｙ座標、幅、高さの情報も得る。
【００３２】
ステップＳ４４において、各文字の座標から所定の範囲内の距離にある文字を統合し、文字群としてグループ単位で管理する。例えば、注目する文字に対して上下左右１文字分の範囲内に他の文字があればその文字を同じ文字群に統合する。これにより、縦書き、横書き、斜め書きにかかわらず、１つの語を構成する文字をまとめる。
【００３３】
次に、図７のフローチャートを用いて、図３のステップＳ３４での処理、即ち、複数の画像間で、同一のグループ語があるかどうか判定する動作について説明する。
ステップＳ５１において、ステップＳ３２で選択された、認識文字コードの候補とその確信度を含む文字グループをデータメモリから取り込む。
【００３４】
ステップＳ５２において、ステップＳ５１で取り込んだ文字グループと一致する文字グループが他の画像の文字グループにあるかをチェックするために、含まれる文字コード単位で比較する。そして、同一の文字でなければステップＳ５６へ進み別の語と判定し、ステップＳ５５へ進む。
逆に、同一の文字があれば、ステップＳ５３へ進む。
【００３５】
ステップＳ５３では、文字グループ内の全文字について、比較が終了したかどうかをチェックする。そして、文字グループ内の全文字が一致すれば、ステップＳ５４において同一文字グループとし、その結果をデータメモリ２６に格納する。
ステップＳ５５では、全文字グループの組み合わせにおいて、比較が終了したかどうかチェックし、終了していなければ、ステップＳ５１に戻り、同様の処理を繰り返す。全文字グループの全組み合わせでの比較が終了すれば、本処理を終了する。
【００３６】
尚、上述のステップＳ５４やステップＳ５６での判定は、図８に示す様なグループ一致判定テーブルを、データメモリ２６上に作成して、一致マーク（図８では丸印）を格納しておくことで実現される。図８の表示で示されるように、画像１の文字グループ１と画像２の文字グループ２とは、文字１と文字１、文字２と文字２、文字３と文字３とがそれぞれ同じ候補を含むので同一の語とみなすことができる。また画像１の文字グループ２と画像２の文字グループ１とは、文字２と文字２は同じ候補を含むが文字１と文字１は同じ候補を含まないので別の語とみなす。
【００３７】
次に、図９のフローチャートを用いて、図３のステップ３６の文字コードの確定処理を説明する。
まず、ステップＳ６１において、同一の文字グループの組をデータメモリ２６から取り込む。
ステップＳ６２において、ステップＳ６１で取り込んだ文字グループ内の対応する同じ候補文字コードの得点を集計する。集計方法としては、各候補ごとの全画像の得点を加算してもよいし、また、各候補ごとの全画像中の最高点を取り出して得点としてもよい。
【００３８】
ステップＳ６３において、集計点が高い候補を選び出し、対応する文字の認識結果として確定し、表示装置に表示する。
図１０（ａ）は、集計処理の一例を説明するための図であり、ここでは、５枚の入力画像での文字パターン「宮」に対する、“宮”、“官”、“宜”の候補文字の得点の例として加算する方法を用いた集計結果を示す。この結果、得点合計は、それぞれ２５０点、２００点、４０点となり、“宮”が最も集計点が高い。したがって１から５までのすべての画像において、この文字を“宮”として確定する。すなわち２枚目の画像と５枚目の画像の誤認識が修正されることになる。
【００３９】
本実施の形態では、複数の画像の中に共通に含まれる語について共通の認識結果を出力するので、複数の画像のうち、ある画像の文字が誤認識しても、他の画像の認識結果によって自動的に修正することができ、結果として認識の確度を上げることができるという効果がある。
（第３の実施例）
本発明の第３の実施の形態においては、図１１のように動画像内のテロップの認識方法について説明する。
【００４０】
テロップは、背景画像の影響などで正しく認識されない場合がある。例えば、図１０（ｂ）の画像２のように、正しい文字“宮”の得点が０で、文字候補から漏れてしまう場合もあり得る。
そこで、背景画像が図１１（ａ）から図１１（ｂ）のように変わってもテロップは動かないという条件を加えることで、文字グループの文字数及び各文字の座標、大きさが同じであれば同一の語とみなすことができる。
【００４１】
図１２のフローチャートを用いて、同一の語があるかどうか判定する動作について説明する。
ステップＳ１２１において、２つの画像から文字グループを１つずつ入力する。
ステップＳ１２２において、文字グループの文字数が等しいかどうか判定し、等しくなければステップＳ１２８へ進み、別の語と判定する。
【００４２】
他方、文字数が等しいかどうかチェックし、等しければステップＳ１２３へ進む。
ステップＳ１２３では、文字の座標（ｘ座標とｙ座標の双方）が等しいかどうか判定し、等しくなければステップＳ１２８へ進み別の語と判定する。等しければステップＳ１２４へ進む。
【００４３】
ステップＳ１２４では、文字の大きさ（幅と高さの双方）が等しいかどうか判定し、等しくなければステップＳ１２８へ進み別の語と判定する。等しければステップＳ１２５へ進む。
ステップＳ１２５では、文字グループ内の全文字に関して処理が終了したかチェックし、終了していなければ、ステップＳ１２３に戻り、同様の処理を繰り返す。終了すれば、ステップＳ１２６に進み、文字グループ内の全文字について座標と大きさが等しい文字の組み合わせができたとして、同一文字グループと判定する。
【００４４】
ステップＳ１２７では、全文字グループの組み合わせについて、上述の処理がが終了したかチェックし、終了していなければ、ステップＳ１２１に戻り、終了するまで繰り返す。
上述の処理を、同一文字グループの判定にさらに加えることで、例えば、図１０（ｂ）に示すように、“宮”の合計点が最も高い結果が得られ、１から５までのすべての画像において、この文字を“宮”として確定できる。
【００４５】
尚、本実施の形態では、テロップのように文字が静止しているという条件をさらに加えることで、複数の画像のうち、ある画像が誤認識で候補から漏れても、他の画像の認識結果によって自動的に修正することができ、認識確度を上げることができるという効果がある。
尚、上述の説明では、複数の画像間での文字グループのマッチングに基づいて、文字パターンに対する文字コードの決定を行っている例を示したが、これは、複数の画像間ではなく、同じ画像内での文字グループのマッチングに基づいて、文字パターンに対する文字コードの決定を行ってもよいことは言うまでもない。
【００４６】
次に、図１４は、プログラム記憶メモリ２４にアサインされた上述したフローチャートの各処理に対応する各プログラムのレイアウトの一例を示す。尚、このプログラムは、フロッピーディスクなどの可搬可能な媒体に格納され、実行時に、メモリ２０２にロードされて、ＣＰＵ２３によって実行されてもよいことは言うまでもない。
【００４７】
１４０には、ステップＳ１−Ｓ２の画像入力／格納処理を行うプログラムが格納されている。
１４２には、ステップＳ３の文字パターン抽出処理を行うプログラムが格納されている。
１４３には、ステップＳ３で抽出された文字パターンに対応する候補文字パターンを辞書２５から抽出し、確信度を求める処理を行う文字認識プログラムが格納されている。
【００４８】
１４４には、ステップＳ３で抽出された文字パターンのグルーピング処理を行うプログラムが格納されている。
１４５には、ステップＳ７の得点（確信度）集計処理を行うプログラムが格納されている。
【００４９】
１４６には、ステップＳ８での文字コード決定処理を行うプログラムが格納されている。
１４７には、ステップＳ８での文字コード決定処理を行った後、表示装置にその結果を表示するプログラムが格納されている。
尚、図１４のレイアウトの順に、特別の意味はなく任意の順でよい。
【００５０】
次に、図２のステップＳ４の文字認識処理の詳細な処理の一例を、図１５を参照して説明する。
まず、図１５を参照して、ステップＳ１５１では、ステップＳ３で切り出された各文字パターンを入力する。
ステップＳ１５２では、入力した文字パターンの幅と高さをそれぞれ拡大、または、縮小して、所定の大きさに正規化する。
【００５１】
ステップＳ１５３では、正規化パターンを所定の閾値で２値化する。
ステップＳ１５４では、従来から知られている細線化の方法で２値化されたパターンを細線化処理する。
ステップＳ１５５では、細線化処理されたパターン、または、その所定の特徴量と文字認識辞書５とのマッチングを行う。
【００５２】
ステップＳ１５６では、マッチング結果として、マッチング距離の近い、即ち、確信度の高い文字候補を選択する。
以上、ステップＳ４での詳細な処理の一例を示した。
尚、本発明は、ホストコンピュータ、インタフェース、表示装置等の複数の機器から構成されるシステムに適用しても、１つの機器からなる装置に適用しても良い。また、本発明はシステム或は装置にプログラムを供給することによって達成される場合にも適用できることはいうまでもない。この場合、本発明を達成するためのソフトウェアによって表されるプログラムを格納した記憶媒体から、該プログラムを該システム或は装置に読み出すことによって、そのシステム或は装置が、本発明の効果を享受することが可能となる。
【００５３】
以上述べたように、画像から文字パターンを抽出する手段と、抽出された文字パターンを認識し文字コードの候補及び各候補の得点を出力する手段と、複数の画像間でグループ単位で文字パターンを対応させる手段と、対応するグループの文字パターンについて文字コードの候補ごとの得点を集計する手段と、得点集計結果に基づいて前記文字コードの候補及び各候補の得点を修正する手段とを設けたことにより、より確度の高い文字認識を行うことができる。
【００５４】
【発明の効果】
以上説明したように本発明によれば、より高い確信度で文字認識が可能な文字認識をおこなうことができる。
【図面の簡単な説明】
【図１】本発明の文字認識装置のブロック図である。
【図２】本発明の文字認識処理手順を示すフローチャートである。
【図３】本発明の第２の実施の形態の文字認識処理手順を示すフローチャートである。
【図４】入力画像の一例を示す図である。
【図５】画像データから文字グループを抽出した結果得られるデータの例を示した図である。
【図６】画像データから文字グループを抽出する動作を示したフローチャートである。
【図７】同一グループ語を判定する動作を示したフローチャートである。
【図８】複数の画像間での文字の対応の例を示したグループ一致判定テーブルの図である。
【図９】判定認識候補の中から文字コードを確定する処理手順を示したフローチャートである。
【図１０】複数の入力画像に対する各候補の得点の例を示す図である。
【図１１】入力画像の例を示す図である。
【図１２】同一語を判定する処理手順を示したフローチャートである。
【図１３】従来の文字認識装置のフローチャートである。
【図１４】本実施の形態に係る文字認識処理の各プログラムのレイアウトを示す図である。
【図１５】図２のステップＳ４での文字認識処理の詳細な処理例を示すフローチャートである。
【符号の説明】
２１画像入力装置
２２画像メモリ
２３ＣＰＵ
２４プログラム記憶メモリ
２５辞書パターン記憶メモリ
[0001]
[Technical field of the invention]
The present invention relates to a character recognition method and apparatus for recognizing characters extracted from an image.
[0002]
[Prior art]
Conventionally, apparatuses that recognize characters extracted from an image and output the corresponding character codes are known. FIG. 13 is a flowchart showing a conventional character recognition procedure, which is briefly described below.
[0003]
First, image data is input in S1000, and the image data is stored in S1002. In S1003, an area including a character pattern is extracted from the image data. In S1004, the character recognition dictionary 1005 is searched and character recognition processing is performed, and in S1010, the recognition result is stored.
[0004]
[Problems to be solved by the invention]
However, the conventional character recognition method has a problem that sufficient recognition accuracy cannot be obtained.
The present invention has been made in view of the above conventional example, and its object is to provide a character recognition method and apparatus capable of recognizing characters with a higher degree of certainty.
[0005]
[Means for Solving the Problems]
In order to achieve the above object, a character recognition method according to the present invention has the following configuration. That is, the method comprises: a character candidate acquisition step of extracting a plurality of character patterns from each of a plurality of different images and performing character recognition processing, thereby acquiring, for each extracted character pattern, a plurality of candidate character codes and their certainty factors; a grouping step of generating a plurality of character groups by grouping the character patterns in each of the plurality of different images based on the position of each extracted character pattern; a character group determination step of determining the same character group among the plurality of different images based on the plurality of candidate character codes corresponding to each character pattern included in each generated character group; and a character code determination step of totaling the certainty factors of the candidate character codes corresponding to the character patterns included in each character group determined to be the same, and determining, based on the totaled certainty factors, the character codes corresponding to the character patterns included in the same character group.
[0006]
In the above character recognition method, when the plurality of different images are a first image and a second image, the character group determination step compares the candidate character codes of each character pattern included in a first character group generated in the first image with the candidate character codes of each character pattern included in a second character group generated in the second image, and determines that the first character group and the second character group are the same character group when every pair of corresponding character patterns has an identical candidate character code among its candidates.
[0007]
In the above character recognition method, when the additional conditions are imposed that the plurality of different images are images of different scenes in a moving image and that the extracted character patterns are telops (on-screen captions) that do not move within the moving image, the character group determination step determines the same character group among the plurality of different images based on the number of characters, the coordinates, and the sizes of the character patterns included in the generated character groups.
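The geometric test for stationary telops can be illustrated with a minimal Python sketch. The function name `same_telop_group`, the `Box` tuple layout, and the `tolerance` parameter are assumptions made for illustration; the source only states that the number of characters, the coordinates, and the sizes must be equal, which corresponds to the default `tolerance=0`.

```python
from typing import List, Tuple

# Hypothetical layout for one character pattern: (x, y, width, height).
Box = Tuple[int, int, int, int]


def same_telop_group(a: List[Box], b: List[Box], tolerance: int = 0) -> bool:
    """Judge two character groups identical purely from their geometry.

    The groups match when they hold the same number of characters and every
    corresponding character has the same coordinates and size. `tolerance`
    is an illustrative extension; 0 means strict equality as in the source.
    """
    if len(a) != len(b):
        return False  # different number of characters: different word
    return all(
        abs(p - q) <= tolerance
        for box_a, box_b in zip(a, b)
        for p, q in zip(box_a, box_b)
    )


scene_1 = [(10, 200, 16, 16), (28, 200, 16, 16), (46, 200, 16, 16)]
scene_2 = [(10, 200, 16, 16), (28, 200, 16, 16), (46, 200, 16, 16)]
other = [(10, 200, 16, 16), (28, 200, 16, 16)]

telop_is_same = same_telop_group(scene_1, scene_2)
telop_is_other = same_telop_group(scene_1, other)
```

Because this test never looks at the candidate character codes, it can still pair up groups when the correct character is missing from one image's candidate list.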
[0008]
In the above character recognition method, the character candidate acquisition step acquires the candidate character codes and their certainty factors based on matching between features obtained by performing predetermined feature extraction on the extracted character pattern and the features corresponding to each character code stored in a predetermined dictionary.
[0009]
In order to achieve the above object, a character recognition device according to the present invention has the following configuration. That is, the device comprises: character candidate acquisition means for extracting a plurality of character patterns from each of a plurality of different images and performing character recognition processing, thereby acquiring, for each extracted character pattern, a plurality of candidate character codes and their certainty factors; grouping means for generating a plurality of character groups by grouping the character patterns in each of the plurality of different images based on the position of each extracted character pattern; character group determination means for determining the same character group among the plurality of different images based on the plurality of candidate character codes corresponding to each character pattern included in each of the generated character groups; and character code determination means for totaling the certainty factors of the candidate character codes corresponding to the character patterns included in each character group determined to be the same, and determining, based on the totaled certainty factors, the character codes corresponding to the character patterns included in the same character group.
[0010]
In order to achieve the above object, an apparatus according to the present invention has the following configuration. That is, the apparatus determines character codes corresponding to character patterns in an image by reading a computer-executable program stored in a memory and executing it on a CPU, wherein the program includes program code for causing a computer to execute: a character candidate acquisition step of extracting a plurality of character patterns from each of a plurality of different images and performing character recognition processing, thereby acquiring, for each extracted character pattern, a plurality of candidate character codes and their certainty factors; a grouping step of generating a plurality of character groups by grouping the character patterns in each of the plurality of different images based on the position of each extracted character pattern; a character group determination step of determining the same character group among the plurality of different images based on the plurality of candidate character codes corresponding to each character pattern included in each of the generated character groups; and a character code determination step of totaling the certainty factors of the candidate character codes corresponding to the character patterns included in each character group determined to be the same, and determining, based on the totaled certainty factors, the character codes corresponding to the character patterns included in the same character group.
[0011]
[Embodiments of the invention]
First, the main points of an embodiment of the present invention are summarized, and then the embodiment is described in detail.
In one embodiment of the character recognition method according to the present invention, character patterns are extracted from a plurality of images, recognition processing is performed on the extracted character patterns, and character code candidates corresponding to each character pattern are obtained together with their certainty factors (scores). In addition, character patterns that are positioned close to one another are grouped into a single group. Then, between the plurality of images, character patterns are put into correspondence group by group based on the selected character code candidates, so that matching is performed on a per-group basis.
[0012]
Next, for the character patterns contained in the matched groups, the scores are totaled for each character code candidate. Based on the totaled scores, the character code for each character pattern is finally determined.
In this way, character recognition is based on matching character groups across a plurality of images rather than on recognizing each image in isolation, which enables character recognition with higher accuracy.
[0013]
That is, even when corresponding characters yield different recognition results in different images, the recognition result obtained within a single image can be corrected based on the totaled scores, which makes more reliable character recognition possible.
Next, the character recognition device according to the present embodiment will be described in detail.
(First embodiment)
FIG. 1 is a hardware configuration diagram of the character recognition device according to the present embodiment.
[0014]
An image input device 21 captures an image from a video camera or an image scanner, reads image data from a hard disk or a CD-ROM, and receives image data from a communication device.
An image memory 22 temporarily stores the input image data.
A CPU 23 sequentially reads, interprets, and executes a program from a program storage memory 24 that stores a program corresponding to a flowchart of each character recognition process described later.
[0015]
A dictionary pattern storage memory 25 stores a dictionary pattern used for character recognition.
Reference numeral 26 denotes a data memory, which stores intermediate and final results of the character recognition processing and serves as the work area the CPU needs to execute that processing. Reference numeral 27 denotes an instruction device, which includes a keyboard, a mouse, a pen, and the like, and is used to give operating instructions and enter data in the character recognition device.
[0016]
A display device 28 includes a CRT, an LCD, and the like, and displays an image to be processed and various data on a screen.
Next, FIG. 2 is a flowchart showing the character recognition procedure executed on the hardware of FIG. 1.
A program corresponding to the flowchart shown below is stored in the program storage memory 24 and executed by the CPU 23.
[0017]
In step S1, a plurality of still images or a moving image is input from the image input device 21. In step S2, the plurality of input image data is stored in the image memory 22. In step S3, regions containing the individual character patterns are extracted from the image data.
[0018]
In step S4, character recognition is performed with reference to the dictionary patterns in the dictionary pattern storage memory 25. This can be done in the usual way, using the distance between the features of a dictionary pattern and the features of the character to be recognized for a predetermined set of character features. Recognition candidate character codes are extracted from the dictionary in order of increasing distance (nearest first), and each extracted candidate has its distance, that is, its certainty factor, attached to it (for example, a smaller distance is mapped to a higher certainty factor, expressed as a larger number).
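As a rough illustration of how step S4 might turn matching distances into ranked candidates, the following Python sketch sorts dictionary entries by distance and maps each distance to a certainty factor. The function name `rank_candidates`, the linear mapping `max_distance - d`, and the sample distances are all assumptions; the source only requires that a smaller distance correspond to a larger certainty value.

```python
from typing import Dict, List, Tuple


def rank_candidates(
    distances: Dict[str, float],
    top_n: int = 3,
    max_distance: float = 100.0,
) -> List[Tuple[str, float]]:
    """Return up to `top_n` (character, certainty) pairs, nearest first.

    `distances` maps each dictionary character to its feature distance from
    the pattern being recognized (hypothetical values in this sketch).
    """
    # Nearest dictionary entries come first.
    ranked = sorted(distances.items(), key=lambda item: item[1])[:top_n]
    # Hypothetical mapping: certainty falls linearly as distance grows.
    return [(ch, max(0.0, max_distance - d)) for ch, d in ranked]


candidates = rank_candidates({"宮": 20.0, "官": 45.0, "宜": 80.0, "空": 95.0})
```

Any monotonically decreasing mapping would serve the same purpose; the linear one is used here only because it is easy to read.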
[0019]
An example of a more detailed procedure for this step is shown in FIG. 15 and described later.
In step S5, the recognition candidate character codes and certainty factors obtained in step S4 are stored in the data memory 26. In addition, character patterns that are laid out close to one another in the image and have the same size are grouped into the same group.
[0020]
In step S6, the groups are compared between the plurality of input images to search for groups whose character codes match.
In step S7, for each set of groups found to match in step S6, the certainty factors of the recognition candidate character codes of each character pattern belonging to those groups are totaled. For example, the certainty factors are summed for each identical recognition candidate character code.
[0021]
In step S8, an optimum candidate having the highest certainty factor is selected based on the tabulation result in step S7, and determined as a recognition result of the corresponding character pattern. Then, the recognition result character is output to the display device 28.
In step S9, the determined recognition result is stored in the data memory 26.
As described above, character recognition is based on matching character groups across a plurality of images rather than on recognizing each image in isolation, which enables character recognition with higher accuracy.
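Steps S7 and S8 can be sketched in Python as follows: the certainty factors of each candidate are summed over the matched images, and the candidate with the highest total is chosen. The function names and the per-image scores are hypothetical; the scores were picked so that the totals equal the 250, 200, and 40 points that the description gives for FIG. 10(a), with the second and fifth images misrecognized on their own.

```python
from collections import defaultdict
from typing import Dict, List


def total_scores(per_image_scores: List[Dict[str, int]]) -> Dict[str, int]:
    """Step S7 (sketch): sum the certainty factor of each candidate code."""
    totals: Dict[str, int] = defaultdict(int)
    for scores in per_image_scores:
        for code, score in scores.items():
            totals[code] += score
    return dict(totals)


def decide_code(per_image_scores: List[Dict[str, int]]) -> str:
    """Step S8 (sketch): pick the candidate with the highest total."""
    totals = total_scores(per_image_scores)
    return max(totals, key=totals.get)


# Hypothetical per-image scores for one character pattern in five images.
scores_for_pattern = [
    {"宮": 60, "官": 30, "宜": 10},
    {"宮": 40, "官": 50, "宜": 10},  # "官" wins in this image alone
    {"宮": 60, "官": 30, "宜": 10},
    {"宮": 60, "官": 30, "宜": 10},
    {"宮": 30, "官": 60},           # "官" wins in this image alone
]

totals = total_scores(scores_for_pattern)
decided = decide_code(scores_for_pattern)
```

With these numbers the two images that would individually output “官” are overruled by the combined evidence, which is the correction effect the embodiment describes.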
(Second Embodiment)
The outline of the character recognition procedure of the first embodiment has been described above. Next, an implementation that shares the basic processing concept of the first embodiment but uses a different procedure is described in more detail.
[0022]
First, a second embodiment of the present invention will be described using the flowchart of FIG.
In step S31, image data for one image is input to the image memory. For example, a map image such as the one in FIG. 4(a) is input.
In step S32, character groups are extracted from the image data group by group, and the dictionary 25 is consulted to obtain recognition candidate character codes and their certainty factors. The result is stored in the data memory 26.
[0023]
Then, for the extracted and recognized character patterns, a data structure such as the one shown in FIG. 5 is generated. Character patterns are grouped by collecting those that are positioned close to one another and have the same size.
The data structure of FIG. 5 falls into three broad categories. The first category consists of the group tables, one per group, shown as 500 and 501. The second category is the character pattern information table (502), which is pointed to from the group tables (500, 501); for each character pattern in a group, it holds position and size information within the image (x and y coordinates, width, height, and so on) and an address pointer to the recognition candidate table (503, 504) that stores the character codes corresponding to that character pattern. The third category consists of the recognition candidate tables (503, 504), which are pointed to from the character pattern information table (502) and store, for each character pattern, the character code of each recognition candidate together with its score (that is, its certainty factor).
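The three table categories can be mirrored with a small Python sketch. The class names `CharPattern` and `CharGroup`, the method `top_reading`, and all coordinates and scores are hypothetical; Python object references stand in for the address pointers of FIG. 5, and the sample data follows the “宮沢町” example of FIG. 4(a).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CharPattern:
    """One entry of the character pattern information table (502)."""
    x: int
    y: int
    width: int
    height: int
    # Plays the role of the pointer to a recognition candidate table
    # (503, 504): a list of (character code, score) pairs.
    candidates: List[Tuple[str, int]] = field(default_factory=list)


@dataclass
class CharGroup:
    """One group table (500, 501): ordered references to its patterns."""
    patterns: List[CharPattern] = field(default_factory=list)

    def top_reading(self) -> str:
        """Concatenate the highest-scoring candidate of every pattern."""
        return "".join(
            max(p.candidates, key=lambda c: c[1])[0] for p in self.patterns
        )


# Hypothetical coordinates and scores for illustration only.
miya = CharPattern(10, 20, 16, 16, [("宮", 60), ("官", 30), ("宜", 10)])
sawa = CharPattern(28, 20, 16, 16, [("沢", 70), ("況", 30)])
machi = CharPattern(46, 20, 16, 16, [("町", 90)])
group_500 = CharGroup([miya, sawa, machi])
reading = group_500.top_reading()
```

Keeping the candidates attached to each pattern, rather than only the top result, is what later allows scores from several images to be combined.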
[0024]
FIG. 5 shows the data structure generated when the image of FIG. 4(a) is processed in this way; its contents are described below.
Referring to FIG. 5, the group table (500) relates to “宮沢町” (Miyazawa-cho) in FIG. 4(a), while the group table (501) relates to “東区” (Higashi-ku) in FIG. 4(a). The left column of each group table (500, 501) shows the address, within the table itself, at which an address pointer (right column) is stored; that pointer gives the location where the information on each character pattern is held.
[0025]
Based on the position and size of each character, three of these five characters, “宮”, “沢”, and “町”, are regarded as forming the word “宮沢町” and are managed as one character group in the group table (500). Similarly, the two characters “東” and “区” are managed in the group table (501) as one group forming the word “東区”.
[0026]
Starting at address 2000H of the character pattern information table (502), the x coordinate, y coordinate, width, and height of each of the characters “宮”, “沢”, “東”, “区”, and “町” are stored in that order, each together with an address pointer to the location where its recognition candidate information is stored. To keep the figure simple, the arrows showing where the address pointers lead are omitted for the “東”, “区”, and “町” patterns.
[0027]
Starting at address 3000H, the recognition candidate table (503) stores the character codes “宮”, “官”, and “宜”, with their scores, as the recognition candidates for the character pattern “宮”. Similarly, starting at address 3010H, the recognition candidate table (504) stores the character codes “沢” and “況”, with their scores, as the recognition candidates for the character pattern “沢”.
[0028]
Although not shown, candidate character codes and scores are stored in the same way for the characters “東”, “区”, and “町”.
Next, in step S33, it is checked whether the image processed in the preceding step is the first input image. If it is the first image, the process proceeds to step S35. In step S35, it is checked whether all images have been processed. If images remain, the process returns to step S31 to input the next image. An example of the second input image is shown in FIG. 4(b). Steps S32, S33, and S34 described above are then executed for this image.
[0029]
Then, when step S35 determines that all images have been processed, the process proceeds to step S36, where, for every character group in every image, the recognition candidate with the highest totaled certainty factor is selected, the character code is finalized, and the result is shown on the display device. Step S36 is described later with reference to FIG. 9.
If step S33 finds that the processed image is not the first image, the process proceeds to step S34, which determines whether the first image and the second image contain the same grouped word. The details of this step are described later with reference to FIG. 7; as a result, the character group recognized as “宮沢町” in the first image and the character group recognized as “宮沢町” in the second image are determined to be the same. In contrast, the character group recognized as “東区” in the first image and the character group recognized as “西区” (Nishi-ku) in the second image are determined not to be the same.
[0030]
Although an example with the two input images of FIGS. 4(a) and 4(b) has been shown, it goes without saying that more than two images may be used.
Next, the detailed procedure for extracting character groups from the image data in step S32 of FIG. 3 is described using the flowchart of FIG. 6.
In step S41, a region including characters is extracted from the image. For this purpose, a method of extraction based on the difference in brightness and color between the character and the background, a method of extraction based on the difference in the characteristics of frequency components, or the like may be used.
[0031]
In step S42, the characters are separated into individual single-character patterns. To do this, patterns formed by connected pixels of a specific color are tested to see whether their shape and size satisfy predetermined conditions.
In step S43, character recognition is performed on each separated pattern by searching the dictionary patterns stored in the dictionary pattern storage memory 25, yielding candidate character codes and scores (the likelihood of the recognition, that is, the certainty factor). The x coordinate, y coordinate, width, and height of each character are also obtained.
[0032]
In step S44, characters lying within a predetermined distance of each character's coordinates are merged and managed as a character group. For example, if another character lies within one character's width or height above, below, to the left of, or to the right of the character of interest, that character is merged into the same character group. In this way, the characters forming one word are gathered together regardless of whether the text is written vertically, horizontally, or diagonally.
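The proximity-based merging of step S44 can be sketched in Python as a search for transitively connected bounding boxes. The names `are_neighbors` and `group_characters`, the `Box` layout, and the use of the larger of the two widths or heights as the "one character" range are assumptions; the same-size condition mentioned elsewhere in the description is left out to keep the sketch short.

```python
from typing import List, Tuple

# Hypothetical layout for one character pattern: (x, y, width, height).
Box = Tuple[int, int, int, int]


def are_neighbors(a: Box, b: Box) -> bool:
    """True when the gap between two boxes is at most one character size."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Gap along each axis; 0 when the boxes overlap on that axis.
    gap_x = max(bx - (ax + aw), ax - (bx + bw), 0)
    gap_y = max(by - (ay + ah), ay - (by + bh), 0)
    return gap_x <= max(aw, bw) and gap_y <= max(ah, bh)


def group_characters(boxes: List[Box]) -> List[List[int]]:
    """Return lists of box indices that are linked by chains of neighbors."""
    unvisited = set(range(len(boxes)))
    groups: List[List[int]] = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        group, stack = [seed], [seed]
        while stack:
            current = stack.pop()
            near = [i for i in unvisited if are_neighbors(boxes[current], boxes[i])]
            for i in near:
                unvisited.remove(i)
                group.append(i)
                stack.append(i)
        groups.append(sorted(group))
    return groups


# Three characters in a row, then two more far away (made-up coordinates).
boxes = [
    (10, 20, 16, 16), (28, 20, 16, 16), (46, 20, 16, 16),
    (200, 100, 16, 16), (218, 100, 16, 16),
]
groups = group_characters(boxes)
```

Because the test uses gaps on both axes, the same code handles horizontal, vertical, and diagonal runs of characters without a separate case for each writing direction.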
[0033]
Next, the processing in step S34 of FIG. 3, that is, the operation of determining whether the same grouped word exists in a plurality of images, is described using the flowchart of FIG. 7.
In step S51, a character group obtained in step S32, including its recognition candidate character codes and their certainty factors, is read from the data memory.
[0034]
In step S52, in order to check whether the character groups of another image include a group that matches the character group fetched in step S51, the groups are compared character by character using the candidate character codes they contain. If the characters are not the same, the process proceeds to step S56, where the groups are determined to be different words, and then proceeds to step S55.
Conversely, if the characters are the same, the process proceeds to step S53.
[0035]
In step S53, it is checked whether or not the comparison has been completed for all characters in the character group. If all the characters in the character group match, the groups are determined to be the same character group in step S54, and the result is stored in the data memory 26.
In step S55, it is checked whether or not the comparison has been completed for all character group combinations. If not, the process returns to step S51 to repeat the same processing. When the comparison is completed for all combinations of all character groups, this process ends.
[0036]
Note that the determinations in steps S54 and S56 described above are realized by creating a group match determination table as shown in FIG. 8 in the data memory 26 and storing a match mark (a circle in FIG. 8) in it. As shown in FIG. 8, character group 1 of image 1 and character group 2 of image 2 can be regarded as the same word because character 1 and character 1, character 2 and character 2, and character 3 and character 3 each contain the same candidate. Character group 2 of image 1 and character group 1 of image 2 are regarded as different words because, although character 2 and character 2 contain the same candidate, character 1 and character 1 do not.
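The comparison of steps S51 to S56 and the table of FIG. 8 can be sketched as below. Each character is represented by the set of its candidate codes, and two groups match when every pair of corresponding characters shares at least one candidate. The function names and the placeholder candidate letters are hypothetical, and the check that both groups have the same number of characters is an assumption not stated in the text.

```python
def same_group(group_a, group_b):
    """Each group is a list of candidate-code sets, one set per character."""
    if len(group_a) != len(group_b):  # assumed precondition
        return False
    # Every corresponding character pair must share at least one candidate.
    return all(set(ca) & set(cb) for ca, cb in zip(group_a, group_b))

def match_table(groups_img1, groups_img2):
    """table[i][j] is True where group i of image 1 matches group j of image 2."""
    return [[same_group(a, b) for b in groups_img2] for a in groups_img1]

# Placeholder candidates, not the contents of FIG. 8.
image1 = [
    [{"A", "B"}, {"C"}, {"D", "E"}],   # character group 1
    [{"F"}, {"G", "H"}],               # character group 2
]
image2 = [
    [{"X"}, {"H", "I"}],               # character group 1
    [{"B"}, {"C", "K"}, {"E"}],        # character group 2
]
for row in match_table(image1, image2):
    print(row)
```

In this toy data, group 2 of image 1 and group 1 of image 2 share a candidate for their second character but not for their first, so they are treated as different words, mirroring the case described above.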
[0037]
Next, the character code determination process in step S36 of FIG. 3 will be described using the flowchart of FIG. 9.
First, in step S61, a set of character groups determined to be the same is fetched from the data memory 26.
In step S62, the scores of the corresponding candidate character codes in the character group captured in step S61 are totaled. As a totaling method, the scores of all the images for each candidate may be added, or the highest score in all the images for each candidate may be extracted and used as the score.
[0038]
In step S63, the candidate with the highest total score is selected, confirmed as the recognition result of the corresponding character, and displayed on the display device.
FIG. 10A is a diagram for explaining an example of the totaling process. It shows the scores of the candidates “miya”, “government”, and “good” for the character pattern “miya” in five input images, together with the totals obtained by the addition method. The total scores are 250 points, 200 points, and 40 points, respectively, so “miya” has the highest total score. Therefore, in all the images from 1 to 5, this character is determined to be “miya”. That is, the erroneous recognition in the second image and the fifth image is corrected.
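The two totaling methods of step S62 and the selection of step S63 can be sketched as follows. The per-image scores are made-up values, chosen only so that the sums equal the 250 / 200 / 40 totals quoted above and so that images 2 and 5 are misrecognized on their own; they are not the values of FIG. 10 (a). The function names are hypothetical.

```python
def total_scores(per_image_scores, method="sum"):
    """Combine the scores of each candidate over all images.

    method="sum": add the scores of all images for each candidate.
    method="max": keep the highest score over all images for each candidate.
    """
    totals = {}
    for scores in per_image_scores:
        for candidate, score in scores.items():
            if method == "sum":
                totals[candidate] = totals.get(candidate, 0) + score
            elif method == "max":
                totals[candidate] = max(totals.get(candidate, 0), score)
            else:
                raise ValueError(f"unknown method: {method}")
    return totals

def decide(per_image_scores, method="sum"):
    """Return the candidate with the highest combined score."""
    totals = total_scores(per_image_scores, method)
    return max(totals, key=totals.get)

# Hypothetical scores for one character pattern in five images.
scores = [
    {"miya": 60, "government": 30, "good": 10},
    {"miya": 40, "government": 50, "good": 10},  # wrong on its own
    {"miya": 60, "government": 35, "good": 10},
    {"miya": 60, "government": 35, "good": 5},
    {"miya": 30, "government": 50, "good": 5},   # wrong on its own
]
print(total_scores(scores))         # {'miya': 250, 'government': 200, 'good': 40}
print(decide(scores))               # miya
print(decide(scores, method="max")) # miya
```

With these values both methods agree. In general the maximum method lets a single high-scoring image dominate, while the sum rewards candidates that score consistently across images.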
[0039]
In this embodiment, a common recognition result is output for words that are commonly included in a plurality of images. Therefore, even if a character in one of the plurality of images is erroneously recognized, it can be automatically corrected by the recognition results of the other images, and as a result the accuracy of recognition can be improved.
(Third embodiment)
In the third embodiment of the present invention, a method for recognizing a telop in a moving image as shown in FIG. 11 will be described.
[0040]
The telop may not be recognized correctly due to the influence of the background image. For example, as shown for image 2 in FIG. 10B, the score of the correct character “miya” may be 0, so that it is dropped from the character candidates.
Therefore, by adding the condition that the telop does not move even when the background image changes from FIG. 11 (a) to FIG. 11 (b), character groups that have the same number of characters and in which the coordinates and size of each character are the same can be regarded as the same word.
[0041]
The operation for determining whether or not the same word exists will be described using the flowchart of FIG. 12.
In step S121, character groups are input one by one from the two images.
In step S122, it is determined whether the numbers of characters in the character groups are equal. If not, the process proceeds to step S128, where the groups are determined to be different words.
[0042]
On the other hand, if the numbers of characters are equal, the process proceeds to step S123.
In step S123, it is determined whether or not the character coordinates (both x coordinate and y coordinate) are equal. If they are not equal, the process proceeds to step S128, where the groups are determined to be different words. If they are equal, the process proceeds to step S124.
[0043]
In step S124, it is determined whether the character sizes (both width and height) are equal. If not, the process proceeds to step S128, where the groups are determined to be different words. If they are equal, the process proceeds to step S125.
In step S125, it is checked whether the processing has been completed for all characters in the character group. If not, the process returns to step S123 and the same processing is repeated. If it has been completed, the process proceeds to step S126, where, since every character in the group has been paired with a character of the same coordinates and size, the two groups are determined to be the same character group.
[0044]
In step S127, it is checked whether or not the above-described processing has been completed for all character group combinations. If not completed, the process returns to step S121 and is repeated until it is completed.
By adding the above processing to the determination of the same character group, “miya” obtains the highest total score even in a case such as that shown in FIG. 10 (b), and this character can be confirmed as “miya” in all the images from 1 to 5.
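The determination of steps S122 to S126 can be sketched as follows. The names `Box` and `same_telop_group` are hypothetical. The text tests for equality; the `tol` parameter (0 by default, which gives exact equality) is an added assumption to allow for small pixel jitter, and the characters of both groups are assumed to be listed in the same order.

```python
from typing import NamedTuple

class Box(NamedTuple):
    x: int  # x coordinate
    y: int  # y coordinate
    w: int  # width
    h: int  # height

def same_telop_group(group_a, group_b, tol=0):
    """True if both groups have the same character count, coordinates, and sizes."""
    if len(group_a) != len(group_b):                              # S122
        return False
    for a, b in zip(group_a, group_b):
        if abs(a.x - b.x) > tol or abs(a.y - b.y) > tol:          # S123
            return False
        if abs(a.w - b.w) > tol or abs(a.h - b.h) > tol:          # S124
            return False
    return True                                                   # S126

frame1 = [Box(10, 200, 16, 16), Box(28, 200, 16, 16), Box(46, 200, 16, 16)]
frame2 = [Box(10, 200, 16, 16), Box(28, 200, 16, 16), Box(46, 200, 16, 16)]
moved = [Box(10, 180, 16, 16), Box(28, 180, 16, 16), Box(46, 180, 16, 16)]
print(same_telop_group(frame1, frame2))  # True
print(same_telop_group(frame1, moved))   # False
```

Because this test ignores the candidate codes entirely, two groups can be matched even when the correct character has dropped out of the candidates in one frame, which is the situation the third embodiment targets.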
[0045]
In the present embodiment, by further adding the condition that the characters are stationary, as in a telop, even if the candidates for one of the plurality of images are erroneously recognized, the error can be automatically corrected by the recognition results of the other images, and the recognition accuracy can be improved.
In the above description, the character code for a character pattern is determined based on the matching of character groups between a plurality of images. Needless to say, however, the character code may also be determined based on the matching of character groups within the same image rather than between a plurality of images.
[0046]
Next, FIG. 14 shows an example of the layout, in the program storage memory 24, of the programs corresponding to the processes of the above-described flowcharts. Needless to say, these programs may be stored in a portable medium such as a floppy disk, loaded into the memory 202 at the time of execution, and executed by the CPU 23.
[0047]
140 stores a program for performing the image input / storage processing of steps S1-S2.
142 stores a program for performing the character pattern extraction process of step S3.
143 stores a character recognition program that extracts a candidate character pattern corresponding to the character pattern extracted in step S3 from the dictionary 25 and performs a process of obtaining a certainty factor.
[0048]
144 stores a program for grouping the character patterns extracted in step S3.
145 stores a program for performing the score (confidence) totaling process of step S7.
[0049]
146 stores a program for performing the character code determination process of step S8.
147 stores a program for displaying the result on the display device after the character code determination process of step S8 has been performed.
Note that the layout order in FIG. 14 has no special meaning.
[0050]
Next, an example of detailed processing of the character recognition processing in step S4 of FIG. 2 will be described with reference to FIG. 15.
First, in step S151 of FIG. 15, each character pattern cut out in step S3 is input.
In step S152, the width and height of the input character pattern are enlarged or reduced, respectively, and normalized to a predetermined size.
[0051]
In step S153, the normalized pattern is binarized with a predetermined threshold.
In step S154, the binarized pattern is thinned by a conventionally known thinning method.
In step S155, the thinned pattern or the predetermined feature amount is matched with the character recognition dictionary 5.
[0052]
In step S156, as a matching result, a character candidate having a short matching distance, that is, a high certainty factor is selected.
The above is an example of the detailed processing in step S4.
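Steps S152, S153, S155, and S156 can be sketched in a few lines. Several points are assumptions made for illustration only: the nearest-neighbor resizing, the threshold of 128, the use of the Hamming distance as the matching distance, and the conversion of distance into a certainty factor between 0 and 1. The thinning of step S154 is omitted for brevity, and the tiny two-entry dictionary is a toy stand-in for the character recognition dictionary.

```python
def normalize(pattern, size=8):
    """S152: nearest-neighbor resize of a 2-D grayscale list to size x size."""
    h, w = len(pattern), len(pattern[0])
    return [[pattern[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

def binarize(pattern, threshold=128):
    """S153: 1 for dark (ink) pixels, 0 for background."""
    return [[1 if v < threshold else 0 for v in row] for row in pattern]

def distance(a, b):
    """Hamming distance between two binary patterns of equal size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def recognize(pattern, dictionary, size=8, top_n=3):
    """S155-S156: return up to top_n (code, certainty) pairs, best first."""
    feature = binarize(normalize(pattern, size))
    scored = []
    for code, template in dictionary.items():
        d = distance(feature, template)
        certainty = 1.0 - d / (size * size)  # short distance -> high certainty
        scored.append((code, certainty))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]

# Toy 2x2 dictionary and a 4x4 input whose left half is dark.
toy_dictionary = {"I": [[1, 0], [1, 0]], "-": [[1, 1], [0, 0]]}
pattern = [[0, 0, 255, 255] for _ in range(4)]
print(recognize(pattern, toy_dictionary, size=2))  # [('I', 1.0), ('-', 0.5)]
```

Returning several ranked candidates rather than a single best match is what allows the later steps to total certainty factors across images.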
The present invention may be applied to a system constituted by a plurality of devices such as a host computer, an interface, and a display device, or to an apparatus consisting of a single device. Needless to say, the present invention can also be applied to a case where it is achieved by supplying a program to a system or apparatus. In this case, the system or apparatus can obtain the effects of the present invention by reading, from a storage medium, the program represented by the software for achieving the present invention.
[0053]
As described above, by providing means for extracting character patterns from an image, means for recognizing the extracted character patterns and outputting candidate character codes and the score of each candidate, means for associating the character patterns in units of groups between a plurality of images, means for totaling the scores of each character code candidate for the associated groups of character patterns, and means for correcting the character code candidates and the score of each candidate based on the totaling results, character recognition with higher accuracy can be performed.
[0054]
[Effect of the invention]
As described above, according to the present invention, character recognition can be performed with higher certainty.
[Brief description of the drawings]
FIG. 1 is a block diagram of a character recognition device of the present invention.
FIG. 2 is a flowchart showing a character recognition processing procedure of the present invention.
FIG. 3 is a flowchart showing a character recognition processing procedure according to the second embodiment of the present invention;
FIG. 4 is a diagram illustrating an example of an input image.
FIG. 5 is a diagram showing an example of data obtained as a result of extracting a character group from image data.
FIG. 6 is a flowchart showing an operation of extracting a character group from image data.
FIG. 7 is a flowchart showing an operation for determining the same group word.
FIG. 8 is a diagram of a group match determination table showing an example of character correspondence between a plurality of images.
FIG. 9 is a flowchart showing a processing procedure for determining a character code from among the recognition candidates.
FIG. 10 is a diagram illustrating an example of scores for each candidate for a plurality of input images.
FIG. 11 is a diagram illustrating an example of an input image.
FIG. 12 is a flowchart showing a processing procedure for determining the same word.
FIG. 13 is a flowchart of a conventional character recognition device.
FIG. 14 is a diagram showing a layout of each program of character recognition processing according to the present embodiment.
FIG. 15 is a flowchart showing a detailed processing example of character recognition processing in step S4 of FIG. 2;
[Explanation of symbols]
21 Image input device
22 Image memory
23 CPU
24 Program memory
25 Dictionary pattern memory

Claims

A character recognition method comprising:
a character candidate acquisition step of extracting a plurality of character patterns from each of a plurality of different images and performing character recognition processing, thereby acquiring a plurality of candidate character codes corresponding to each of the extracted character patterns, together with their certainty factors;
a grouping step of generating a plurality of character groups in each of the plurality of different images by grouping the character patterns based on the positions of the extracted character patterns;
a character group determination step of determining identical character groups between the plurality of different images based on the plurality of candidate character codes corresponding to each character pattern included in each of the generated character groups; and
a character code determination step of totaling the certainty factors of the candidate character codes corresponding to the character patterns included in each of the character groups determined to be identical, and determining, based on the totaled certainty factors, the character codes corresponding to the character patterns included in the identical character groups.

The character recognition method according to claim 1, wherein, when the plurality of different images are a first image and a second image,
in the character group determination step, the candidate character codes of each character pattern included in a first character group generated in the first image are compared with the candidate character codes of each character pattern included in a second character group generated in the second image, and, when the same candidate character code is present among the candidate character codes of every pair of corresponding character patterns, the first character group and the second character group are determined to be the same character group.

The character recognition method according to claim 1, wherein, when the condition is added that the plurality of different images are images of different scenes in a moving image and that the extracted character patterns are a non-moving telop in the moving image,
in the character group determination step, identical character groups are determined between the plurality of different images based on the number of characters, the coordinates, and the size of the character patterns included in the plurality of generated character groups.

The character recognition method according to claim 1, wherein, in the character candidate acquisition step, the candidate character codes and their certainty factors are acquired based on matching between features obtained by performing predetermined feature extraction on the extracted character patterns and features corresponding to each character code stored in a predetermined dictionary.

The character recognition method according to claim 1, wherein, in the grouping step, a plurality of character groups are generated in each of the plurality of different images by grouping extracted character patterns that are close to one another into the same group.

The character recognition method according to claim 5, wherein, in the grouping step, a plurality of character groups are generated in each of the plurality of different images by grouping extracted character patterns that are close to one another and, in addition, approximately the same in size into the same group.

The character recognition method according to claim 1, wherein, in the character code determination step, the sum of the certainty factors obtained for each candidate character code corresponding to the character patterns included in the identical character groups is used as the totaled certainty factor of that candidate character code.

The character recognition method according to claim 1, wherein, in the character code determination step, the maximum of the certainty factors obtained for each candidate character code corresponding to the character patterns included in the identical character groups is used as the totaled certainty factor of that candidate character code.

The character recognition method according to claim 1, wherein the plurality of different images are images of a plurality of different scenes in a moving image.

A character recognition device comprising:
character candidate acquisition means for extracting a plurality of character patterns from each of a plurality of different images and performing character recognition processing, thereby acquiring a plurality of candidate character codes corresponding to each of the extracted character patterns, together with their certainty factors;
grouping means for generating a plurality of character groups in each of the plurality of different images by grouping the character patterns based on the positions of the extracted character patterns;
character group determination means for determining identical character groups between the plurality of different images based on the plurality of candidate character codes corresponding to each character pattern included in each of the plurality of generated character groups; and
character code determination means for totaling the certainty factors of the candidate character codes corresponding to the character patterns included in each of the character groups determined to be identical, and determining, based on the totaled certainty factors, the character codes corresponding to the character patterns included in the identical character groups.

An apparatus that determines a character code corresponding to a character pattern in an image by reading a computer-executable program stored in a memory and executing it on a CPU,
wherein the program includes program code for causing a computer to execute:
a character candidate acquisition step of extracting a plurality of character patterns from each of a plurality of different images and performing character recognition processing, thereby acquiring a plurality of candidate character codes corresponding to each of the extracted character patterns, together with their certainty factors;
a grouping step of generating a plurality of character groups in each of the plurality of different images by grouping the character patterns based on the positions of the extracted character patterns;
a character group determination step of determining identical character groups between the plurality of different images based on the plurality of candidate character codes corresponding to each character pattern included in each of the plurality of generated character groups; and
a character code determination step of totaling the certainty factors of the candidate character codes corresponding to the character patterns included in each of the character groups determined to be identical, and determining, based on the totaled certainty factors, the character codes corresponding to the character patterns included in the identical character groups.