JP2019133289A

JP2019133289A - Image processing program, image processing method and image processing apparatus

Info

Publication number: JP2019133289A
Application number: JP2018013219A
Authority: JP
Inventors: 雄三荻野; Yuzo Ogino; 和実北原; Kazumi Kitahara
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2018-01-30
Filing date: 2018-01-30
Publication date: 2019-08-08
Anticipated expiration: 2038-01-30
Also published as: JP6984447B2

Abstract

To improve accuracy in character recognition. SOLUTION: An image processing apparatus 101 takes a character of an answer to a question that OCR has recognized, from image data of an answer sheet, as a first character, and compares the matching score of that character with the first character against a first threshold for the matching score. The first threshold is stored in a storage unit 110 and corresponds to at least one of a school year, a subject, and a question pattern associated with the answer sheet and/or the question. If the matching score is larger than the first threshold, the image processing apparatus 101 determines whether the character of the answer coincides with the character of the correct answer to the question. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing program, an image processing method, and an image processing apparatus.

Conventionally, there is a technique called OCR (Optical Character Reader) that optically reads handwritten characters. There are also systems that use OCR to recognize the characters on handwritten test answer sheets and score the tests automatically.

As prior art, for example, there is a technique in which, when a character recognition unit uses a kanji dictionary to recognize handwriting that a user has entered with a pen input unit, the range of kanji in the dictionary that are treated as recognition candidates is limited according to the user's proficiency with kanji.

Japanese Patent Laid-Open No. 10-134146

However, with the conventional technology it is difficult to ensure OCR character recognition accuracy for the answers written on an answer sheet. For example, if character recognition accuracy cannot be ensured when a test is scored automatically, scoring errors will result.

In one aspect, an object of the present invention is to improve character recognition accuracy.

In one embodiment, an image processing program is provided that takes a character of an answer to a question that OCR has recognized as a first character from the image data of an answer sheet, and compares the matching score of that character with the first character against a first threshold for the matching score, the first threshold being stored in a storage unit and corresponding to at least one of a grade, a subject, and a question type associated with the answer sheet and/or the question. When the matching score of the character of the answer with the first character is greater than the first threshold, the program determines whether the character of the answer matches the character of the correct answer to the question.

According to one aspect of the present invention, character recognition accuracy can be improved.

FIG. 1 is an explanatory diagram showing an example of the image processing method according to the embodiment.
FIG. 2 is an explanatory diagram showing a system configuration example of the image processing system 200.
FIG. 3 is a block diagram showing a hardware configuration example of the image processing apparatus 101.
FIG. 4 is an explanatory diagram showing a specific example of an answer image.
FIG. 5 is an explanatory diagram showing a specific example of a small piece image.
FIG. 6 is an explanatory diagram showing an example of the contents stored in the answer image DB 220.
FIG. 7 is an explanatory diagram showing an example of the contents stored in the small piece image DB 230.
FIG. 8 is an explanatory diagram showing an example of the contents stored in the correct answer table 240.
FIG. 9 is an explanatory diagram showing an example of the contents stored in the threshold table 250.
FIG. 10 is a block diagram showing a functional configuration example of the image processing apparatus 101.
FIG. 11 is an explanatory diagram showing a screen example of a character check screen.
FIG. 12 is an explanatory diagram showing a screen example of an answer check screen.
FIG. 13 is an explanatory diagram showing an example of the contents stored in the distance value / adjacent difference table 1300.
FIG. 14 is a flowchart (part 1) showing an example of an image processing procedure of the image processing apparatus 101.
FIG. 15 is a flowchart (part 2) showing an example of an image processing procedure of the image processing apparatus 101.

Hereinafter, embodiments of an image processing program, an image processing method, and an image processing apparatus according to the present invention will be described in detail with reference to the drawings.

(Embodiment)
FIG. 1 is an explanatory diagram showing an example of the image processing method according to the embodiment. In FIG. 1, the image processing apparatus 101 is a computer that recognizes characters from the image data of an answer sheet. An answer sheet is a sheet of paper on which answers to questions have been written. The image data of an answer sheet is image data obtained by scanning the answer sheet.

Here, an answer sheet may be scored automatically by applying OCR to its image data and comparing the recognized characters of the answer to each question with the characters of the correct answer to that question. However, the ability to write characters not only varies from individual to individual but also tends to vary with grade. For example, in elementary school, more children write neatly as the grade rises, and fewer children in the lower grades write legibly.

For example, the character 120 is the character "ア" written by a sixth grader, and the character 130 is the character "ア" written by a first grader. Comparing the two, the character 120 is written more neatly and is more likely to be recognized correctly by OCR.

In character recognition using OCR, for example, a character is recognized by computing its matching score against each of a plurality of pre-registered characters and selecting the character with the highest matching score. However, if the character with the highest matching score is simply selected, input that should be treated as a recognition error may still be recognized as some character.

For this reason, a scheme is sometimes used in which a threshold is set on the matching score with the pre-registered characters, and a character is recognized only if its matching score is both the highest and greater than the threshold. This scheme prevents input that should be treated as a recognition error from being recognized as some character.
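The thresholded scheme described above can be sketched as follows. This is a minimal illustration rather than the implementation in the patent: the function name `recognize`, the dictionary of scores, and the example values are assumptions introduced here.

```python
from typing import Dict, Optional


def recognize(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the registered character with the highest matching score,
    or None (a recognition error) if that score does not exceed the threshold.

    `scores` maps each pre-registered character to its matching score with
    the handwritten character (hypothetical structure, for illustration only).
    """
    if not scores:
        return None
    best_char = max(scores, key=lambda ch: scores[ch])
    # Accept the candidate only if its score is strictly greater than the
    # threshold; otherwise treat the input as a recognition error.
    return best_char if scores[best_char] > threshold else None
```

With a threshold of 70, hypothetical scores of `{"ア": 80, "マ": 55}` yield "ア", while `{"ア": 40, "マ": 35}` yield `None` instead of a forced guess.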

However, as described above, the ability to write characters tends to vary with grade. If a single threshold is applied uniformly from the first grade to the sixth grade, character recognition accuracy may not be ensured. For example, if the threshold is set strictly to suit sixth graders, a character such as the character 130 written by a first grader will not be recognized correctly.

Yet the child who wrote the character 130 did write the character "ア", so it should be recognized as "ア" and scored accordingly. On the other hand, if the threshold is set loosely to suit first graders, then when characters written by children in higher grades are recognized, input that should be treated as a recognition error may more often be recognized as some character.

This problem can arise not only from differences in grade but also from differences in subject and in question type. The question type indicates the answer format for a question. Question types include, for example, single answer, multiple answers, and written answer. A single answer question requires one answer to be selected from the choices. A multiple answers question requires two or more answers to be selected from the choices.

First, in some subjects, answers to questions often consist of numbers or symbols (○, ×, and so on). Numbers and symbols tend to be easier to recognize than hiragana and katakana, so the threshold can be set more strictly for them. In addition, when the question type is "multiple answers", the characters may be written too close together or may overlap, so it is desirable to set the threshold more loosely than for "single answer".

Therefore, this embodiment describes an image processing method that improves character recognition accuracy by verifying the plausibility of a character recognized by OCR using a matching score threshold that corresponds to at least one of the grade, the subject, and the question type. A processing example of the image processing apparatus 101 is described below.

(1) The image processing apparatus 101 acquires the matching score, with a first character, of a character of an answer to a question that OCR has recognized as the first character from the image data of the answer sheet. Here, the first character is the character (recognized character) that OCR has selected, from among a plurality of pre-registered characters, as the character of the answer to the question.

The plurality of pre-registered characters are the characters registered as targets of OCR recognition. Character types include, for example, hiragana (あ, い, う, ...), katakana (ア, イ, ウ, ...), numbers (1, 2, 3, ...), the alphabet (A, B, C, ...), symbols (○, ×, ...), and kanji.

The matching score with the first character is a value indicating how closely the character recognized by OCR, that is, the handwritten character, matches the first character. Here, a larger matching score indicates a closer match with the first character. However, the matching score may instead be expressed as a distance representing the difference from the first character. In that case, a smaller distance from the first character indicates a closer match.
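The two conventions differ only in the direction of the comparison, which the following minimal sketch makes explicit. The function name `is_confident` and the `metric` parameter are hypothetical. The passage above does not state how a threshold is applied to a distance, so treating "smaller than the threshold" as acceptable is an assumption made here.

```python
def is_confident(value: float, threshold: float, metric: str = "score") -> bool:
    """Check a recognition result against a threshold under either convention.

    metric="score":    larger values mean a closer match, so accept when
                       the value is greater than the threshold.
    metric="distance": smaller values mean a closer match, so accept when
                       the value is less than the threshold (assumption).
    """
    if metric == "score":
        return value > threshold
    if metric == "distance":
        return value < threshold
    raise ValueError(f"unknown metric: {metric!r}")
```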

In the example of FIG. 1, the character 120 is assumed to have been written on an answer sheet as the answer to a question set for sixth graders, and the character 130 is assumed to have been written on an answer sheet as the answer to a question set for first graders. It is also assumed that OCR has recognized each of the characters 120 and 130 as the first character "ア" from the image data of the respective answer sheet. The matching score of the character 120 with the first character "ア" is "80", and the matching score of the character 130 with the first character "ア" is "40".

(2) The image processing apparatus 101 compares the acquired matching score with a first threshold for the matching score that is stored in the storage unit 110 and corresponds to at least one of the grade, the subject, and the question type associated with the answer sheet and/or the question. The storage unit 110 stores matching score thresholds corresponding to at least one of the grade, the subject, and the question type.

The storage unit 110 may be included in the image processing apparatus 101 or in another computer that the image processing apparatus 101 can access. In the latter case, the image processing apparatus 101 accesses the other computer and refers to the contents stored in the storage unit 110.

In the example of FIG. 1, the storage unit 110 is assumed to store a matching score threshold for each grade (first grade to sixth grade of elementary school). The first threshold corresponding to the sixth grade is "70", and the first threshold corresponding to the first grade is "30".

In this case, the image processing apparatus 101 compares the matching score "80" of the character 120 with the first character "ア" against the first threshold "70" corresponding to the sixth grade. The image processing apparatus 101 also compares the matching score "40" of the character 130 with the first character "ア" against the first threshold "30" corresponding to the first grade. Note that the grade is associated with the answer sheet and/or the question.

(3) When the matching score of the character of the answer with the first character is greater than the first threshold, the image processing apparatus 101 determines whether the character of the answer matches the character of the correct answer to the question. If the matching score is greater than the first threshold, it is highly likely that OCR has recognized the character correctly.

In other words, the image processing apparatus 101 performs the match determination between the character of the answer and the character of the correct answer only when OCR can be said to have recognized the character of the answer correctly. The character of the correct answer is the character indicating the answer to the question. When OCR cannot be said to have recognized the character of the answer correctly, the image processing apparatus 101 does not perform the match determination.

In the example of FIG. 1, the matching score "80" of the character 120 with the first character "ア" is greater than the first threshold "70" corresponding to the sixth grade. The image processing apparatus 101 therefore determines whether the character of the answer to the question (the character 120) matches the character of the correct answer to the question.

Likewise, the matching score "40" of the character 130 with the first character "ア" is greater than the first threshold "30" corresponding to the first grade. The image processing apparatus 101 therefore determines whether the character of the answer to the question (the character 130) matches the character of the correct answer to the question. Information indicating the character of the correct answer is stored, for example, in the image processing apparatus 101.
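Steps (1) to (3) can be sketched as a single function, assuming the per-grade thresholds of the FIG. 1 example. The names `FIRST_THRESHOLD_BY_GRADE` and `judge_answer` are hypothetical, and the correct answer "ア" in the usage note is assumed for illustration, since the example above does not state it.

```python
from typing import Optional

# First thresholds from the FIG. 1 example; only grades 1 and 6 are given there.
FIRST_THRESHOLD_BY_GRADE = {1: 30, 6: 70}


def judge_answer(recognized: str, score: float, grade: int, correct: str) -> Optional[bool]:
    """Return True or False for a match or mismatch with the correct answer,
    or None when the match determination is not performed."""
    # Step (2): look up the first threshold for the grade and compare.
    threshold = FIRST_THRESHOLD_BY_GRADE[grade]
    if score > threshold:
        # Step (3): OCR is considered reliable, so compare with the correct answer.
        return recognized == correct
    # OCR is not considered reliable, so no match determination is made.
    return None
```

Under these assumptions, `judge_answer("ア", 80, 6, "ア")` and `judge_answer("ア", 40, 1, "ア")` both perform the comparison, whereas a hypothetical score of 25 for a first grader would return `None`.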

In this way, the image processing apparatus 101 can improve character recognition accuracy by verifying the plausibility of a character recognized by OCR using a first threshold (decision criterion) that corresponds to at least one of the grade, the subject, and the question type.

For example, using a threshold that corresponds to the grade makes it possible to verify the plausibility of a character recognized by OCR while taking into account that writing ability differs by grade. Using a threshold that corresponds to the subject makes it possible to take into account that the types of characters written as answers differ by subject. Using a threshold that corresponds to the question type makes it possible to take into account that the ease of OCR recognition differs by question type.

In the example of FIG. 1, if the same first threshold "70" as for sixth graders were also applied to first graders, the matching score "40" of the character 130 with the first character "ア" would be at or below the first threshold "70", and the character 130 would not be recognized. In contrast, by using a first threshold suited to the characteristics of the grade, that is, the first threshold "30" corresponding to the first grade, the character 130 is correctly recognized as "ア".
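The arithmetic of this comparison can be checked with a few lines. This is only an illustration of the numbers in the FIG. 1 example; the variable and function names are hypothetical.

```python
UNIFORM_THRESHOLD = 70              # sixth-grade threshold applied to every grade
GRADE_THRESHOLDS = {1: 30, 6: 70}   # per-grade first thresholds from FIG. 1


def is_recognized(score: float, threshold: float) -> bool:
    """A character is accepted only when its score exceeds the threshold."""
    return score > threshold


score_130 = 40  # matching score of the character 130 with "ア"

uniform_result = is_recognized(score_130, UNIFORM_THRESHOLD)      # 40 > 70
per_grade_result = is_recognized(score_130, GRADE_THRESHOLDS[1])  # 40 > 30

print(uniform_result)    # False: rejected under the uniform threshold
print(per_grade_result)  # True: accepted under the first-grade threshold
```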

(System configuration example of the image processing system 200)
Next, a system configuration example of the image processing system 200 according to the embodiment will be described. Here, the description takes as an example a case where the image processing system 200 is applied to a system that performs character recognition on the image data of answer sheets from tests held at sites across the country.

FIG. 2 is an explanatory diagram showing a system configuration example of the image processing system 200. In FIG. 2, the image processing system 200 includes the image processing apparatus 101 and a plurality of site terminals 201. In the image processing system 200, the image processing apparatus 101 and the site terminals 201 are connected via a wired or wireless network 210. The network 210 is, for example, a LAN (Local Area Network), a WAN (Wide Area Network), or the Internet.

The image processing apparatus 101 has an answer image DB (database) 220, a small piece image DB 230, a correct answer table 240, and a threshold table 250. The contents stored in these DBs and tables 220, 230, 240, and 250 will be described later with reference to FIGS. 6 to 9. The image processing apparatus 101 is, for example, a server.

The site terminal 201 is a computer installed at each site where the test is held, and has a scanner 202. The scanner 202 is a device that optically reads images. An image read by the scanner 202 is stored in the site terminal 201 as image data. The site terminal 201 is, for example, a PC (Personal Computer).

In the image processing system 200, the scanner 202 is used to read test answer sheets. The image data of an answer sheet read by the scanner 202 is imported into the site terminal 201 and then transmitted from the site terminal 201 to the image processing apparatus 101. On receiving the image data of an answer sheet from the site terminal 201, the image processing apparatus 101 stores it in the answer image DB 220. A specific example of the image data of an answer sheet will be described later with reference to FIG. 4.

The image processing apparatus 101 also divides the image data of an answer sheet into small piece images, one per question. A small piece image is image data that contains the answer field of one question. The resulting small piece images are stored in the small piece image DB 230. A specific example of a small piece image will be described later with reference to FIG. 5.
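The division into small piece images can be sketched as cropping one rectangle per question. The passage above does not say how the answer fields are located, so this sketch assumes that the coordinates of each field are known in advance; the image representation (a list of pixel rows) and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]             # rows of pixel values
Region = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def split_into_pieces(answer_image: Image, regions: Dict[str, Region]) -> Dict[str, Image]:
    """Crop one small piece image per question from an answer image.

    `regions` maps a question ID to the rectangle containing that question's
    answer field (assumed to be predefined for the test form).
    """
    pieces: Dict[str, Image] = {}
    for question_id, (top, left, bottom, right) in regions.items():
        # Keep rows top..bottom-1 and, within each row, columns left..right-1.
        pieces[question_id] = [row[left:right] for row in answer_image[top:bottom]]
    return pieces
```

In practice an imaging library would do the cropping; plain lists are used here only to keep the sketch self-contained.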

Note that the image processing apparatus 101 may be realized by a single computer or by a plurality of computers.

(Hardware configuration example of the image processing apparatus 101)
FIG. 3 is a block diagram showing a hardware configuration example of the image processing apparatus 101. In FIG. 3, the image processing apparatus 101 includes a CPU (Central Processing Unit) 301, a memory 302, an I/F (Interface) 303, a disk drive 304, and a disk 305. These components are connected to one another by a bus 300.

Here, the CPU 301 controls the image processing apparatus 101 as a whole. The memory 302 includes, for example, a ROM (Read Only Memory), a RAM (Random Access Memory), and a flash ROM. Specifically, for example, the flash ROM stores the OS (Operating System) program, the ROM stores application programs, and the RAM is used as a work area for the CPU 301. A program stored in the memory 302 is loaded into the CPU 301 and causes the CPU 301 to execute the processing coded in it.

The I/F 303 is connected to the network 210 through a communication line and, via the network 210, to external computers (for example, the site terminals 201 shown in FIG. 2). The I/F 303 serves as the interface between the network 210 and the inside of the apparatus, and controls the input and output of data to and from external computers. For example, a modem or a LAN adapter can be used as the I/F 303.

The disk drive 304 controls the reading and writing of data on the disk 305 under the control of the CPU 301. The disk 305 stores data written under the control of the disk drive 304. Examples of the disk 305 include a magnetic disk and an optical disk.

In addition to the components described above, the image processing apparatus 101 may include, for example, an SSD (Solid State Drive), an input device, and a display. The site terminal 201 shown in FIG. 2 can be realized with the same hardware configuration as the image processing apparatus 101, except that it additionally includes the scanner 202 (see FIG. 2), an input device, a display, and the like.

(Specific example of an answer image)
Next, a specific example of an answer image will be described.

FIG. 4 is an explanatory diagram showing a specific example of an answer image. In FIG. 4, the answer image P1 is the image data, read by the scanner 202 (see FIG. 2), of the answer sheet (answer ID: A1) of a certain child (student ID: S1). The answer sheet (answer ID: A1) is the answer sheet of an arithmetic test for first graders.

When the test is held at each site, answer sheet image data such as the answer image P1 is read by the scanner 202 for every child who took the test and is transferred from the site terminal 201 to the image processing apparatus 101. As a result, the answer sheet image data of all children who took the test is stored in the image processing apparatus 101.

(Specific example of a small piece image)
Next, a specific example of a small piece image divided from an answer image will be described.

FIG. 5 is an explanatory diagram showing a specific example of a small piece image. In FIG. 5, the small piece image p1 is one of the small piece images divided from the answer image P1 shown in FIG. 4. The image processing apparatus 101 divides the answer image P1 question by question, so that as many pieces of image data like the small piece image p1 are generated from the answer image P1 as there are questions.

(Contents stored in the DBs and tables 220, 230, 240, and 250)
Next, the contents stored in the DBs and tables 220, 230, 240, and 250 of the image processing apparatus 101 will be described with reference to FIGS. 6 to 9. The DBs and tables 220, 230, 240, and 250 are realized, for example, by storage devices such as the memory 302 and the disk 305 shown in FIG. 3.

FIG. 6 is an explanatory diagram showing an example of the contents stored in the answer image DB 220. In FIG. 6, the answer image DB 220 has fields for answer ID, grade, subject, child ID, and answer image. Setting information in each field stores answer image information (for example, answer image information 600-1 to 600-3) as a record.

Here, the answer ID is an identifier that uniquely identifies a test answer sheet. The grade is the grade for which the test is intended. For example, grade "1" indicates the first grade of elementary school, and grade "6" indicates the sixth grade. The subject is the subject of the test. Subjects include Japanese, social studies, arithmetic, and science. The child ID is an identifier that uniquely identifies a child who took the test. The answer image is the image data of the answer sheet submitted by the child who took the test.

For example, the answer image information 600-1 indicates the answer image P1 of the answer sheet A1 for the arithmetic test submitted by the first-grade child S1.

FIG. 7 is an explanatory diagram showing an example of the contents stored in the small piece image DB 230. In FIG. 7, the small piece image DB 230 has fields for answer ID, child ID, question ID, and small piece image. Setting information in each field stores small piece image information (for example, small piece image information 700-1 to 700-3) as a record.

Here, the answer ID is an identifier that uniquely identifies a test answer sheet. The child ID is an identifier that uniquely identifies the child who took the test. The question ID is an identifier that uniquely identifies a test question. The small piece image is image data for a single question, cut out of the answer image.

For example, the small piece image information 700-1 indicates the small piece image p1 of the question Q1 included in the answer sheet A1 submitted by the child S1.

FIG. 8 is an explanatory diagram showing an example of the contents stored in the correct answer table 240. In FIG. 8, the correct answer table 240 has fields for answer ID, grade, subject, question ID, question pattern, and correct answer. Setting information in each field stores correct answer information (for example, correct answer information 800-1 to 800-3) as a record.

Here, the answer ID is an identifier that uniquely identifies a test answer sheet. The grade is the school grade targeted by the test. The subject is the subject of the test. The question ID is an identifier that uniquely identifies a test question. The question pattern is the type of question, for example, single answer or multiple answers. The correct answer is the answer to the question.

For example, the correct answer information 800-1 indicates the question pattern "single answer" and the correct answer "ア" of the question Q1 included in the answer sheet A1 of an "arithmetic" test for children in grade "1".

FIG. 9 is an explanatory diagram showing an example of the contents stored in the threshold table 250. In FIG. 9, the threshold table 250 has fields for grade, subject, question pattern, target character, distance threshold, and adjacent difference threshold. Setting information in each field stores threshold information (for example, threshold information 900-1 and 900-2) as a record.

Here, the grade is the school grade targeted by the test. The subject is the subject of the test. The question pattern is the type of question. The target character is a character to be recognized by OCR. The distance threshold is a threshold for the distance value. The adjacent difference threshold is a threshold for the adjacent difference in distance value.

For example, the threshold information 900-1 indicates the distance threshold "100" and the adjacent difference threshold "500" that correspond to the combination of grade "1", subject "arithmetic", question pattern "single answer", and target character "ア".
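
The lookup in the threshold table 250 can be illustrated with a minimal Python sketch. The names `ThresholdInfo`, `THRESHOLD_TABLE`, and `lookup_thresholds` are hypothetical and do not appear in the specification; only the values of threshold information 900-1 are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ThresholdInfo:
    distance_threshold: int        # first threshold, applied to the distance value
    adjacent_diff_threshold: int   # second threshold, applied to the adjacent difference


# Key: (grade, subject, question pattern, target character).
ThresholdKey = Tuple[int, str, str, str]

# Only the values of threshold information 900-1 are given in the text.
THRESHOLD_TABLE: Dict[ThresholdKey, ThresholdInfo] = {
    (1, "arithmetic", "single answer", "ア"): ThresholdInfo(100, 500),
}


def lookup_thresholds(grade: int, subject: str, pattern: str, char: str) -> ThresholdInfo:
    """Return the thresholds for one combination; raises KeyError if none is registered."""
    return THRESHOLD_TABLE[(grade, subject, pattern, char)]
```

Keying the table on the full combination mirrors the fields of FIG. 9. How an unregistered combination should be handled is not stated in this passage, so the sketch simply raises `KeyError`.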

(Functional configuration example of the image processing apparatus 101)
FIG. 10 is a block diagram showing a functional configuration example of the image processing apparatus 101. In FIG. 10, the image processing apparatus 101 includes an acquisition unit 1001, a specifying unit 1002, a determination unit 1003, a determination unit 1004, an output unit 1005, a reception unit 1006, a recording unit 1007, and an update unit 1008. The acquisition unit 1001 through the update unit 1008 function as a control unit. Specifically, for example, their functions are realized by causing the CPU 301 to execute a program stored in a storage device such as the memory 302 or the disk 305 shown in FIG. 3, or by the I/F 303. The processing results of each functional unit are stored, for example, in a storage device such as the memory 302 or the disk 305.

The acquisition unit 1001 acquires image data of an answer sheet. The answer sheet is, for example, a sheet of paper on which answers to the questions of a test conducted at each base have been written. The image data of the answer sheet is, for example, the answer image P1 shown in FIG. 4. Specifically, for example, the acquisition unit 1001 acquires the image data of the answer sheet by receiving it from the base terminal 201.

The image data of the answer sheet is associated with, for example, information identifying the answer sheet, the grade targeted by the test, the subject of the test, and the child who took the test. The information identifying the answer sheet is, for example, the answer ID. The information identifying the child is, for example, the child ID or the child's name.

The acquired image data of the answer sheet is stored, for example, in the answer image DB 220 shown in FIG. 6 in association with the answer ID, grade, subject, and child ID.

In the example of FIG. 2, the base terminal 201 has the scanner 202, but the image processing apparatus 101 may have the scanner 202 instead. In that case, the acquisition unit 1001 acquires the image data of the answer sheet by importing the image data read by the scanner 202.

The specifying unit 1002 specifies, for an answer character that OCR has recognized as a first character from the acquired image data of the answer sheet, its degree of coincidence with the first character. Here, the degree of coincidence with the first character is a value indicating how closely the character recognized by OCR, that is, the handwritten character, matches the first character.

The following description uses, as the degree of coincidence with the first character, a distance value representing the difference from the first character. In this case, the smaller the distance value to the first character, the more closely the recognized character matches the first character, and the more likely the recognized character is correct.

Specifically, for example, the specifying unit 1002 first divides the acquired image data of the answer sheet into small piece images, one per question. A small piece image is image data that contains the answer column of one question. The small piece images are stored, for example, in the small piece image DB 230 shown in FIG. 7. Information identifying the position of each question's answer column on the answer sheet is stored in advance in a storage device such as the memory 302 or the disk 305, in association with, for example, the answer ID of the answer sheet.

Next, the specifying unit 1002 performs OCR processing on each small piece image to recognize characters. OCR processing is character recognition processing that reads characters optically. In OCR processing, for example, to recognize a character, a distance value (degree of coincidence) to each of a plurality of target characters is obtained, and the target character with the smallest distance value is recognized as the first character.

A target character is a character registered in advance as a character to be recognized by OCR. Any existing technique may be used to calculate the distance value between characters. The specifying unit 1002 then specifies the recognized first character as the answer character. The specifying unit 1002 also specifies the distance value between the answer character and the first character.

For example, suppose that OCR processing of a small piece image extracts the three characters with the lowest distance values, "ア", "イ", and "ウ", as candidate characters. Suppose also that the distance value to the candidate character "ア" is "10", the distance value to "イ" is "110", and the distance value to "ウ" is "500". In this case, the specifying unit 1002 recognizes "ア", the candidate character with the smallest distance value of the three. The specifying unit 1002 then specifies the recognized candidate character "ア" as the answer character, and specifies the distance value "10" between the answer character and the candidate character "ア".
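
The selection step in this example can be written as a short Python sketch. The function name `recognize_first_character` and the dictionary representation of the candidates are assumptions made for illustration; how the distance values themselves are computed is left to the OCR engine, as the text allows any existing technique.

```python
from typing import Dict, Tuple


def recognize_first_character(distances: Dict[str, int]) -> Tuple[str, int]:
    """Pick the candidate with the smallest distance value.

    `distances` maps each candidate character to its distance value
    (smaller means a closer match). Returns (first character, its distance).
    """
    first = min(distances, key=distances.get)
    return first, distances[first]


# Values from the example above.
candidates = {"ア": 10, "イ": 110, "ウ": 500}
first_char, first_distance = recognize_first_character(candidates)
```

With the values from the example, `first_char` is "ア" and `first_distance` is 10.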

The processing of dividing the image data of the answer sheet into small piece images may be executed on a computer other than the image processing apparatus 101. Likewise, the OCR processing of the small piece images may be executed on a computer other than the image processing apparatus 101.

The specifying unit 1002 may also calculate the difference between the distance value from the answer character to the first character and the distance value from the answer character to a second character. Here, the first character is the target character with the smallest distance value to the answer character, and can therefore be said to have the highest degree of matching with the answer character. The second character is the target character whose distance value to the answer character is the next smallest after the first character, and can therefore be said to have the next highest degree of matching with the answer character.

In the example of the three candidate characters "ア", "イ", and "ウ" above, the first character is the candidate character "ア" and the second character is the candidate character "イ". In this case, the specifying unit 1002 calculates the difference "100" between the distance value "10" from the answer character to the candidate character "ア" and the distance value "110" from the answer character to the candidate character "イ".

In the following description, the difference between the distance value from the answer character to the first character and the distance value from the answer character to the second character may be referred to as the "adjacent difference in distance value between the first character and the second character".
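
The adjacent difference can be computed as in the following minimal Python sketch. The function name `adjacent_difference` and the dictionary input are hypothetical. The sketch assumes at least two candidates are available.

```python
from typing import Dict, Tuple


def adjacent_difference(distances: Dict[str, int]) -> Tuple[str, str, int]:
    """Return (first character, second character, adjacent difference).

    The first character has the smallest distance value, the second character
    the next smallest; the adjacent difference is the gap between the two.
    """
    if len(distances) < 2:
        raise ValueError("at least two candidate characters are required")
    ranked = sorted(distances.items(), key=lambda item: item[1])
    (first, d1), (second, d2) = ranked[0], ranked[1]
    return first, second, d2 - d1
```

For the candidates "ア" (10), "イ" (110), and "ウ" (500), the function returns "ア", "イ", and the adjacent difference 110 - 10 = 100.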

The determination unit 1003 compares the specified distance value to the first character with a distance threshold corresponding to at least one of the grade, subject, and question pattern associated with the answer sheet and/or the question. The distance threshold is a threshold (first threshold) for the distance value (degree of coincidence). The distance threshold is held in association with at least one of the grade, subject, and question pattern. It may also be held for each target character registered as a character to be recognized by OCR.

Specifically, for example, the determination unit 1003 first refers to the small piece image DB 230 to specify the answer ID and question ID corresponding to the small piece image. Next, the determination unit 1003 refers to the correct answer table 240 shown in FIG. 8 to specify the grade, subject, and question pattern corresponding to the specified answer ID and question ID.

Next, the determination unit 1003 refers to the threshold table 250 shown in FIG. 9 to specify the distance threshold for the first character that corresponds to the specified grade, subject, and question pattern. As an example, if the grade is "1", the subject is "arithmetic", the question pattern is "single answer", and the first character is "ア", the distance threshold is "100".

The determination unit 1003 then determines whether the distance value to the first character is smaller than the specified distance threshold. In the example above, where the candidate character "ア" is the first character, the distance value from the answer character to the first character "ア" is "10". Therefore, if the distance threshold is "100", the determination unit 1003 determines that the distance value "10" to the first character "ア" is smaller than the distance threshold "100".

The determination unit 1003 may also compare the adjacent difference in distance value between the first character and the second character with an adjacent difference threshold for the first character, corresponding to at least one of the grade, subject, and question pattern associated with the answer sheet and/or the question. The adjacent difference threshold is a per-target-character threshold (second threshold) for the adjacent difference in distance value.

Characters vary: some have other characters with very similar features, while others have features that differ greatly from every other character. For example, the katakana "カ" closely resembles "オ". For this reason, the adjacent difference threshold is set for each target character and is specified from, for example, the threshold table 250. As an example, if the grade is "1", the subject is "arithmetic", the question pattern is "single answer", and the first character is "ア", the adjacent difference threshold is "500".

The determination unit 1003 then determines whether the adjacent difference in distance value between the first character and the second character is larger than the adjacent difference threshold for the first character. In the example of the three candidate characters "ア", "イ", and "ウ" above, the adjacent difference is "100", and the adjacent difference threshold for the first character "ア" is "500". The determination unit 1003 therefore determines that the adjacent difference "100" is less than or equal to the adjacent difference threshold "500". When the adjacent difference is at or below the threshold in this way, the possibility that the second character "イ" is the correct character is too high to ignore.

In the following description, the condition that the distance value between the answer character recognized by OCR and the first character is smaller than the distance threshold may be referred to as the "first condition". Satisfying the first condition means that the answer character matches the first character closely enough that the character recognized by OCR (the first character) can be judged correct.

Likewise, the condition that the adjacent difference in distance value between the first character and the second character, for the answer character recognized by OCR, is larger than the adjacent difference threshold may be referred to as the "second condition". Satisfying the second condition means that the second character is unlikely to be the correct character. In other words, if the second condition is not satisfied, the possibility that the second character is the correct character cannot be ignored.
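
The two conditions reduce to two strict comparisons, sketched below in Python. The function names are hypothetical; the comparison directions follow the definitions above (strictly smaller for the first condition, strictly larger for the second).

```python
def satisfies_first_condition(distance: int, distance_threshold: int) -> bool:
    """First condition: the distance value to the first character is
    smaller than the distance threshold."""
    return distance < distance_threshold


def satisfies_second_condition(adjacent_diff: int, adjacent_diff_threshold: int) -> bool:
    """Second condition: the adjacent difference between the first and second
    characters is larger than the adjacent difference threshold."""
    return adjacent_diff > adjacent_diff_threshold
```

With the running example (distance 10, distance threshold 100, adjacent difference 100, adjacent difference threshold 500), the first condition is satisfied and the second is not.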

When the first condition is determined to be satisfied, the determination unit 1004 determines whether the recognized answer character matches the correct answer character of the question. Here, the correct answer character of the question is the character that represents the answer to the question. That is, when the first condition is satisfied, the determination unit 1004 adopts the answer character recognized by OCR as the character recognition result and checks whether it matches the correct answer character.

Specifically, for example, the determination unit 1004 refers to the correct answer table 240 to specify the correct answer character corresponding to the answer ID and question ID of the small piece image. The determination unit 1004 then determines whether the recognized answer character matches the specified correct answer character.

As a result, the match against the correct answer character is performed only when the likelihood of the character recognized by OCR has been verified using criteria suited to characteristics such as grade, subject, and question pattern, that is, only when the character recognized by OCR has been judged correct.

For example, when the question pattern is "multiple answers", there are multiple correct answer characters. In this case, the determination unit 1004 compares each recognized answer character with each correct answer character to determine whether the answer characters match the correct answer characters. The answer is judged correct if every correct answer character has a matching answer character.
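
The multiple-answer case can be sketched in Python as follows. The function name `is_correct_multiple` is hypothetical. The sketch implements only what the text states, namely that every correct answer character must have a matching answer character; whether extra answer characters should make the answer incorrect is not stated here, so the sketch does not penalize them.

```python
from typing import Iterable


def is_correct_multiple(answer_chars: Iterable[str], correct_chars: Iterable[str]) -> bool:
    """Return True if every correct answer character is matched by
    some recognized answer character (order is ignored)."""
    recognized = set(answer_chars)
    return all(correct in recognized for correct in correct_chars)
```

For example, the recognized characters "ウ" and "ア" match the correct characters "ア" and "ウ", whereas the single recognized character "ア" does not.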

The determination unit 1004 may also determine whether the recognized answer character matches the correct answer character of the question only when both the first condition and the second condition are determined to be satisfied. This makes it possible to verify the likelihood of the character recognized by OCR while also taking into account how likely it is that the second character is the correct character.

The output unit 1005 outputs the determination result in association with the question. Output formats of the output unit 1005 include, for example, storage in a storage device such as the memory 302 or the disk 305, transmission to another computer via the I/F 303, display on a display (not shown), and print output to a printer (not shown).

Specifically, for example, the output unit 1005 may output the determination result in association with the answer ID and question ID of the small piece image. This makes it possible to identify which question of which test was answered correctly or incorrectly. The image processing apparatus 101 may also score the answer sheet of each child based on the determination results of the determination unit 1004 and output the scoring result. More specifically, for example, the image processing apparatus 101 may output a scoring result image in which a circle (○) or a cross (×) is marked on the answer to each question on the child's answer image.

When the first condition is determined not to be satisfied, the output unit 1005 outputs or highlights the image data of the answer character for the question. The output unit 1005 may also output or highlight the image data of the answer character for the question when at least one of the first condition and the second condition is determined not to be satisfied.

Here, the image data of the answer character is image data containing the answer character written on the answer sheet, for example, a small piece image. Specifically, for example, the output unit 1005 may display a small piece image containing an answer character determined not to satisfy the first condition (or the second condition) on a computer used by a checker.
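
The overall branching described so far, automatic matching when the conditions hold and referral to a checker otherwise, might be sketched in Python as below. The function `judge_answer`, its parameters, and the use of `None` to signal "needs manual check" are assumptions made for illustration, not the apparatus's actual interface.

```python
from typing import Optional


def judge_answer(
    recognized: str,
    distance: int,
    adjacent_diff: int,
    distance_threshold: int,
    adjacent_diff_threshold: int,
    correct: str,
    use_second_condition: bool = True,
) -> Optional[bool]:
    """Return True/False for the match against the correct answer character,
    or None when the small piece image should be shown to a checker."""
    # First condition: distance value smaller than the distance threshold.
    if not distance < distance_threshold:
        return None
    # Second condition (optional): adjacent difference larger than its threshold.
    if use_second_condition and not adjacent_diff > adjacent_diff_threshold:
        return None
    return recognized == correct
```

With the running example (distance 10, adjacent difference 100, thresholds 100 and 500), the answer is referred to a checker when the second condition is applied, and is matched automatically when only the first condition is used.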

A checker is a person who checks whether the characters recognized by OCR are correct and who scores tests. The computer used by the checker is, for example, the base terminal 201 shown in FIG. 2. More specifically, for example, the output unit 1005 may display a character check screen 1100 as shown in FIG. 11.

FIG. 11 is an explanatory diagram showing an example of the character check screen. In FIG. 11, the character check screen 1100 is an operation screen for checking whether a character recognized by OCR is correct. The character check screen 1100 displays the answer character for a question determined not to satisfy at least one of the first condition and the second condition, together with the first character "ア", which is the OCR character recognition result.

On the character check screen 1100, selecting the button 1101 by an operation on an input device (not shown) enters a check result indicating that the answer character matches the OCR character recognition result. Selecting the button 1102 on the character check screen 1100 enters a check result indicating that the answer character does not match the OCR character recognition result, and displays an operation screen for selecting the correct character.

That is, the checker selects the button 1101 when judging the OCR character recognition result to be correct. When judging the OCR character recognition result to be wrong, the checker selects the button 1102 and then selects the correct character.

The output unit 1005 may also display, on the computer used by the checker, image data of the answer sheet in which an answer character determined not to satisfy at least one of the first condition and the second condition is highlighted. More specifically, for example, the output unit 1005 may display an answer check screen 1200 as shown in FIG. 12.

FIG. 12 is an explanatory diagram showing an example of the answer check screen. In FIG. 12, the answer check screen 1200 is an operation screen that displays the answer image P1 with highlighting applied to each answer character determined not to satisfy at least one of the first condition and the second condition. Specifically, on the answer check screen 1200, an answer character determined not to satisfy at least one of the first condition and the second condition is emphasized by being enclosed in a frame 1201.

On the answer check screen 1200, selecting the frame 1201 by an operation on an input device (not shown) switches to, for example, the character check screen 1100 shown in FIG. 11, where the correctness of the character recognized by OCR can be checked. The answer check screen 1200 may also allow the scoring to be checked.

The reception unit 1006 receives, for the image data of the answer character that has been output or highlighted, an input indicating that the answer character matches the first character. Here, an input indicating that the answer character matches the first character means that the OCR character recognition result is correct, and is entered, for example, on the character check screen 1100 shown in FIG. 11.

Specifically, for example, the reception unit 1006 receives the input indicating that the answer character matches the first character by receiving, from the base terminal 201, a check result indicating that the answer character matches the OCR character recognition result.

When the reception unit 1006 receives an input indicating that the answer character does not match the first character, it may also receive a selection of the correct character. Here, an input indicating that the answer character does not match the first character means that the OCR character recognition result is wrong, and is entered, for example, on the character check screen 1100 shown in FIG. 11.

Specifically, for example, the reception unit 1006 receives the input indicating that the answer character does not match the first character by receiving, from the base terminal 201, a check result indicating that the answer character does not match the OCR character recognition result. The reception unit 1006 then receives the selection of the correct character for the answer character by receiving the selection result from the base terminal 201.

In this case, the determination unit 1004 may determine whether the correct character selected for the answer character matches the correct answer character of the question. This prevents the match against the correct answer character from being performed with a character that OCR recognized incorrectly.
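
How a checker's input feeds back into the match can be shown with a brief Python sketch. The function `resolve_character` and its arguments are hypothetical; `confirmed` stands for the check result entered with the button 1101 (match) or the button 1102 (no match), and `selected` for the correct character chosen by the checker.

```python
from typing import Optional


def resolve_character(ocr_char: str, confirmed: bool, selected: Optional[str] = None) -> str:
    """Return the character to compare against the correct answer character.

    If the checker confirms the OCR result, the first character is used as is;
    otherwise the character selected by the checker replaces it.
    """
    if confirmed:
        return ocr_char
    if selected is None:
        raise ValueError("a correct character must be selected when the OCR result is rejected")
    return selected


# The OCR result "ア" is rejected and the checker selects "イ".
final_char = resolve_character("ア", confirmed=False, selected="イ")
is_match = final_char == "イ"  # compared here with a correct answer character "イ"
```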

When an input indicating that the answer character matches the first character is received, the recording unit 1007 records the distance value between the answer character and the first character in association with at least one of the grade, subject, and question pattern associated with the answer sheet and/or the question. At this time, the recording unit 1007 may also record the adjacent difference in distance value between the first character and the second character, in association with at least one of the grade, subject, and question pattern associated with the answer sheet and/or the question.

具体的には、例えば、記録部１００７は、答案及び／又は設問に対応付けた学年、科目および設問パターンの組合せに対応付けて、回答の文字の第１の文字との距離値、および第１の文字／第２の文字間の距離値の隣接差を、後述の図１３に示すような距離値／隣接差テーブル１３００に記録する。 Specifically, for example, the recording unit 1007 records the distance value between the answer character and the first character, and the adjacent difference between the distance values of the first and second characters, in a distance value / adjacent difference table 1300 such as the one shown in FIG. 13 (described later), in association with the combination of grade, subject, and question pattern associated with the answer sheet and/or the question.

これにより、第１の条件または第２の条件を満たさないものの、答案に記入された回答の文字がＯＣＲにより正しく認識されていたときの距離値および距離値の隣接差を、学年、科目および設問パターンの組合せに対応付けて記録することができる。 As a result, the distance value and the adjacent difference in distance values for cases where the first condition or the second condition was not satisfied, but the answer character written on the answer sheet had nevertheless been correctly recognized by OCR, can be recorded in association with the combination of grade, subject, and question pattern.

また、記録部１００７は、第１の条件を満たすと判断された場合、答案及び／又は設問に対応付けた学年、科目および設問パターンのうちの少なくともいずれかに対応付けて、回答の文字の第１の文字との距離値を記録する。具体的には、例えば、記録部１００７は、答案及び／又は設問に対応付けた学年、科目および設問パターンの組合せに対応付けて、回答の文字の第１の文字との距離値を距離値／隣接差テーブル１３００に記録する。これにより、第１の条件を満たすときの距離値を、学年、科目および設問パターンの組合せに対応付けて記録することができる。 When it is determined that the first condition is satisfied, the recording unit 1007 also records the distance value between the answer character and the first character in association with at least one of the grade, subject, and question pattern associated with the answer sheet and/or the question. Specifically, for example, the recording unit 1007 records this distance value in the distance value / adjacent difference table 1300 in association with the combination of grade, subject, and question pattern associated with the answer sheet and/or the question. As a result, the distance value observed when the first condition is satisfied can be recorded in association with the combination of grade, subject, and question pattern.

また、記録部１００７は、第１の条件を満たすと判断され、かつ、第２の条件を満たすと判断された場合に、答案及び／又は設問に対応付けた学年、科目および設問パターンのうちの少なくともいずれかに対応付けて、回答の文字の第１の文字との距離値、および第１の文字／第２の文字間の距離値の隣接差を記録することにしてもよい。 Furthermore, when it is determined that both the first condition and the second condition are satisfied, the recording unit 1007 may record the distance value between the answer character and the first character, together with the adjacent difference between the distance values of the first and second characters, in association with at least one of the grade, subject, and question pattern associated with the answer sheet and/or the question.

具体的には、例えば、記録部１００７は、答案及び／又は設問に対応付けた学年、科目および設問パターンの組合せに対応付けて、回答の文字の第１の文字との距離値、および第１の文字／第２の文字間の距離値の隣接差を、距離値／隣接差テーブル１３００に記録する。これにより、第１の条件および第２の条件をともに満たすときの距離値および距離値の隣接差を、学年、科目および設問パターンの組合せに対応付けて記録することができる。 Specifically, for example, the recording unit 1007 records the distance value between the answer character and the first character, and the adjacent difference between the distance values of the first and second characters, in the distance value / adjacent difference table 1300 in association with the combination of grade, subject, and question pattern associated with the answer sheet and/or the question. As a result, the distance value and the adjacent difference in distance values observed when both the first condition and the second condition are satisfied can be recorded in association with the combination of grade, subject, and question pattern.

ここで、距離値／隣接差テーブル１３００の記憶内容について説明する。 Here, the contents stored in the distance value / adjacent difference table 1300 will be described.

図１３は、距離値／隣接差テーブル１３００の記憶内容の一例を示す説明図である。図１３において、距離値／隣接差テーブル１３００は、対象文字、学年、科目、設問パターン、距離値および隣接差のフィールドを有する。各フィールドに情報を設定することで、距離値／隣接差情報（例えば、距離値／隣接差情報１３００−１〜１３００−４）がレコードとして記憶される。 FIG. 13 is an explanatory diagram showing an example of the contents stored in the distance value / adjacent difference table 1300. In FIG. 13, the distance value / adjacent difference table 1300 includes fields for target character, grade, subject, question pattern, distance value, and adjacent difference. By setting information in each field, distance value / adjacent difference information (for example, distance value / adjacent difference information 1300-1 to 1300-4) is stored as a record.

ここで、対象文字は、ＯＣＲで認識する文字として登録された文字である。学年は、テストの対象となる学年である。科目は、テストの科目である。設問パターンは、設問の種別である。距離値は、設問に対する回答の文字がＯＣＲで認識されたときの対象文字（第１の文字）との距離値である。隣接差は、設問に対する回答の文字がＯＣＲで認識されたときの第１の文字／第２の文字間の距離値の隣接差である。 Here, the target character is a character registered as a character recognized by OCR. The grade is the grade that will be tested. The subject is a test subject. The question pattern is a question type. The distance value is a distance value from the target character (first character) when the character of the answer to the question is recognized by the OCR. The adjacent difference is an adjacent difference in the distance value between the first character and the second character when the character of the answer to the question is recognized by the OCR.
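The table layout described above can be sketched in Python as follows. This is only an illustrative sketch, not the structure used in the embodiment: the names `RecordKey`, `Samples`, `table`, and `record` are hypothetical, and grouping the samples under one key (instead of storing one row per record as in FIG. 13) is an assumption made for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RecordKey:
    """Hypothetical key: target character plus grade, subject, question pattern."""
    target_char: str
    grade: int
    subject: str
    question_pattern: str


@dataclass
class Samples:
    """Distance values and adjacent differences collected for one key."""
    distances: List[float] = field(default_factory=list)
    adjacent_diffs: List[float] = field(default_factory=list)


# Stand-in for the distance value / adjacent difference table 1300.
table: Dict[RecordKey, Samples] = defaultdict(Samples)


def record(key: RecordKey, distance: float,
           adjacent_diff: Optional[float] = None) -> None:
    """Store a distance value and, when available, its adjacent difference."""
    samples = table[key]
    samples.distances.append(distance)
    if adjacent_diff is not None:
        samples.adjacent_diffs.append(adjacent_diff)
```

Keeping the adjacent difference optional mirrors the text, where the distance value is always recorded and the adjacent difference may be recorded together with it.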

更新部１００８は、学年、科目および設問パターンのうちの少なくともいずれかに対応付けて記録された回答の文字の第１の文字との距離値に基づいて、距離閾値を更新する。ここで、学年、科目および設問パターンのうちの少なくともいずれかに対応付けて記録された、回答の文字の第１の文字との距離値の分布は、正規分布となることが想定される。 The update unit 1008 updates the distance threshold based on the distance value between the answer character and the first character recorded in association with at least one of the school year, the subject, and the question pattern. Here, it is assumed that the distribution of the distance value between the answer character and the first character recorded in association with at least one of the school year, the subject, and the question pattern is a normal distribution.

そこで、更新部１００８は、記録された距離値の統計値に基づいて、全体の９５％〜９８％程度の距離値が存在する範囲を特定して、更新後の距離閾値を決定することにしてもよい。例えば、正規分布の場合、「平均値±２σ」に全体の約９５％が存在することが知られている。σは、標準偏差である。 Therefore, the update unit 1008 may determine the updated distance threshold by identifying, based on statistics of the recorded distance values, a range that contains roughly 95% to 98% of all the distance values. For example, for a normal distribution, it is known that about 95% of all values lie within "mean ± 2σ", where σ is the standard deviation.

具体的には、例えば、更新部１００８は、図１３に示した距離値／隣接差テーブル１３００を参照して、学年、科目および設問パターンの組合せごとに、各対象文字について、距離値の平均値、標準偏差（σ）を算出する。そして、更新部１００８は、「距離値の平均値＋２σ」を、各対象文字についての更新後の距離閾値に決定することにしてもよい。 Specifically, for example, the update unit 1008 refers to the distance value / adjacent difference table 1300 illustrated in FIG. 13 and calculates the average value of the distance values for each target character for each combination of grade, subject, and question pattern. The standard deviation (σ) is calculated. Then, the updating unit 1008 may determine “average distance value + 2σ” as the updated distance threshold for each target character.

これにより、全体の約９５％の距離値が含まれるような値を、更新後の距離閾値に決定することができる。更新された対象文字についての距離閾値は、例えば、学年、科目および設問パターンの組合せに対応付けて、図９に示した閾値テーブル２５０に記憶される。 As a result, a value that includes about 95% of the total distance value can be determined as the updated distance threshold. The updated distance threshold value for the target character is stored in the threshold value table 250 shown in FIG. 9 in association with, for example, a combination of grade, subject, and question pattern.
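The "mean + 2σ" update can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `updated_threshold` and the sample values are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not say whether the population or sample form is intended.

```python
from statistics import NormalDist, mean, pstdev


def updated_threshold(values):
    """Return mean + 2*sigma of the recorded values (population sigma assumed)."""
    return mean(values) + 2 * pstdev(values)


# Hypothetical distance values recorded for one target character under one
# combination of grade, subject, and question pattern.
distances = [100.0, 110.0, 90.0, 120.0, 80.0]
threshold = updated_threshold(distances)

# Coverage of a normal distribution, for reference:
# within mean +/- 2*sigma (two-sided), and below mean + 2*sigma (one-sided).
two_sided = NormalDist().cdf(2.0) - NormalDist().cdf(-2.0)
one_sided = NormalDist().cdf(2.0)
```

For a normal distribution, the two-sided interval covers about 95.4% of values, while the single upper bound "mean + 2σ" covers about 97.7%; both figures fall within the 95% to 98% range mentioned in the description.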

なお、更新部１００８は、学年、科目および設問パターンの組合せごとに、全ての対象文字を含めて、距離値の平均値、標準偏差（σ）を算出することにしてもよい。そして、更新部１００８は、「距離値の平均値＋２σ」を、全ての対象文字に共通の更新後の距離閾値に決定することにしてもよい。 The update unit 1008 may calculate the average value of the distance value and the standard deviation (σ) including all target characters for each combination of school year, subject, and question pattern. Then, the update unit 1008 may determine “average distance value + 2σ” as the updated distance threshold value common to all target characters.

また、更新部１００８は、学年、科目および設問パターンのうちの少なくともいずれかに対応付けて記録された第１の文字／第２の文字間の距離値の隣接差に基づいて、第１の文字についての隣接差閾値を更新することにしてもよい。 The update unit 1008 may also update the adjacent difference threshold for the first character based on the adjacent differences between the distance values of the first and second characters recorded in association with at least one of the grade, subject, and question pattern.

具体的には、例えば、更新部１００８は、距離値／隣接差テーブル１３００を参照して、学年、科目および設問パターンの組合せごとに、各対象文字について、隣接差の平均値、標準偏差（σ）を算出する。そして、更新部１００８は、「隣接差の平均値＋２σ」を、各対象文字についての更新後の隣接差閾値に決定することにしてもよい。 Specifically, for example, the update unit 1008 refers to the distance value / adjacent difference table 1300 and calculates, for each target character and for each combination of grade, subject, and question pattern, the mean and the standard deviation (σ) of the adjacent differences. The update unit 1008 may then set "mean adjacent difference + 2σ" as the updated adjacent difference threshold for each target character.

これにより、全体の約９５％の隣接差が含まれるような値を、更新後の隣接差閾値に決定することができる。更新された対象文字についての隣接差閾値は、例えば、学年、科目および設問パターンの組合せに対応付けて、閾値テーブル２５０に記憶される。 As a result, a value that includes about 95% of the total adjacent difference can be determined as the updated adjacent difference threshold value. The updated adjacent difference threshold value for the target character is stored in the threshold value table 250 in association with, for example, a combination of grade, subject, and question pattern.

（画像処理装置１０１の画像処理手順） (Image processing procedure of image processing apparatus 101)
つぎに、画像処理装置１０１の画像処理手順について説明する。画像処理装置１０１の画像処理は、例えば、１つのテストが実施されるたびに、そのテストについての各児童の答案画像から分割された小片画像を用いて実行される。 Next, an image processing procedure of the image processing apparatus 101 will be described. The image processing of the image processing apparatus 101 is executed, for example, each time a test is conducted, using the small piece images divided from each child's answer sheet image for that test.

図１４および図１５は、画像処理装置１０１の画像処理手順の一例を示すフローチャートである。図１４のフローチャートにおいて、まず、画像処理装置１０１は、小片画像ＤＢ２３０から選択されていない未選択の小片画像情報を選択する（ステップＳ１４０１）。そして、画像処理装置１０１は、選択した小片画像情報の小片画像に対してＯＣＲ処理を実施する（ステップＳ１４０２）。 14 and 15 are flowcharts illustrating an example of an image processing procedure of the image processing apparatus 101. In the flowchart of FIG. 14, the image processing apparatus 101 first selects unselected small piece image information that has not been selected from the small piece image DB 230 (step S1401). Then, the image processing apparatus 101 performs OCR processing on the small piece image of the selected small piece image information (step S1402).

つぎに、画像処理装置１０１は、小片画像からＯＣＲ処理により第１の文字であると認識された設問に対する回答の文字の第１の文字との距離値を特定する（ステップＳ１４０３）。そして、画像処理装置１０１は、回答の文字についての第１の文字／第２の文字間の距離値の隣接差を算出する（ステップＳ１４０４）。 Next, the image processing apparatus 101 specifies a distance value between the character of the answer to the question recognized as the first character by the OCR process from the small piece image and the first character (step S1403). Then, the image processing apparatus 101 calculates the adjacent difference in the distance value between the first character and the second character for the reply character (step S1404).
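Steps S1403 and S1404 can be sketched in Python as follows. This is an illustrative sketch only: the function name `first_and_adjacent_diff` and the input format (a mapping from each registered character to its distance value) are hypothetical, and it assumes that a smaller distance value means a closer match and that the adjacent difference is the second-smallest distance minus the smallest distance.

```python
def first_and_adjacent_diff(candidates):
    """Pick the first and second characters and compute the adjacent difference.

    candidates: dict mapping each registered character to its distance value
    from the answer character (smaller is assumed to mean a closer match).
    Returns (first_char, first_distance, second_char, adjacent_diff).
    """
    if len(candidates) < 2:
        raise ValueError("at least two candidate characters are required")
    # Rank the registered characters from closest to farthest.
    ranked = sorted(candidates.items(), key=lambda item: item[1])
    (first_char, d1), (second_char, d2) = ranked[0], ranked[1]
    # The gap to the runner-up indicates how clearly the first character wins.
    return first_char, d1, second_char, d2 - d1
```

A large adjacent difference means the runner-up is far behind, so the first character is less likely to have been confused with the second character.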

つぎに、画像処理装置１０１は、正答テーブル２４０を参照して、選択した小片画像情報の答案ＩＤおよび設問ＩＤに対応する学年、科目、設問パターンおよび正答を特定する（ステップＳ１４０５）。そして、画像処理装置１０１は、閾値テーブル２５０を参照して、特定した学年、科目および設問パターンの組合せに対応する第１の文字についての距離閾値、隣接差閾値を特定する（ステップＳ１４０６）。 Next, the image processing apparatus 101 refers to the correct answer table 240 and specifies the grade, subject, question pattern, and correct answer corresponding to the answer ID and question ID of the selected piece image information (step S1405). Then, the image processing apparatus 101 refers to the threshold value table 250 and identifies the distance threshold value and the adjacent difference threshold value for the first character corresponding to the identified grade, subject, and question pattern combination (step S1406).

つぎに、画像処理装置１０１は、特定した回答の文字の第１の文字との距離値が、特定した距離閾値よりも小さいか否かを判断する（ステップＳ１４０７）。ここで、距離値が距離閾値よりも小さい場合（ステップＳ１４０７：Ｙｅｓ）、画像処理装置１０１は、算出した隣接差が、特定した隣接差閾値よりも大きいか否かを判断する（ステップＳ１４０８）。 Next, the image processing apparatus 101 determines whether or not the distance value between the identified answer character and the first character is smaller than the identified distance threshold (step S1407). If the distance value is smaller than the distance threshold value (step S1407: Yes), the image processing apparatus 101 determines whether the calculated adjacent difference is larger than the specified adjacent difference threshold value (step S1408).

ここで、隣接差が隣接差閾値よりも大きい場合（ステップＳ１４０８：Ｙｅｓ）、画像処理装置１０１は、ＯＣＲ処理で認識された回答の文字と、特定した正答の文字とが一致するか否かを判定して（ステップＳ１４０９）、図１５に示すステップＳ１５０１に移行する。 If the adjacent difference is larger than the adjacent difference threshold (step S1408: Yes), the image processing apparatus 101 determines whether the answer character recognized by the OCR processing matches the identified correct answer character (step S1409), and proceeds to step S1501 shown in FIG. 15.

また、ステップＳ１４０７において、距離値が距離閾値以上の場合（ステップＳ１４０７：Ｎｏ）、または、ステップＳ１４０８において、隣接差が隣接差閾値以下の場合（ステップＳ１４０８：Ｎｏ）、画像処理装置１０１は、ＯＣＲで認識された文字の正しさをチェックするための文字チェック画面（例えば、図１１参照）を表示する（ステップＳ１４１０）。そして、画像処理装置１０１は、回答の文字がＯＣＲの文字認識結果と一致するというチェック結果の入力を受け付けたか否かを判断する（ステップＳ１４１１）。 If the distance value is equal to or greater than the distance threshold in step S1407 (step S1407: No), or if the adjacent difference is equal to or smaller than the adjacent difference threshold in step S1408 (step S1408: No), the image processing apparatus 101 displays a character check screen (see, for example, FIG. 11) for checking whether the character recognized by OCR is correct (step S1410). The image processing apparatus 101 then determines whether it has received an input of a check result indicating that the answer character matches the OCR character recognition result (step S1411).

ここで、回答の文字がＯＣＲの文字認識結果と一致するというチェック結果の入力を受け付けた場合（ステップＳ１４１１：Ｙｅｓ）、画像処理装置１０１は、ステップＳ１４０９に移行する。 If an input of a check result indicating that the response character matches the character recognition result of the OCR is received (step S1411: YES), the image processing apparatus 101 proceeds to step S1409.

一方、回答の文字がＯＣＲの文字認識結果と一致しないというチェック結果とともに、正しい文字の選択を受け付けた場合（ステップＳ１４１１：Ｎｏ）、画像処理装置１０１は、選択された正しい文字と、特定した正答の文字とが一致するか否かを判定して（ステップＳ１４１２）、図１５に示すステップＳ１５０２に移行する。 On the other hand, when the selection of the correct character is accepted together with the check result that the character of the answer does not match the character recognition result of the OCR (step S1411: No), the image processing apparatus 101 determines the selected correct character and the specified correct answer. It is determined whether or not the character matches (step S1412), and the process proceeds to step S1502 shown in FIG.
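The branching in steps S1407 to S1412 can be sketched in Python as follows. This is a simplified sketch, not the apparatus's implementation: the names `passes_conditions`, `judge_answer`, and `ask_reviewer` are hypothetical, and the character check screen is modelled as a callback that returns the character the reviewer confirms or selects.

```python
def passes_conditions(distance, adjacent_diff,
                      distance_threshold, adjacent_diff_threshold):
    """First condition (S1407) and second condition (S1408)."""
    return (distance < distance_threshold
            and adjacent_diff > adjacent_diff_threshold)


def judge_answer(ocr_char, correct_char, distance, adjacent_diff,
                 distance_threshold, adjacent_diff_threshold, ask_reviewer):
    """Return (is_correct, record_sample).

    ask_reviewer stands in for the character check screen (S1410, S1411):
    it receives the OCR result and returns the character the reviewer
    confirms or selects as the right one.
    """
    if passes_conditions(distance, adjacent_diff,
                         distance_threshold, adjacent_diff_threshold):
        # S1409, then S1501: the OCR result is trusted and the sample recorded.
        return ocr_char == correct_char, True
    confirmed_char = ask_reviewer(ocr_char)
    if confirmed_char == ocr_char:
        # S1411: Yes -> S1409, then S1501.
        return ocr_char == correct_char, True
    # S1411: No -> S1412, then S1502 (the recording step is skipped).
    return confirmed_char == correct_char, False
```

The second element of the returned tuple reflects the flowchart's two exits: S1501 (record the distance value and adjacent difference) is reached only when the OCR result is accepted, either automatically or after visual confirmation.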

図１５のフローチャートにおいて、まず、画像処理装置１０１は、特定した学年、科目および設問パターンの組合せに対応付けて、特定した回答の文字の第１の文字との距離値、および、算出した第１の文字／第２の文字間の距離値の隣接差を距離値／隣接差テーブル１３００に記録する（ステップＳ１５０１）。 In the flowchart of FIG. 15, the image processing apparatus 101 first records, in the distance value / adjacent difference table 1300, the identified distance value between the answer character and the first character and the calculated adjacent difference between the distance values of the first and second characters, in association with the identified combination of grade, subject, and question pattern (step S1501).

つぎに、画像処理装置１０１は、小片画像情報の答案ＩＤおよび設問ＩＤと対応付けて、ステップＳ１４０９またはステップＳ１４１２において判定した判定結果を出力する（ステップＳ１５０２）。そして、画像処理装置１０１は、小片画像ＤＢ２３０から選択されていない未選択の小片画像情報があるか否かを判断する（ステップＳ１５０３）。 Next, the image processing apparatus 101 outputs the determination result determined in step S1409 or step S1412 in association with the answer ID and question ID of the small piece image information (step S1502). Then, the image processing apparatus 101 determines whether there is unselected small piece image information not selected from the small piece image DB 230 (step S1503).

ここで、未選択の小片画像情報がある場合（ステップＳ１５０３：Ｙｅｓ）、画像処理装置１０１は、図１４に示したステップＳ１４０１に戻る。一方、未選択の小片画像情報がない場合（ステップＳ１５０３：Ｎｏ）、画像処理装置１０１は、距離値／隣接差テーブル１３００を参照して、学年、科目および設問パターンの組合せごとに、各対象文字について、距離閾値、隣接差閾値を算出する（ステップＳ１５０４）。 If there is unselected small piece image information (step S1503: Yes), the image processing apparatus 101 returns to step S1401 shown in FIG. 14. On the other hand, if there is no unselected small piece image information (step S1503: No), the image processing apparatus 101 refers to the distance value / adjacent difference table 1300 and calculates the distance threshold and the adjacent difference threshold for each target character, for each combination of grade, subject, and question pattern (step S1504).

そして、画像処理装置１０１は、算出した算出結果に基づいて、閾値テーブル２５０に記憶された各対象文字についての距離閾値、隣接差閾値を更新して（ステップＳ１５０５）、本フローチャートによる一連の処理を終了する。なお、画像処理装置１０１は、ステップＳ１５０２において出力した判定結果に基づいて、各児童の答案を採点して、各児童の答案の採点結果を出力することにしてもよい。 Then, based on the calculation results, the image processing apparatus 101 updates the distance threshold and the adjacent difference threshold for each target character stored in the threshold table 250 (step S1505), and ends the series of processes in this flowchart. Note that the image processing apparatus 101 may score each child's answer sheet based on the determination results output in step S1502 and output the scoring result for each child's answer sheet.

これにより、学年、科目および設問パターンの組合せに対応する距離閾値、隣接差閾値を用いて、ＯＣＲで認識した文字の確からしさを検証することができる。また、第１の条件または第２の条件を満たさなかったものの、目視確認（目検）によって、回答の文字がＯＣＲにより正しく認識されていたと判断されたときの距離値および距離値の隣接差を用いて、距離閾値および隣接差閾値を更新することができる。 As a result, the certainty of a character recognized by OCR can be verified using the distance threshold and the adjacent difference threshold corresponding to the combination of grade, subject, and question pattern. In addition, the distance threshold and the adjacent difference threshold can be updated using the distance value and the adjacent difference in distance values from cases where the first condition or the second condition was not satisfied but visual confirmation (visual inspection) determined that the answer character had been correctly recognized by OCR.

以上説明したように、実施の形態にかかる画像処理装置１０１によれば、答案の画像データからＯＣＲにより第１の文字であると認識された設問に対する回答の文字の第１の文字との距離値を、答案及び／又は設問に対応付けた学年、科目および設問パターンのうちの少なくともいずれかに対応する距離閾値と比較することができる。そして、画像処理装置１０１によれば、回答の文字の第１の文字との距離値が距離閾値よりも小さい場合に、回答の文字と設問の正答の文字とが一致するか否かを判定することができる。 As described above, the image processing apparatus 101 according to the embodiment can compare the distance value between the first character and the answer character for a question, recognized by OCR as the first character from the image data of the answer sheet, with the distance threshold corresponding to at least one of the grade, subject, and question pattern associated with the answer sheet and/or the question. When the distance value between the answer character and the first character is smaller than the distance threshold, the image processing apparatus 101 can then determine whether the answer character matches the character of the correct answer to the question.

これにより、学年、科目および設問パターンのうちの少なくともいずれかに対応する距離閾値を用いて、ＯＣＲで認識した文字の確からしさを検証して文字認識精度の向上を図ることができる。例えば、学年に対応する距離閾値を用いることで、学年の違いによって文字を書く能力が異なることを考慮して、ＯＣＲで認識した文字の確からしさを検証することができる。 Thereby, using the distance threshold corresponding to at least one of the school year, the subject, and the question pattern, the accuracy of the character recognized by the OCR can be verified to improve the character recognition accuracy. For example, by using the distance threshold corresponding to the grade, it is possible to verify the certainty of the character recognized by the OCR in consideration of the difference in the ability to write characters depending on the grade.

また、画像処理装置１０１によれば、回答の文字の第１の文字との距離値が距離閾値以上の場合、回答の文字の画像データを出力またはハイライト表示することができる。これにより、ＯＣＲで認識した文字が正しいとはいえない場合に、人手による目視確認（目検）を促すことができる。 Further, according to the image processing apparatus 101, when the distance value between the reply character and the first character is equal to or greater than the distance threshold value, the image data of the reply character can be output or highlighted. Thereby, when the character recognized by OCR cannot be said to be correct, it is possible to prompt visual confirmation (eye check) by hand.

また、画像処理装置１０１によれば、出力またはハイライト表示した回答の文字の画像データに対して、回答の文字が第１の文字と一致するとの入力を受け付けた場合、回答の文字の第１の文字との距離値に基づいて、距離閾値を更新することができる。これにより、第１の条件は満たさなかったものの、人手による目視確認（目検）を実施したところ、答案に記入された回答の文字がＯＣＲにより正しく認識されていたと判断されたときの距離値を用いて、距離閾値を更新することができる。 Further, according to the image processing apparatus 101, when an input indicating that the answer character matches the first character is received for the image data of the answer character that is output or highlighted, the first character of the answer character is received. The distance threshold value can be updated based on the distance value to the character. As a result, although the first condition was not satisfied, the distance value when it was determined that the character of the answer written in the answer was correctly recognized by the OCR when the visual check (check) was performed manually. Can be used to update the distance threshold.

具体的には、例えば、画像処理装置１０１は、第１の条件を満たすと判断された場合、答案及び／又は設問に対応付けた学年、科目および設問パターンのうちの少なくともいずれかに対応付けて、回答の文字の第１の文字との距離値を記録する。また、画像処理装置１０１は、回答の文字が第１の文字と一致するとの入力を受け付けた場合、答案及び／又は設問に対応付けた学年、科目および設問パターンのうちの少なくともいずれかに対応付けて、回答の文字の第１の文字との距離値を記録する。そして、画像処理装置１０１は、学年、科目および設問パターンのうちの少なくともいずれかに対応付けて記録した距離値の統計値（例えば、平均値、標準偏差など）に基づいて、距離閾値を更新する。 Specifically, for example, when the image processing apparatus 101 determines that the first condition is satisfied, the image processing apparatus 101 associates it with at least one of a grade, a subject, and a question pattern associated with the answer and / or the question. The distance value between the answer character and the first character is recorded. Further, when the image processing apparatus 101 receives an input indicating that the character of the answer matches the first character, the image processing device 101 associates it with at least one of the grade, the subject, and the question pattern associated with the answer and / or the question. Then, the distance value between the answer character and the first character is recorded. Then, the image processing apparatus 101 updates the distance threshold based on a statistical value (for example, average value, standard deviation, etc.) of the distance value recorded in association with at least one of the grade, subject, and question pattern. .

これにより、画像処理システム２００を運用しながら、ＯＣＲにより正しく認識されていたときの距離値を収集し、収集した距離値を用いて、学年、科目、設問パターン等の特性に応じた距離閾値を統計的に求めることができる。この結果、距離閾値が厳しくなりすぎたときは、目視確認で正しい文字であったと判断されることが多くなり、距離閾値が緩くなるように自動調整される。一方、距離閾値が緩くなりすぎたときは、分布が中央に寄り、距離閾値が厳しくなるように自動調整される。 As a result, while the image processing system 200 is in operation, distance values from cases in which characters were correctly recognized by OCR can be collected, and the collected distance values can be used to statistically derive distance thresholds suited to characteristics such as grade, subject, and question pattern. Consequently, when the distance threshold becomes too strict, characters are more often judged correct on visual confirmation, and the distance threshold is automatically adjusted to be more lenient. Conversely, when the distance threshold becomes too lenient, the distribution shifts toward the center, and the distance threshold is automatically adjusted to be stricter.

また、画像処理装置１０１によれば、第１の文字／第２の文字間の距離値の隣接差を、答案及び／又は設問に対応付けた学年、科目および設問パターンのうちの少なくともいずれかに対応する第１の文字についての隣接差閾値と比較することができる。そして、画像処理装置１０１によれば、回答の文字の第１の文字との距離値が距離閾値よりも小さく、かつ、第１の文字／第２の文字間の距離値の隣接差が隣接差閾値よりも大きい場合に、回答の文字と設問の正答の文字とが一致するか否かを判定することができる。 Further, according to the image processing apparatus 101, the adjacent difference in the distance value between the first character / second character is set to at least one of the grade, subject, and question pattern associated with the answer and / or question. It can be compared with the adjacent difference threshold for the corresponding first character. According to the image processing apparatus 101, the distance value between the reply character and the first character is smaller than the distance threshold, and the adjacent difference in the distance value between the first character and the second character is the adjacent difference. When it is larger than the threshold value, it can be determined whether the character of the answer matches the character of the correct answer of the question.

これにより、学年、科目および設問パターンのうちの少なくともいずれかに対応する対象文字ごとの隣接差閾値を用いて、２番目に一致度合いが高い文字（第２の文字）が正しい文字である可能性の高さを判断して、ＯＣＲで認識した文字の確からしさを検証することができる。 As a result, using the adjacent difference threshold for each target character corresponding to at least one of the grade, subject, and question pattern, it is possible to judge how likely it is that the character with the second-highest degree of matching (the second character) is the correct character, and thereby verify the certainty of the character recognized by OCR.

また、画像処理装置１０１によれば、第１の文字／第２の文字間の距離値の隣接差が隣接差閾値以下の場合に、回答の文字の画像データを出力またはハイライト表示することができる。これにより、第２の文字が正しい文字である可能性が無視できない程度にある場合に、人手による目視確認（目検）を促すことができる。 Further, according to the image processing apparatus 101, when the adjacent difference in the distance value between the first character and the second character is equal to or smaller than the adjacent difference threshold, the image data of the answer character can be output or highlighted. it can. Thereby, when there is a possibility that the second character is a correct character cannot be ignored, it is possible to prompt a visual check (visual check) by hand.

また、画像処理装置１０１によれば、出力またはハイライト表示した回答の文字の画像データに対して、回答の文字が第１の文字と一致するとの入力を受け付けた場合、第１の文字／第２の文字間の距離値の隣接差に基づいて、隣接差閾値を更新することができる。これにより、第２の条件は満たさなかったものの、人手による目視確認（目検）を実施したところ、答案に記入された回答の文字がＯＣＲにより正しく認識されていたと判断されたときの距離値の隣接差を用いて、隣接差閾値を更新することができる。 Also, according to the image processing apparatus 101, when an input indicating that the answer character matches the first character is received for the image data of the answer character output or highlighted, the first character / the first character is received. The adjacent difference threshold value can be updated based on the adjacent difference in the distance value between the two characters. As a result, although the second condition was not satisfied, the distance value when it was determined that the character of the answer written in the answer was correctly recognized by the OCR when visual confirmation (check) was performed manually. The adjacent difference threshold can be updated using the adjacent difference.

また、画像処理装置１０１によれば、設問と対応付けて、判定した判定結果を出力することができる。これにより、各設問に対する回答の正否を特定することができ、例えば、児童ごとに答案の自動採点を行うことが可能となる。 Further, according to the image processing apparatus 101, the determined determination result can be output in association with the question. Thereby, it is possible to specify whether the answer to each question is correct or not, for example, it is possible to automatically grade the answer for each child.

これらのことから、画像処理装置１０１によれば、学年、科目、設問パターン等の特性に応じた判断基準を用いて、ＯＣＲで認識した文字の確からしさを検証して文字認識精度の向上を図ることができる。これにより、小学生等を対象としたテストの答案についての文字認識精度を確保することが可能となり、ひいては、テストの自動採点の精度を向上させることができる。 Therefore, according to the image processing apparatus 101, the accuracy of the character recognized by the OCR is verified by using the determination criteria according to the characteristics of the grade, the subject, the question pattern, etc., and the character recognition accuracy is improved. be able to. This makes it possible to ensure character recognition accuracy for test answers intended for elementary school students and the like, and thus improve the accuracy of automatic test scoring.

なお、本実施の形態で説明した画像処理方法は、予め用意されたプログラムをパーソナル・コンピュータやワークステーション等のコンピュータで実行することにより実現することができる。本画像処理プログラムは、ハードディスク、フレキシブルディスク、ＣＤ（ＣｏｍｐａｃｔＤｉｓｃ）−ＲＯＭ、ＭＯ（Ｍａｇｎｅｔｏ−Ｏｐｔｉｃａｌｄｉｓｋ）、ＤＶＤ（ＤｉｇｉｔａｌＶｅｒｓａｔｉｌｅＤｉｓｋ）、ＵＳＢ（ＵｎｉｖｅｒｓａｌＳｅｒｉａｌＢｕｓ）メモリ等のコンピュータで読み取り可能な記録媒体に記録され、コンピュータによって記録媒体から読み出されることによって実行される。また、本画像処理プログラムは、インターネット等のネットワークを介して配布してもよい。 The image processing method described in the present embodiment can be realized by executing a program prepared in advance on a computer such as a personal computer or a workstation. This image processing program is a computer-readable recording medium such as a hard disk, flexible disk, CD (Compact Disc) -ROM, MO (Magneto-Optical disk), DVD (Digital Versatile Disk), USB (Universal Serial Bus) memory, etc. And is executed by being read from the recording medium by a computer. The image processing program may be distributed via a network such as the Internet.

上述した実施の形態に関し、さらに以下の付記を開示する。 The following additional notes are disclosed with respect to the embodiment described above.

（付記１）答案の画像データからＯＣＲにより第１の文字であると認識された設問に対する回答の文字の前記第１の文字との一致度を、記憶部に記憶した前記答案及び／又は前記設問に対応付けた学年、科目および設問の種別のうちの少なくともいずれかに対応する一致度の第１の閾値と比較し、
前記回答の文字の前記第１の文字との一致度が前記第１の閾値よりも大きい場合に、前記回答の文字と前記設問の正答の文字とが一致するか否かを判定する、
処理をコンピュータに実行させることを特徴とする画像処理プログラム。
(Supplementary Note 1) An image processing program that causes a computer to execute a process comprising: comparing a degree of matching between the first character and a character of an answer to a question, recognized by OCR as the first character from image data of an answer sheet, with a first threshold of the degree of matching that is stored in a storage unit and corresponds to at least one of a grade, a subject, and a question type associated with the answer sheet and/or the question; and determining, when the degree of matching between the answer character and the first character is greater than the first threshold, whether the answer character matches the character of the correct answer to the question.

（付記２）前記回答の文字の前記第１の文字との一致度が前記第１の閾値以下の場合、前記回答の文字の画像データを出力またはハイライト表示する、処理を前記コンピュータに実行させることを特徴とする付記１に記載の画像処理プログラム。 (Supplementary Note 2) The image processing program according to Supplementary Note 1, further causing the computer to execute a process of outputting or highlighting image data of the answer character when the degree of matching between the answer character and the first character is equal to or less than the first threshold.

（付記３）出力またはハイライト表示した前記回答の文字の画像データに対して、前記回答の文字が前記第１の文字と一致するとの入力を受け付けた場合、前記回答の文字の前記第１の文字との一致度に基づいて、前記第１の閾値を更新する、処理を前記コンピュータに実行させることを特徴とする付記２に記載の画像処理プログラム。 (Supplementary Note 3) When an input indicating that the character of the answer matches the first character is received for the image data of the character of the answer that is output or highlighted, the first character of the character of the answer The image processing program according to appendix 2, wherein the computer is caused to execute a process of updating the first threshold based on a degree of coincidence with a character.

（付記４）前記記憶部は、前記答案及び／又は前記設問に対応付けた学年、科目および設問の種別のうちの少なくともいずれかに対応する前記第１の文字についての一致度の隣接差に関する第２の閾値をさらに記憶しており、
前記回答の文字の前記第１の文字との一致度と、前記回答の文字の第２の文字との一致度との差分を前記第２の閾値と比較する、処理を前記コンピュータに実行させ、
前記判定する処理は、
前記回答の文字の前記第１の文字との一致度が前記第１の閾値よりも大きく、かつ、前記差分が前記第２の閾値よりも大きい場合に、前記回答の文字と前記設問の正答の文字とが一致するか否かを判定する、ことを特徴とする付記１〜３のいずれか一つに記載の画像処理プログラム。
(Supplementary Note 4) The image processing program according to any one of Supplementary Notes 1 to 3, wherein the storage unit further stores a second threshold relating to the adjacent difference in the degree of matching for the first character, the second threshold corresponding to at least one of the grade, the subject, and the question type associated with the answer sheet and/or the question; the program causes the computer to execute a process of comparing, with the second threshold, the difference between the degree of matching of the answer character with the first character and the degree of matching of the answer character with a second character; and the determining process determines whether the answer character matches the character of the correct answer to the question when the degree of matching of the answer character with the first character is greater than the first threshold and the difference is greater than the second threshold.

（付記５）前記設問と対応付けて、判定した判定結果を出力する、
処理を前記コンピュータに実行させることを特徴とする付記１〜４のいずれか一つに記載の画像処理プログラム。
(Supplementary Note 5) The image processing program according to any one of Supplementary Notes 1 to 4, further causing the computer to execute a process of outputting the determination result in association with the question.

(Supplementary Note 6) The image processing program according to Supplementary Note 4, further causing the computer to execute a process of outputting or highlighting the image data of the character of the answer when the degree of coincidence of the character of the answer with the first character is less than or equal to the first threshold, or when the difference is less than or equal to the second threshold.

(Supplementary Note 7) The image processing program according to Supplementary Note 6, further causing the computer to execute a process of updating the second threshold based on the difference between the degree of coincidence of the character of the answer with the first character and the degree of coincidence of the character of the answer with the second character, when an input indicating that the character of the answer matches the first character is received for the image data of the character of the answer that was output or highlighted.

(Supplementary Note 8) The image processing program according to Supplementary Note 4, wherein
the first character is the character that, among a plurality of characters registered in advance as characters to be recognized by OCR, has the highest degree of coincidence with the character of the answer, and
the second character is the character that, among the plurality of characters, has the next-highest degree of coincidence with the character of the answer after the first character.
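The selection of the first and second characters in Supplementary Note 8 amounts to taking the two best-scoring registered characters. The following Python sketch assumes the OCR engine exposes a mapping from each registered character to its degree of coincidence; the function name `top_two_candidates` and the input format are hypothetical.

```python
from typing import Dict, Tuple

Candidate = Tuple[str, float]


def top_two_candidates(scores: Dict[str, float]) -> Tuple[Candidate, Candidate]:
    """Return (first character, second character) with their scores.

    `scores` maps each registered character to its degree of coincidence
    with the handwritten character (higher means a closer match).
    """
    if len(scores) < 2:
        raise ValueError("at least two registered characters are required")
    # Sort by score, highest first. sorted() is stable, so characters with
    # equal scores keep their insertion order.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[0], ranked[1]


if __name__ == "__main__":
    first, second = top_two_candidates({"1": 62.0, "7": 91.0, "9": 40.0})
    adjacent_difference = first[1] - second[1]
    print(first, second, adjacent_difference)
```

The adjacent difference computed at the end is the quantity that Supplementary Note 4 compares with the second threshold.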

(Supplementary Note 9) An image processing method in which a computer executes a process comprising:
comparing the degree of coincidence, with a first character, of a character of an answer to a question that has been recognized by OCR as the first character from image data of an answer sheet, with a first threshold for the degree of coincidence that is stored in a storage unit and corresponds to at least one of a grade, a subject, and a question type associated with the answer sheet and/or the question; and
determining whether the character of the answer matches the character of the correct answer to the question when the degree of coincidence of the character of the answer with the first character is greater than the first threshold.

(Supplementary Note 10) An image processing apparatus comprising a control unit that:
compares the degree of coincidence, with a first character, of a character of an answer to a question that has been recognized by OCR as the first character from image data of an answer sheet, with a first threshold for the degree of coincidence that is stored in a storage unit and corresponds to at least one of a grade, a subject, and a question type associated with the answer sheet and/or the question; and
determines whether the character of the answer matches the character of the correct answer to the question when the degree of coincidence of the character of the answer with the first character is greater than the first threshold.

Description of Symbols
101 Image processing apparatus
110 Storage unit
200 Image processing system
201 Base terminal
202 Scanner
210 Network
220 Answer sheet image DB
230 Small image DB
240 Correct answer table
250 Threshold table
300 Bus
301 CPU
302 Memory
303 I/F
304 Disk drive
305 Disk
1001 Acquisition unit
1002 Identification unit
1003 Judgment unit
1004 Determination unit
1005 Output unit
1006 Reception unit
1007 Recording unit
1008 Update unit
1100 Character check screen
1200 Answer sheet check screen
1300 Distance value / adjacent difference table

Claims

An image processing program that causes a computer to execute a process comprising:
comparing the degree of coincidence, with a first character, of a character of an answer to a question that has been recognized by OCR as the first character from image data of an answer sheet, with a first threshold for the degree of coincidence that is stored in a storage unit and corresponds to at least one of a grade, a subject, and a question type associated with the answer sheet and/or the question; and
determining whether the character of the answer matches the character of the correct answer to the question when the degree of coincidence of the character of the answer with the first character is greater than the first threshold.

The image processing program according to claim 1, further causing the computer to execute a process of outputting or highlighting the image data of the character of the answer when the degree of coincidence of the character of the answer with the first character is less than or equal to the first threshold.

The image processing program according to claim 2, further causing the computer to execute a process of updating the first threshold based on the degree of coincidence of the character of the answer with the first character, when an input indicating that the character of the answer matches the first character is received for the image data of the character of the answer that was output or highlighted.

The image processing program according to any one of claims 1 to 3, wherein
the storage unit further stores a second threshold relating to the adjacent difference in the degree of coincidence for the first character, the second threshold corresponding to at least one of a grade, a subject, and a question type associated with the answer sheet and/or the question,
the program causes the computer to execute a process of comparing, with the second threshold, the difference between the degree of coincidence of the character of the answer with the first character and the degree of coincidence of the character of the answer with a second character, and
the determining process determines whether the character of the answer matches the character of the correct answer to the question when the degree of coincidence of the character of the answer with the first character is greater than the first threshold and the difference is greater than the second threshold.

The image processing program according to any one of claims 1 to 4, further causing the computer to execute a process of outputting the determination result in association with the question.

An image processing method in which a computer executes a process comprising:
comparing the degree of coincidence, with a first character, of a character of an answer to a question that has been recognized by OCR as the first character from image data of an answer sheet, with a first threshold for the degree of coincidence that is stored in a storage unit and corresponds to at least one of a grade, a subject, and a question type associated with the answer sheet and/or the question; and
determining whether the character of the answer matches the character of the correct answer to the question when the degree of coincidence of the character of the answer with the first character is greater than the first threshold.

An image processing apparatus comprising a control unit that:
compares the degree of coincidence, with a first character, of a character of an answer to a question that has been recognized by OCR as the first character from image data of an answer sheet, with a first threshold for the degree of coincidence that is stored in a storage unit and corresponds to at least one of a grade, a subject, and a question type associated with the answer sheet and/or the question; and
determines whether the character of the answer matches the character of the correct answer to the question when the degree of coincidence of the character of the answer with the first character is greater than the first threshold.