JP7456580B2

JP7456580B2 - Information processing device, information processing system, and information processing method

Info

Publication number: JP7456580B2
Application number: JP2020040753A
Authority: JP
Inventors: 健太柴田; 龍小野; 英樹蝦名; 希彦岩田; 知昭古崎
Original assignee: Glory Ltd
Current assignee: Glory Ltd
Priority date: 2020-03-10
Filing date: 2020-03-10
Publication date: 2024-03-27
Anticipated expiration: 2040-03-10
Also published as: JP2021144289A

Description

The present invention relates to an information processing device, an information processing system, and an information processing method.

To manage the daily sales of each store, POS (Point Of Sale System) registers provide a function that outputs a settlement receipt totaling the day's sales. Some users make use of the information printed on the settlement receipt by manually entering it into spreadsheet software or the like.

JP 2012-221183 A

Settlement receipts are generally long, and many characters are printed on them. Consequently, applying general OCR (Optical Character Recognition) technology to recognize every character printed on a settlement receipt takes a long time. For example, OCR processing of a single settlement receipt can require several tens of seconds. Several tens of seconds is a long time for a user waiting for the OCR result, and when a user wants to OCR-process multiple settlement receipts together, the working time is lengthened by the waiting time.

An object of the present invention is to shorten the time required to recognize the necessary character information from a receipt.

The invention according to claim 1 is an information processing device comprising: a control unit; an acquisition unit that acquires, from an imaging unit, image data of a settlement receipt captured by that imaging unit; a cutout processing unit that sequentially cuts out, one line at a time, partial image data of a fixed number of characters from the head of each line of the image data that contains characters; and a character recognition unit that recognizes character information from the cut-out partial image data, wherein the control unit determines the entire corresponding line to be a target of character recognition when a registered word corresponding to the character information output from the character recognition unit exists, and excludes the entire corresponding line from the targets of character recognition when no registered word corresponding to the character information output from the character recognition unit exists.
The invention according to claim 2 is the information processing device according to claim 1, wherein the control unit determines amount information corresponding to the character information output from the character recognition unit.
The invention according to claim 3 is the information processing device according to claim 1 or 2, wherein, for one piece of the partial image data, the line at which the character recognition unit ends recognition of the character information is located higher on the receipt than the line at which it starts recognition of the character information.
The invention according to claim 4 is an information processing system comprising: a control unit; an imaging unit that captures an image of a settlement receipt and outputs image data; an acquisition unit that acquires the image data from the imaging unit; a cutout processing unit that sequentially cuts out, one line at a time, partial image data of a fixed number of characters from the head of each line of the image data that contains characters; and a character recognition unit that recognizes character information from the cut-out partial image data, wherein the control unit determines the entire corresponding line to be a target of character recognition when a registered word corresponding to the character information output from the character recognition unit exists, and excludes the entire corresponding line from the targets of character recognition when no registered word corresponding to the character information output from the character recognition unit exists.
The invention according to claim 5 is an information processing method comprising: a process of acquiring image data of a settlement receipt; a process of sequentially cutting out, one line at a time, partial image data of a fixed number of characters from the head of each line of the image data that contains characters; a process of recognizing character information from the cut-out partial image data; a process of determining the entire corresponding line to be a target of character recognition when a registered word corresponding to the recognized character information exists; and a process of excluding the entire corresponding line from the targets of character recognition when no registered word corresponding to the character information output from a character recognition unit exists.

According to the present invention, the time required to recognize the necessary character information from a receipt can be shortened.

FIG. 1 is a diagram illustrating a configuration example of the information processing system used in Embodiment 1.
FIG. 2 is a diagram illustrating an example of the correspondence between words that may be used on a settlement receipt and words handled by the automatic input service for sales proceeds and the like.
FIG. 3 is a diagram illustrating a configuration example of the user terminal used in Embodiment 1.
FIG. 4 is a diagram illustrating a configuration example of the OCR server used in Embodiment 1.
FIG. 5 is a flowchart illustrating an example of the processing operations executed by the OCR server used in Embodiment 1.
FIG. 6 is a diagram illustrating an example of an image displayed on the display unit of the user terminal.
FIG. 7 is a diagram illustrating the portion of the character image data corresponding to each line that is used to search for target characters.
FIG. 8 is a diagram illustrating a specific example of the result of the search for target characters.
FIG. 9 is a diagram illustrating an example of information linked to the information recognized by OCR processing.
FIG. 10 is a diagram illustrating an example of matching.
FIG. 11 is a flowchart illustrating the processing used to select a value.
FIG. 12 is a diagram illustrating the selection of a value in step 7.
FIG. 13 is a flowchart illustrating the processing that selects a value for each group.
FIG. 14 is a diagram illustrating an example of selecting one of two values linked to the group "現金売上" (cash sales).
FIG. 15 is a diagram illustrating an example of selecting one of two values linked to the group "純売上" (net sales).
FIG. 16 is a diagram illustrating an example of a screen, displayed on the display unit of the user terminal, showing the result of automatic input of sales proceeds and the like.
FIG. 17 is a diagram illustrating the range over which the OCR server executes OCR processing in Embodiment 2.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings.
<Embodiment 1>
<System configuration>
FIG. 1 is a diagram illustrating a configuration example of the information processing system 1 used in Embodiment 1.
The information processing system 1 shown in FIG. 1 is composed of a plurality of user terminals 100 and an OCR server 200 that performs OCR processing on image data uploaded from the user terminals 100 and notifies the user terminal 100 of predetermined information from among the recognized characters.
In the information processing system 1 shown in FIG. 1, the user terminals 100 and the OCR server 200 are connected through a cloud network 300. The OCR server 200 is therefore an example of a cloud server.

The user terminal 100 in this embodiment uploads image data generated by capturing an image of a settlement receipt to the OCR server 200. In this embodiment, a settlement receipt means a receipt on which information obtained by statistically processing each day's sales is printed. In other words, a settlement receipt is not a receipt on which the amounts of individual transactions are printed one by one, but a receipt on which amounts totaled by preset item are printed for each business day. Sales include, for example, cash sales, credit card sales, and electronic money sales. A further characteristic of a settlement receipt is that the number of values linked to one item is not limited to one. In other words, a settlement receipt may include items to which multiple values are linked. In contrast, on a receipt handed to a consumer, each product or service that the consumer purchased or received is linked one-to-one with the corresponding amount.

The user terminal 100 shown in FIG. 1 is assumed to be a smartphone. However, the user terminal 100 may be a terminal other than a smartphone, for example, a tablet terminal.
The user terminal 100 in this embodiment uploads image data of a settlement receipt captured by its built-in camera to the OCR server 200. Alternatively, the settlement receipt may be imaged with a digital camera or scanner that is separate from the user terminal 100, and the user terminal 100 may obtain the image data output from that device and upload it to the OCR server 200.
Note that, as long as image data captured by a digital camera or the like is uploaded to the OCR server 200, a notebook computer or a desktop computer may be used as the user terminal 100.

The user terminal 100 and the OCR server 200 in this embodiment cooperate to provide an automatic input service for sales proceeds and the like. For this purpose, a dedicated application program (hereinafter referred to as the "app") that cooperates with the automatic input service is installed on the user terminal 100. The user therefore captures the image of the settlement receipt by following the instructions of the launched app.
Of course, when another type of app is running on the user terminal 100, the object to be imaged is not limited to a settlement receipt.

The OCR server 200 in this embodiment provides the automatic input service for sales proceeds and the like in cooperation with the app executed on the user terminal 100. The OCR server 200 reads numerical values related to sales and the like from the received image data of the settlement receipt and notifies the user terminal 100 that uploaded the data.
The wording of the sales-related words printed on a settlement receipt differs depending on the manufacturer of the POS register. The OCR server 200 in this embodiment is therefore provided with a table that records the correspondence between the words used in the app executed on the user terminal 100 and the words that may be printed on a settlement receipt. In this embodiment, those words printed on a settlement receipt that are registered in the table are referred to as registered words.

FIG. 2 is a diagram illustrating an example of the correspondence between words that may be used on a settlement receipt and words handled by the automatic input service for sales proceeds and the like.
In FIG. 2, the words handled by the automatic input service are shown in the group column. The correspondence shown in FIG. 2 is held in the OCR server 200 as a registered word table. In FIG. 2, the registered words "純売", "純売上", "純売上合計", "純売上高", and "＊＊総合計" are given as examples corresponding to "純売上" (net sales) in the group column. The variation in how registered words are expressed depends on the settings of the POS register manufacturer.
Likewise, "現金" and "現金残高" are given as examples of registered words corresponding to "現金売上" (cash sales) in the group column.
Note that the words set in the group column are only examples. The words set in the group column depend on the expressions used by the automatic input service for sales proceeds and the like.
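As a rough illustration, the registered word table of FIG. 2 could be held as a mapping from each group word to its registered words, together with a reverse index for lookup. This is only a sketch: the Python names `REGISTERED_WORDS`, `build_reverse_index`, and `group_of` are hypothetical and do not appear in the specification; only the words themselves are taken from the description of FIG. 2.

```python
# Hypothetical in-memory form of the registered word table (FIG. 2).
# Keys are the group words used by the automatic input service;
# values are the registered words that may be printed on a settlement receipt.
REGISTERED_WORDS = {
    "純売上": ["純売", "純売上", "純売上合計", "純売上高", "＊＊総合計"],
    "現金売上": ["現金", "現金残高"],
}


def build_reverse_index(table):
    """Map each registered word to the group word it belongs to."""
    index = {}
    for group, words in table.items():
        for word in words:
            index[word] = group
    return index


WORD_TO_GROUP = build_reverse_index(REGISTERED_WORDS)


def group_of(word):
    """Return the group word for a registered word, or None if unregistered."""
    return WORD_TO_GROUP.get(word)
```

With this layout, `group_of("純売上高")` returns `"純売上"`, and a word that is not in the table returns `None`.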

<Configuration of user terminal>
FIG. 3 is a diagram illustrating a configuration example of the user terminal 100 used in Embodiment 1.
The user terminal 100 is a so-called computer. The user terminal 100 has a control unit 101 that controls the operation of each unit, a camera 102, a display unit 103, an operation input unit 104, a storage unit 105, and a communication unit 106.
The control unit 101 is composed of a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like (not shown). The ROM stores data such as the BIOS (Basic Input Output System). The RAM is used as a work area for the app.

As the app that cooperates with the automatic input service for sales proceeds and the like is executed, the control unit 101 in this embodiment displays a screen prompting the user to capture an image of a settlement receipt, uploads the captured image data to the OCR server 200, displays a screen populated with the numerical values such as sales proceeds notified by the OCR server 200, and so on.
Of course, when another app is running, the control unit 101 executes processing according to that app.
The camera 102 is an example of an imaging unit. In this embodiment, the camera 102 is provided in the main body of the user terminal 100. There are no restrictions on the use of the camera 102, and it is also used for purposes other than capturing images of settlement receipts.
The display unit 103 is, for example, a liquid crystal display or an organic EL (Electro Luminescence) display. The display unit 103 displays images captured by the camera 102, a confirmation screen for the automatically entered sales proceeds and the like, and so on.

The operation input unit 104 is composed of, for example, a touch sensor and physical buttons. The operation input unit 104 serving as a touch sensor is arranged on the surface of the display unit 103 and constitutes a touch panel.
The storage unit 105 is composed of, for example, a semiconductor memory, and stores captured image data 105A and the app. When the user terminal 100 is a notebook computer or a desktop computer, the storage unit 105 may be a hard disk device.
The communication unit 106 is a device used for communication with external terminals connected to the cloud network 300 or the like. The communication unit 106 complies with various communication standards, for example, wireless LAN (Local Area Network) and wired LAN standards and mobile communication systems such as 4G and 5G.

<Configuration of OCR server>
FIG. 4 is a diagram illustrating a configuration example of the OCR server 200 used in Embodiment 1.
The OCR server 200 is a so-called computer. The OCR server 200 has a control unit 201 that controls the operation of each unit, a storage unit 202, and a communication unit 203.
The control unit 201 is composed of a CPU, a ROM, a RAM, and the like (not shown). The ROM stores data such as the BIOS. The RAM is used as a work area for the app. The OCR server 200 is an example of an information processing device.

Through execution of the app, the control unit 201 in this embodiment functions as an image data acquisition unit 201A, a character cutout processing unit 201B, an OCR processing unit 201C, a matching unit 201D, a value selection unit 201E, and a group-by-group result selection unit 201F. These functional units realize the automatic input service in cooperation with the user terminal 100.
The image data acquisition unit 201A is a functional unit that acquires the image data 105A from the user terminal 100. The image data acquisition unit 201A is an example of an acquisition unit, and stores the acquired image data 105A in the storage unit 202 as image data 202A.

The character cutout processing unit 201B is a functional unit that cuts out character image data containing characters from the image data 202A. An image portion corresponding to a line containing characters is an example of character image data. The character cutout processing unit 201B is an example of a cutout processing unit.
The character cutout processing unit 201B in this embodiment cuts out a predetermined specific portion of the cut-out character string image data, subjects it to OCR processing, and determines whether the line contains a registered word. The character cutout processing unit 201B in this embodiment uses the two characters that appear at the head of the character image data as the specific portion.

The character cutout processing unit 201B starts processing the image data 202A after the image has been adjusted so that each character in the image data is upright (its top and bottom correctly oriented).
In this embodiment, the first two characters of each line containing characters are cut out and OCR-processed, but the number of characters cut out from the head of each line may be one, or three or more.
The number of characters to cut out from the head of each line is set in advance. Any number suffices as long as it allows determining whether the character image data contains a registered word. The number of characters to cut out is therefore determined according to the content of the registered words. That said, fewer characters are preferable because the OCR processing time becomes shorter. The portion in which the first two characters are printed is an example of the left end portion of the character image data.
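The dependence of the cutout length on the registered words can be pictured with a small sketch. It assumes the registered words exemplified for FIG. 2 and simply collects their leading n characters; the function name `prefixes` is hypothetical and is not taken from the specification.

```python
# Registered words exemplified for FIG. 2.
REGISTERED_WORDS = [
    "純売", "純売上", "純売上合計", "純売上高", "＊＊総合計",
    "現金", "現金残高",
]


def prefixes(words, n):
    """Return the set of leading n-character strings of the registered words."""
    return {word[:n] for word in words}
```

For these words, `prefixes(REGISTERED_WORDS, 2)` yields the three strings "純売", "＊＊", and "現金", so two characters per line are enough to recognize a candidate line. A shorter cutout reduces the OCR work per line further, but it may also let more unrelated lines pass the check, which is presumably why the length is chosen with the registered words in mind.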

In this embodiment, the two characters read out are part of the item part. If the two cut-out characters contain the leading part of a registered word, the character cutout processing unit 201B determines the corresponding line to be a line whose characters are all OCR-processed. If the two cut-out characters do not contain the leading part of a registered word, the character cutout processing unit 201B moves on to the determination for the character image data of the next line.
Because only the few characters at the head of the character image data are OCR-processed to decide whether to OCR-process all the characters of the line, the number of characters that are OCR-processed can be reduced compared with OCR-processing every character on every line. This in turn shortens the time required for OCR processing.
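The line screening described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: each line is represented by a plain string instead of character image data, and `recognize_prefix` is a hypothetical stand-in for OCR of the leading characters. The names `select_lines_for_full_ocr` and `fake_ocr` are likewise illustrative.

```python
REGISTERED_WORDS = [
    "純売", "純売上", "純売上合計", "純売上高", "＊＊総合計",
    "現金", "現金残高",
]
PREFIX_LENGTH = 2  # number of leading characters cut out from each line


def select_lines_for_full_ocr(lines, recognize_prefix):
    """Return the indexes of lines whose leading characters match a registered word.

    `lines` stands in for the per-line character image data, and
    `recognize_prefix(line, n)` stands in for OCR of the first n characters.
    """
    targets = {word[:PREFIX_LENGTH] for word in REGISTERED_WORDS}
    selected = []
    for index, line in enumerate(lines):
        if recognize_prefix(line, PREFIX_LENGTH) in targets:
            selected.append(index)  # the whole line becomes an OCR target
        # otherwise the whole line is excluded and the next line is examined
    return selected


def fake_ocr(line, n):
    """Stand-in for the OCR call: the 'image' here is already text."""
    return line[:n]
```

For the made-up lines `["点検 0001", "純売上 12,300", "客数 25", "現金 8,000"]`, `select_lines_for_full_ocr(lines, fake_ocr)` returns `[1, 3]`: only the second and fourth lines would go on to full-line OCR, and for the other two only the two leading characters are recognized.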

The character cutout processing unit 201B in this embodiment takes the character image data as determination targets in order from the lower end of the image data 202A toward its upper end. This is because the characters to be automatically input in this embodiment, such as sales proceeds, appear in the lower part of the settlement receipt regardless of the POS register. In other words, the character cutout processing unit 201B scans the image data 202A from its lower end toward its upper end.
When the character cutout processing unit 201B finds character image data in which no characters are printed in the specific portion, it links the first line above it in which characters appear in the specific portion as the parent line of the character image data being processed. In other words, the numerical part of a line that has a parent line is linked to the item part of the parent line. A line that has a parent line is also set as a line whose characters are all OCR-processed.
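The bottom-to-top scan and the parent-line linking can be sketched as below. The sketch assumes that the result of recognizing the specific portion of each line is already available as a string, with an empty string meaning that nothing is printed there. The function name `link_parent_lines` and the sample data are hypothetical.

```python
def link_parent_lines(prefixes):
    """Scan from the bottom line upward and link blank-prefix lines to a parent.

    `prefixes` lists, from top to bottom, the recognized leading characters of
    each line; an empty string means nothing is printed in the specific portion.
    Returns a dict {child_index: parent_index}.
    """
    parents = {}
    for index in range(len(prefixes) - 1, -1, -1):  # lower end -> upper end
        if prefixes[index].strip():
            continue  # characters are printed in the specific portion
        # Look upward for the first line whose specific portion has characters.
        for above in range(index - 1, -1, -1):
            if prefixes[above].strip():
                parents[index] = above
                break
    return parents
```

For the made-up input `["点検", "現金", "", "", "純売"]`, the two blank-prefix lines (indexes 2 and 3) are both linked to index 1, so their numerical parts would be associated with the item part "現金".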

The OCR processing unit 201C recognizes character information by OCR-processing the specific portions and the character image data of the lines set as OCR targets. The OCR processing unit 201C is an example of a character recognition unit.
The OCR processing unit 201C in this embodiment splits the character image data of a line set as an OCR target into an item part and a numerical part, and applies a dedicated OCR model to each part to recognize the corresponding character information.
The OCR models here are generated using deep learning technology. When character image data identified as one character is input, an OCR model outputs the most likely character information (for example, a character code). A confidence indicating the likelihood of the recognition is calculated for each character. The confidence of a recognized word is given, for example, as the average of the confidences of the characters that make up the word.
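The word-level confidence mentioned as an example above amounts to a simple average. A minimal sketch, assuming the per-character confidences are available as numbers between 0 and 1 (the range and the function name `word_confidence` are assumptions, not part of the specification):

```python
from statistics import mean


def word_confidence(char_confidences):
    """Average the per-character confidences of one recognized word."""
    if not char_confidences:
        raise ValueError("a word needs at least one character")
    return mean(char_confidences)
```

For example, a three-character word whose characters were recognized with confidences 0.9, 0.8, and 1.0 would receive a word confidence of about 0.9.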

The matching unit 201D is a function that refers to the registered word table 202B and links the character information recognized as a result of OCR processing (hereinafter referred to as the "raw OCR result") to the word of the corresponding group in accordance with the registered word table 202B.
The value selection unit 201E is a function that, when one item (item part) has multiple values, selects the one value to be associated with that item. An item with multiple values may involve a parent line. For example, when two values are associated with one item, the value selection unit 201E selects the value that can be converted to the Currency type. If both values can be converted to the Currency type, the value selection unit 201E selects the larger of the two.
The group-by-group result selection unit 201F is a function that, when one word in the group column has multiple values, selects the one value to be associated with that word. For example, when two values are associated with one word of a group, the group-by-group result selection unit 201F selects the value whose item part was OCR-processed with the higher confidence.
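The two selection rules can be sketched as follows. The specification does not state what makes a string convertible to the Currency type, so the pattern below (digits with optional thousands separators and an optional leading yen sign) is an assumption made only for illustration, as are the names `to_currency`, `select_value`, and `select_group_result`.

```python
import re

# Assumed notion of "convertible to the Currency type": digits, optionally
# grouped by commas, optionally preceded by a yen sign.
_AMOUNT = re.compile(r"[¥￥]?(\d{1,3}(?:,\d{3})+|\d+)")


def to_currency(text):
    """Return the amount as an int, or None if the text is not convertible."""
    match = _AMOUNT.fullmatch(text.strip())
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def select_value(values):
    """Pick one value for an item: convertible values only, then the largest."""
    amounts = [a for a in map(to_currency, values) if a is not None]
    return max(amounts) if amounts else None


def select_group_result(candidates):
    """Pick the value whose item part was recognized with the highest confidence.

    `candidates` is a list of (item_confidence, amount) pairs for one group word.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]
```

With the made-up inputs `select_value(["25点", "8,000"])` the result is `8000`, because only the second string is convertible, and `select_value(["1,200", "8,000"])` also gives `8000`, the larger of two convertible values. `select_group_result([(0.72, 8000), (0.95, 7500)])` returns `7500`, the value attached to the item part recognized with the higher confidence.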

The storage unit 202 stores the uploaded image data 202A, the registered word table 202B, and the app that cooperates with the app on the user terminal 100 side to realize the automatic input service for sales proceeds and the like.
The storage unit 202 is composed of, for example, a hard disk device or a semiconductor memory. The storage unit 202 may instead be connected to the cloud network 300 as a cloud storage server.
The communication unit 203 is a device used for communication with external terminals connected to the cloud network 300 or the like. The communication unit 203 complies with various communication standards, for example, wireless LAN and wired LAN standards and mobile communication systems such as 4G and 5G.

<Processing operation>
Next, an example of the processing operations executed by the OCR server 200 will be described.
FIG. 5 is a flowchart illustrating an example of the processing operations executed by the OCR server 200 used in Embodiment 1. The symbol S in the figure denotes a step. The processing operations shown in FIG. 5 are an example of an information processing method.
The series of operations executed by the OCR server 200 starts when an image of a settlement receipt is uploaded from the user terminal 100.
First, on the user terminal 100, the user launches the app that cooperates with the automatic input service for sales proceeds and the like (step 1).

When the app is launched, the display unit 103 (see FIG. 3) of the user terminal 100 displays an image prompting the user to capture the settlement receipt with the camera 102 (see FIG. 3), together with the live image output from the camera 102.
FIG. 6 is a diagram illustrating an example of an image displayed on the display unit 103 of the user terminal 100. The display unit 103 displays an image 110 of the settlement receipt captured by the camera 102 (see FIG. 3) and a shutter button 120. In FIG. 6, the upper end portion of the image 110 of the settlement receipt is omitted.

ユーザがシャッターボタン１２０をタップすると、表示部１０３に表示されている画像のデータ（画像データ１０５Ａ）が記憶部１０５（図３参照）に記憶される。なお、精算レシートの画像１１０は、１枚に限らず、複数枚撮像することも可能である。
図５の説明に戻る。
ユーザ端末１００を操作するユーザがアップロードする画像データ１０５Ａを選択すると、ユーザ端末１００は、精算レシートの画像データ１０５Ａをアップロードする（ステップ２）。なお、シャッターボタン１２０（図６参照）の操作によって記憶された画像データ１０５Ａを自動的にアップロードする設定の場合には、ユーザによるアップロードする画像データの選択は不要である。 When the user taps the shutter button 120, the data of the image displayed on the display unit 103 (image data 105A) is stored in the storage unit 105 (see FIG. 3). Note that the number of images 110 of the payment receipt is not limited to one, but it is also possible to capture a plurality of images.
Returning to the explanation of FIG. 5.
When the user operating the user terminal 100 selects the image data 105A to be uploaded, the user terminal 100 uploads the image data 105A of the payment receipt (step 2). Note that if the image data 105A stored by operating the shutter button 120 (see FIG. 6) is set to be automatically uploaded, the user does not need to select the image data to be uploaded.

ＯＣＲサーバ２００は、アップロードされた画像データ１０５Ａを、記憶部２０２（図４参照）に画像データ２０２Ａ（図４参照）として記憶する。
記憶部２０２に画像データ２０２Ａが記憶されると、ＯＣＲサーバ２００は、画像データ２０２Ａに含まれる文字画像データを切り出す（ステップ３）。従って、文字を含まない背景部分や余白部分は、切り出しの対象から除外される。
次に、ＯＣＲサーバ２００は、各行に対応する文字画像データの特定部分だけをＯＣＲ処理して対象文字をサーチする（ステップ４）。本実施の形態の場合、対象文字は、登録単語の先頭側の２文字である。 The OCR server 200 stores the uploaded image data 105A in the storage unit 202 (see FIG. 4) as image data 202A (see FIG. 4).
When the image data 202A is stored in the storage unit 202, the OCR server 200 cuts out character image data included in the image data 202A (step 3). Therefore, background portions and margin portions that do not include characters are excluded from the extraction target.
Next, the OCR server 200 performs OCR processing on only a specific portion of the character image data corresponding to each line to search for target characters (step 4). In this embodiment, the target characters are the first two characters of the registered word.

図７は、文字画像データのうち対象文字のサーチに用いる部分を説明する図である。本実施の形態の場合、文字画像データの先頭側の２文字が切り出される。図７に示すように、文字の切り出しは、精算レシートの画像データ２０２Ａ（図４参照）の下端側から上方に１行ずつ順番に実行される。図７では、この切り出し処理が実行される方向をスキャン方向と示している。
例えば精算レシートの最下端の行には「現金残高＼37,462」が印字されているが、行の先頭に位置する「現金」が特定部分として切り出される。因みに、「現金残高」が項目部であり、「＼37,462」が数値部である。
「現金」は、登録単語である「現金残高」の先頭２文字であり、対象文字の一例である。従って、最下端の行は、行全体の文字をＯＣＲ処理する対象に判定される。 FIG. 7 is a diagram illustrating a portion of character image data used for searching for a target character. In the case of this embodiment, the first two characters of the character image data are cut out. As shown in FIG. 7, the characters are cut out line by line from the bottom of the image data 202A of the payment receipt (see FIG. 4) upwards. In FIG. 7, the direction in which this cutting process is executed is shown as the scan direction.
For example, "Cash balance \37,462" is printed on the bottom line of the settlement receipt, but "Cash" located at the beginning of the line is cut out as the specific part. Incidentally, "cash balance" is the item part, and "\37,462" is the numerical part.
“Cash” is the first two characters of the registered word “cash balance” and is an example of target characters. Therefore, the bottom line is determined to be a target for OCR processing of the entire line of characters.

また、最下端から２つ目の行には「取引数 4数」が印字されているが、行の先頭に位置する「取引」が特定部分として切り出される。因みに、「取引数」が項目部であり、「4数」が数値部である。「取引」は、登録単語でも、その一部でもない。このため、ＯＣＲサーバ２００は、最下端から２つ目の行を、行全体の文字をＯＣＲ処理の対象に設定せず、次の行の判定に移行する。 Furthermore, in the second row from the bottom, "Number of transactions: 4" is printed, but "Transaction" located at the beginning of the line is cut out as the specific part. Incidentally, "Number of transactions" is the item part, and "Number 4" is the numerical part. "Transaction" is neither a registered word nor a part of it. For this reason, the OCR server 200 does not set the entire line of characters in the second line from the bottom as a target for OCR processing, and moves on to determination of the next line.
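The per-line decision in step 4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the target characters are taken from the FIG. 8 example, the three-line receipt is made up for illustration, and a "line image" is modeled as plain text so that slicing stands in for OCR of the first two characters.

```python
# Hypothetical sketch of step 4: scan the lines from the bottom up and keep
# only those whose first two characters match a target.
TARGET_PREFIXES = {"現金", "＊＊", "純売"}  # target characters of the FIG. 8 example
PREFIX_LENGTH = 2  # number of leading characters examined per line


def ocr_prefix(line_image: str, n: int = PREFIX_LENGTH) -> str:
    """Stand-in for OCR of the first n characters of a line image.

    A line image is modeled as its text (an assumption), so slicing
    simulates recognizing only the leading characters.
    """
    return line_image[:n]


def select_lines_for_full_ocr(line_images: list[str]) -> list[int]:
    """Return indices of lines to OCR in full, in bottom-to-top scan order."""
    selected = []
    for index in range(len(line_images) - 1, -1, -1):
        if ocr_prefix(line_images[index]) in TARGET_PREFIXES:
            selected.append(index)
    return selected


# Illustrative receipt, listed top to bottom.
receipt = ["＊＊純売 ＼15,297", "取引数 4数", "現金残高 ＼37,462"]
print(select_lines_for_full_ocr(receipt))  # [2, 0]
```

The line beginning with "取引" is skipped, so only two of the three lines would be passed to full-line OCR.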

図８は、対象文字のサーチの結果の具体例を説明する図である。図８は、サーチの対象文字が「現金」と「＊＊」と「純売」の場合である。
図８の場合、破線で囲んだ４つの領域が行全体の文字をＯＣＲ処理する対象に設定されている。なお、上から３つ目の領域は、親行としての「現金 4数」が印字された行と、親行を有する「＼16，520」が印字された行の２行を含んでいる。 FIG. 8 is a diagram illustrating a specific example of the search results for target characters. FIG. 8 shows a case where the search target characters are "cash", "**", and "net sale".
In the case of FIG. 8, four areas surrounded by broken lines are set as targets for OCR processing of the entire line of characters. Note that the third area from the top contains two lines: the parent line, on which "Cash 4 Quantity" is printed, and the line that has this parent line, on which "\16,520" is printed.

図５の説明に戻る。
次に、ＯＣＲサーバ２００は、これら４つの領域を対象として、行全体をＯＣＲ処理する（ステップ５）。
図９は、ＯＣＲ処理で認識される情報に紐付けられる情報の例を説明する図である。
上から１つ目の領域に対応する文字画像データは、精算レシートの１９行目にあたる。項目部の生のＯＣＲ処理の結果は「＊＊純売」であり、数値部の値は「＼15,297」である。
上から２つ目の領域に対応する文字画像データは、精算レシートの２１行目にあたる。項目部の生のＯＣＲ処理の結果は「＊＊総総合計」であり、数値部の値は「＼16,520」である。 Returning to the explanation of FIG. 5.
Next, the OCR server 200 performs OCR processing on the entire line in these four areas (step 5).
FIG. 9 is a diagram illustrating an example of information linked to information recognized by OCR processing.
The character image data corresponding to the first area from the top corresponds to the 19th line of the payment receipt. The raw OCR processing result of the item part is "** Net Sales", and the value of the numerical part is "\15,297".
The character image data corresponding to the second area from the top corresponds to the 21st line of the payment receipt. The raw OCR processing result of the item part is "**Total Total", and the value of the numerical part is "\16,520".

上から３つ目の領域に対応する文字画像データは、精算レシートの２７行目と２８行目にあたる。
２８行目の項目部には文字が無いが、１行上の２７行目の項目部には対象文字が存在している。このため、２７行目は、２８行目の親行である。２８行目における生のＯＣＲ処理の結果は、２７行目にも紐付けられる。
図９の場合、精算レシートの２７行目の生のＯＣＲ処理の結果は「現金」であり、数値部の値は「4数」である。
一方、精算レシートの２８行目には、生のＯＣＲ処理の結果として、２７行目の生のＯＣＲ処理の結果である「現金」が紐付けられる。なお、同行における数値部の値は「＼16,520」である。
上から４つ目の領域に対応する文字画像データは、精算レシートの３１行目にあたる。項目部の生のＯＣＲ処理の結果は「現金残高」であり、数値部の値は「＼37,462」である。 The character image data corresponding to the third area from the top corresponds to the 27th and 28th lines of the payment receipt.
There are no characters in the item section on the 28th line, but there are target characters in the item section on the 27th line one line above. Therefore, the 27th line is the parent line of the 28th line. The raw OCR processing result on the 28th line is also linked to the 27th line.
In the case of FIG. 9, the raw OCR processing result on the 27th line of the payment receipt is "cash" and the value of the numerical part is "4 numbers".
On the other hand, "cash", the raw OCR processing result of the 27th line, is linked to the 28th line of the payment receipt as its raw OCR processing result. The value of the numerical part of the 28th line is "\16,520".
The character image data corresponding to the fourth area from the top corresponds to the 31st line of the payment receipt. The raw OCR processing result of the item part is "cash balance", and the value of the numerical part is "\37,462".
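The parent-line linking described for the 27th and 28th lines can be sketched as follows. This is a hedged illustration: the function name and the `(item, value)` tuple layout are assumptions, and only the immediately preceding line is treated as a possible parent, as in the example.

```python
# Hypothetical sketch: a line whose item part is empty inherits the item part
# of the line directly above it (its parent line).
def link_parent_items(rows: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """rows holds (item_text, value_text) pairs in top-to-bottom order."""
    linked = []
    for i, (item, value) in enumerate(rows):
        if not item and i > 0 and rows[i - 1][0]:
            item = rows[i - 1][0]  # inherit the item of the parent line
        linked.append((item, value))
    return linked


# Lines 27 and 28 of the example receipt.
rows = [("現金", "4数"), ("", "＼16,520")]
print(link_parent_items(rows))  # [('現金', '4数'), ('現金', '＼16,520')]
```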

図５の説明に戻る。
次に、ＯＣＲサーバ２００は、マッチング処理を実行する（ステップ６）。マッチング処理では、生のＯＣＲ処理の結果である文字を登録単語に紐付ける処理と、登録単語が対応付けられているグループの単語に紐付ける処理とが実行される。
図１０は、マッチングの一例を説明する図である。図１０では、紙面の都合から精算レシートの上から３つ目と４つ目の領域についてのみ表している。
精算レシートの２７行目の項目部をＯＣＲ処理して認識された文字「現金」は、登録単語の「現金」に一致する。このため、認識された単語「現金」の信頼度は１．０である。信頼度は、生のＯＣＲ処理した結果の確からしさを表している。本実施の形態の場合、１．０が最大値である。なお、登録単語の「現金」はグループの「現金売上」に対応する。 Returning to the explanation of FIG. 5.
Next, the OCR server 200 executes matching processing (step 6). In the matching process, a process of associating characters resulting from the raw OCR process with a registered word, and a process of associating the characters with a word of a group to which the registered word is associated are executed.
FIG. 10 is a diagram illustrating an example of matching. In FIG. 10, only the third and fourth areas from the top of the payment receipt are shown due to space constraints.
The characters "cash" recognized by OCR processing the item section of the 27th line of the payment receipt match the registered word "cash". Therefore, the reliability of the recognized word "cash" is 1.0. The reliability represents the probability of the raw OCR processing result. In the case of this embodiment, 1.0 is the maximum value. Note that the registered word "cash" corresponds to the group "cash sales."

精算レシートの２８行目の項目部をＯＣＲ処理して認識された文字「現金」も、登録単語の「現金」に対応する。この単語の信頼度も１．０である。登録単語の「現金」はグループの「現金売上」に対応する。
精算レシートの３１行目の項目部から読み出された文字「現金残高」は、登録単語の「現金」に対応する。信頼度は０．５である。なお、登録単語の「現金」はグループの「現金売上」に対応する。 The characters "cash" recognized by OCR processing the item section of the 28th line of the payment receipt also correspond to the registered word "cash". The reliability of this word is also 1.0. The registered word "cash" corresponds to the group "cash sales."
The characters "cash balance" read from the item section on the 31st line of the payment receipt correspond to the registered word "cash." The reliability is 0.5. Note that the registered word "cash" corresponds to the group "cash sales."
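The matching in step 6 can be sketched as a table lookup. The table below holds only the one pair named in the FIG. 10 description (registered word "現金", group "現金売上"). The source does not state how the reliability is computed, so the length ratio used here is purely an assumption; it was chosen because it is simple and happens to give 1.0 and 0.5 for the two FIG. 10 examples. The function name is hypothetical.

```python
from typing import Optional

# Registered word -> group word. Only the pair described for FIG. 10 is listed.
REGISTERED_WORDS = {"現金": "現金売上"}


def match_item(recognized: str) -> Optional[tuple[str, str, float]]:
    """Return (registered word, group word, reliability), or None if no match."""
    best = None
    for word, group in REGISTERED_WORDS.items():
        if recognized.startswith(word):
            # Assumed scoring rule, not taken from the source.
            score = len(word) / len(recognized)
            if best is None or score > best[2]:
                best = (word, group, score)
    return best


print(match_item("現金"))      # ('現金', '現金売上', 1.0)
print(match_item("現金残高"))  # ('現金', '現金売上', 0.5)
print(match_item("取引数"))    # None
```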

図５の説明に戻る。
次に、ＯＣＲサーバ２００は、１つの項目に紐付ける値の選択を実行する（ステップ７）。本実施の形態の場合、項目部の「現金」に紐付けられている２７行目の値と２８行目の値の中から１つを選択する。
図１１は、値の選択に使用される処理の内容を説明するフローチャートである。
まず、ＯＣＲサーバ２００は、処理対象が１つの項目に複数の値を持つか否かを判定する（ステップ１１）。 Returning to the explanation of FIG. 5.
Next, the OCR server 200 selects a value to be associated with one item (step 7). In the case of this embodiment, one is selected from the values on the 27th line and the value on the 28th line that are linked to "cash" in the item section.
FIG. 11 is a flowchart illustrating the contents of the process used to select values.
First, the OCR server 200 determines whether the processing target has multiple values in one item (step 11).

ステップ１１で肯定結果が得られた場合、ＯＣＲサーバ２００は、金額と思われる方の値を選択する（ステップ１２）。具体的には、Current型への変換が可能な数値を選択する。
一方、ステップ１１で否定結果が得られた場合、ＯＣＲサーバ２００は、対応する値をそのまま使用する（ステップ１３）。
図１２は、ステップ７における値の選択を説明する図である。
精算レシートの２７行目の値「4数」はCurrent型に変換できない値である。一方、２８行目の値「＼16,520」はCurrent型に変換できる値である。このため、この例では、２８行目の値「＼16,520」が、グループ「現金売上」に対応付けられる値として選択されている。 If a positive result is obtained in step 11, the OCR server 200 selects the value that is considered to be the amount (step 12). Specifically, select a number that can be converted to the Current type.
On the other hand, if a negative result is obtained in step 11, the OCR server 200 uses the corresponding value as is (step 13).
FIG. 12 is a diagram illustrating the selection of values in step 7.
The value "4 number" on the 27th line of the payment receipt cannot be converted to the Current type. On the other hand, the value "\16,520" on the 28th line is a value that can be converted to the Current type. Therefore, in this example, the value "\16,520" on the 28th line is selected as the value associated with the group "cash sales."
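Steps 11 to 13 can be sketched as follows. The helper names are hypothetical, and "convertible to the Current type" is approximated here as "parses as a whole-number amount after the currency mark and thousands separators are removed", which is an assumption about what that type accepts.

```python
from typing import Optional


def to_amount(text: str) -> Optional[int]:
    """Try to read text as an amount; return None when it is not one."""
    cleaned = text.strip().lstrip("＼\\¥￥").replace(",", "").replace("，", "")
    if cleaned.isascii() and cleaned.isdigit():
        return int(cleaned)
    return None


def select_value(values: list[str]) -> Optional[str]:
    """Pick the value to link to one item."""
    if len(values) <= 1:
        # Step 13: a single value is used as is.
        return values[0] if values else None
    for value in values:
        # Step 12: choose the value that looks like an amount.
        if to_amount(value) is not None:
            return value
    return None


print(select_value(["4数", "＼16,520"]))  # ＼16,520
```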

図５の説明に戻る。
４つの領域の全てについて値の選択が完了すると、ＯＣＲサーバ２００は、グループ別に値を選択する（ステップ８）。
本実施の形態の場合、精算レシートの１９行目の値「＼15,297」と同２１行目の値「＼16,520」は、いずれもグループの単語「純売上」に対応付けられている。また、精算レシートの２８行目の値「＼16,520」と同３１行目の値「＼37,462」は、いずれもグループの単語「現金売上」に対応付けられる。このため、各グループの単語に紐付ける１つの値を選択する必要がある。 Returning to the explanation of FIG. 5.
When the selection of values for all four areas is completed, the OCR server 200 selects values for each group (step 8).
In the case of this embodiment, the value "\15,297" on the 19th line and the value "\16,520" on the 21st line of the payment receipt are both associated with the group word "net sales." Likewise, the value "\16,520" on the 28th line and the value "\37,462" on the 31st line of the payment receipt are both associated with the group word "cash sales." Therefore, one value must be selected for each group word.

図１３は、グループ別に値を選択する処理の内容を説明するフローチャートである。
まず、ＯＣＲサーバ２００は、同じグループに紐付けられる値が複数あるか否かを判定する（ステップ２１）。
ステップ２１で肯定結果が得られた場合、ＯＣＲサーバ２００は、信頼度は同じか否かを判定する（ステップ２２）。
ステップ２２で肯定結果が得られた場合、ＯＣＲサーバ２００は、値が大きい方の値を選択する（ステップ２３）。一方、ステップ２２で否定結果が得られた場合、ＯＣＲサーバ２００は、信頼度の高い方の値を選択する。
なお、ステップ２１で否定結果が得られた場合、ＯＣＲサーバ２００は、対応する行の値をそのまま抽出する。この場合は、１つのグループに紐付けられる値が１つしか存在しないためである。 FIG. 13 is a flowchart illustrating the process of selecting values for each group.
First, the OCR server 200 determines whether there are multiple values associated with the same group (step 21).
If a positive result is obtained in step 21, the OCR server 200 determines whether the reliability levels are the same (step 22).
If a positive result is obtained in step 22, the OCR server 200 selects the larger value (step 23). On the other hand, if a negative result is obtained in step 22, the OCR server 200 selects the value with higher reliability.
Note that if a negative result is obtained in step 21, the OCR server 200 extracts the value of the corresponding line as is, because only one value is associated with the group in this case.

図１４は、グループ「現金売上」に紐付けられている２つの値から１つを選択する例を説明する図である。
グループ「現金売上」の場合、精算レシートの２８行目の値「＼16,520」と３１行目の値「＼37,462」が紐付けられている。ただし、２８行目の生のＯＣＲ処理の結果の信頼度が１．０であるのに対し、３１行目の生のＯＣＲ処理の結果の信頼度は０．５である。このため、信頼度が相対的に高い２８行目の値「＼16,520」がグループ「現金売上」の値として選択される。 FIG. 14 is a diagram illustrating an example of selecting one of two values associated with the group "cash sales."
In the case of the group "cash sales," the value "\16,520" on the 28th line of the payment receipt and the value "\37,462" on the 31st line are both linked to the group. However, the reliability of the raw OCR processing result on the 28th line is 1.0, while the reliability of the raw OCR processing result on the 31st line is 0.5. Therefore, the value "\16,520" on the 28th line, which has the relatively higher reliability, is selected as the value of the group "cash sales."

図１５は、グループ「純売上」に紐付けられている２つの値から１つを選択する例を説明する図である。
グループ「純売上」の場合、精算レシートの１９行目の値「＼15,297」と２１行目の値「＼16,520」が紐付けられている。ただし、２１行目の生のＯＣＲ処理の結果の信頼度が０．９であるのに対し、１９行目の生のＯＣＲ処理の結果の信頼度は０．４である。このため、信頼度が相対的に高い２１行目の値「＼16,520」がグループ「純売上」の値として選択される。 FIG. 15 is a diagram illustrating an example of selecting one from two values linked to the group "net sales."
In the case of the group "net sales", the value "\15,297" on the 19th line of the settlement receipt and the value "\16,520" on the 21st line are linked. However, while the reliability of the raw OCR processing result on the 21st line is 0.9, the reliability of the raw OCR processing result on the 19th line is 0.4. Therefore, the value "\16,520" in the 21st line, which has a relatively high reliability, is selected as the value of the group "net sales".
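The selection rule of FIG. 13, applied to the examples of FIG. 14 and FIG. 15, can be sketched as follows. The function name and the `(amount, reliability)` pair layout are assumptions. The source describes the rule for two candidates; applying the same ordering to more than two is a generalization made here.

```python
def select_group_value(candidates: list[tuple[int, float]]) -> int:
    """candidates holds (amount, reliability) pairs linked to one group.

    The list must not be empty.
    """
    if len(candidates) == 1:
        # Step 21 negative: the only value is used as is.
        return candidates[0][0]
    # Higher reliability wins; with equal reliability the larger amount wins.
    amount, _ = max(candidates, key=lambda c: (c[1], c[0]))
    return amount


# Group "cash sales" (FIG. 14): lines 28 and 31.
print(select_group_value([(16520, 1.0), (37462, 0.5)]))  # 16520
# Group "net sales" (FIG. 15): lines 19 and 21.
print(select_group_value([(15297, 0.4), (16520, 0.9)]))  # 16520
```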

図５の説明に戻る。
グループ別に値が選択されると、ＯＣＲサーバ２００は、ユーザ端末１００に対し、選択結果を送信する（ステップ９）。具体的には、「現金売上」の値として「＼16,520」が送信され、純売上の値として「＼16,520」が送信される。
これらの情報を受信したユーザ端末１００は、結果を合成した画面を表示部１０３（図３参照）に表示する（ステップ１０）。
図１６は、ユーザ端末１００の表示部１０３に表示される売上金等の自動入力結果の画面例を説明する図である。図１６に示す画面には、撮像された精算レシートの年月日の情報、天気の情報、現金売上の情報、現金外売上の情報、純売上の情報、キャンセルボタン１２１、登録ボタン１２２が表示されている。 Returning to the explanation of FIG. 5.
When values are selected for each group, the OCR server 200 transmits the selection results to the user terminal 100 (step 9). Specifically, "\16,520" is transmitted as the value of "cash sales" and "\16,520" is transmitted as the value of net sales.
The user terminal 100 that has received this information displays a screen in which the results are combined on the display unit 103 (see FIG. 3) (step 10).
FIG. 16 is a diagram illustrating an example of a screen, displayed on the display unit 103 of the user terminal 100, showing the results of automatic input of sales proceeds and the like. The screen shown in FIG. 16 displays the date of the captured payment receipt, weather information, cash sales, non-cash sales, net sales, a cancel button 121, and a registration button 122.

ここで、現金外売上の欄には、純売上から現金売上を減算した値が表示される。現金外売上には、例えばクレジット払いされた金額や電子マネーで支払われた金額がある。なお、年月日には、精算レシートからＯＣＲ処理で認識された年月日を自動的に表示してもよいし、ユーザがカレンダー機能から指定した年月日を表示してもよい。また、天気の欄には、ユーザが入力した情報が表示される。
なお、ＯＣＲ処理により自動入力された値が間違っている場合には、手入力による数値の修正も可能である。
ユーザが登録ボタン１２２をタップすると、表示された金額が確定する。 Here, the non-cash sales column displays the value obtained by subtracting cash sales from net sales. Non-cash sales include, for example, amounts paid by credit card and amounts paid by electronic money. Note that, as the date, the year, month, and day recognized from the payment receipt through OCR processing may be automatically displayed, or the year, month, and day specified by the user from the calendar function may be displayed. Furthermore, information input by the user is displayed in the weather column.
Note that if the value automatically input by OCR processing is incorrect, the numerical value can also be corrected by manual input.
When the user taps the registration button 122, the displayed amount is confirmed.
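The non-cash sales figure shown on the screen is a simple difference, sketched below with a hypothetical function name. With the values selected in the example above (net sales 16,520 and cash sales 16,520), the result is 0.

```python
def non_cash_sales(net_sales: int, cash_sales: int) -> int:
    """Non-cash sales = net sales minus cash sales, as described for FIG. 16."""
    return net_sales - cash_sales


print(non_cash_sales(16520, 16520))  # 0
```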

＜実施の形態２＞
前述の実施の形態１の場合には、図７を用いて説明したように、文字画像データの先頭２文字をＯＣＲ処理した結果に対象文字が含まれるかサーチすることで、ＯＣＲ処理の対象となる文字数を低減させ、その結果として、ＯＣＲ処理に要する時間の短縮を実現している。
ただし、ＯＣＲ処理に要する時間の短縮は、他の手法によっても実現が可能である。 <Embodiment 2>
In the first embodiment described above, as explained with reference to FIG. 7, only the first two characters of the character image data of each line are OCR-processed, and the result is searched for the target characters. This reduces the number of characters subject to OCR processing and, as a result, shortens the time required for OCR processing.
However, the time required for OCR processing can also be reduced by other techniques.

図１７は、実施の形態２におけるＯＣＲサーバ２００がＯＣＲ処理を実行する範囲を説明する図である。本実施の形態では、ＯＣＲ処理を実行する範囲を、純売上や現金売上の情報が出現する可能性が高い範囲に限定する。図１７の例では、精算レシートの下から３分の２の範囲に限定している。ＯＣＲ処理の範囲が精算レシートの全体の３分の２になれば、ＯＣＲ処理の対象となる文字数も概略３分の２となり、ＯＣＲ処理に要する時間の短縮が可能になる。なお、３分の２は一例であり、３分の１や４分の１等でもよい。必要とする情報を認識できるのであれば、ＯＣＲ処理を実行する範囲は狭いほどよい。 FIG. 17 is a diagram illustrating the range in which the OCR server 200 executes OCR processing in the second embodiment. In this embodiment, the range in which OCR processing is executed is limited to a range in which there is a high possibility that information on net sales or cash sales will appear. In the example of FIG. 17, the range is limited to two-thirds from the bottom of the payment receipt. If the range of OCR processing becomes two-thirds of the entire payment receipt, the number of characters subject to OCR processing will also be approximately two-thirds, making it possible to shorten the time required for OCR processing. Note that two-thirds is an example, and it may be one-third, one-fourth, or the like. As long as the required information can be recognized, the narrower the range of OCR processing, the better.
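Restricting OCR to the lower part of the receipt image can be sketched as a row-range calculation. The function name and the pixel-row representation are assumptions. The fraction is passed as two integers so that the arithmetic stays exact.

```python
def bottom_region(height: int, numerator: int = 2, denominator: int = 3) -> tuple[int, int]:
    """Return (top, bottom) pixel rows covering the bottom fraction of an image.

    The bottom row index is exclusive, so image[top:bottom] is the kept part.
    """
    if not 0 < numerator <= denominator:
        raise ValueError("fraction must be greater than 0 and at most 1")
    kept_rows = height * numerator // denominator
    return height - kept_rows, height


print(bottom_region(900))        # (300, 900)
print(bottom_region(900, 1, 3))  # (600, 900)
```

For a 900-row image, the default two-thirds setting skips the top 300 rows, and a one-third setting skips the top 600.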

図１７の場合にも、精算レシートの下端側から上端方向に順番に各行に対応する文字画像データをＯＣＲ処理している。また、実施の形態２の場合にも、図５のステップ５以降の処理は同じである。
もっとも、実施の形態２の場合も、各行の先頭２文字だけを読み取り、行全体の文字をＯＣＲ処理するか否かを判定する機能を組み合わせてもよい。この場合、ＯＣＲ処理が対象とする文字数は実施の形態１よりも少なくなり、更なるＯＣＲ処理の時間の短縮が実現される。 In the case of Fig. 17 as well, the character image data corresponding to each line of the payment receipt is processed by OCR in order from the bottom to the top. Also, in the case of the second embodiment, the process from step 5 onwards in Fig. 5 is the same.
However, in the case of the second embodiment, it is also possible to combine a function of reading only the first two characters of each line and determining whether or not to perform OCR processing on the characters of the entire line. In this case, the number of characters to be processed by OCR is smaller than that in the first embodiment, and the OCR processing time can be further reduced.

＜他の実施の形態＞
以上、本発明の実施の形態について説明したが、本発明の技術的範囲は前述した実施の形態に記載の範囲に限定されない。前述した実施の形態に、種々の変更又は改良を加えたものも、本発明の技術的範囲に含まれることは、特許請求の範囲の記載から明らかである。 <Other embodiments>
Although the embodiments of the present invention have been described above, the technical scope of the present invention is not limited to the range described in the embodiments described above. It is clear from the claims that various changes or improvements made to the embodiments described above are also included within the technical scope of the present invention.

（１）例えば前述の実施の形態１の場合には、精算レシートの下端側から上端方向に、各行に対応する文字画像データの対象文字をサーチしているが、精算レシートの上端側から下端方向に対象文字をサーチしてもよい。
（２）前述の実施の形態は、いずれも精算レシートを読み取りの対象としているが、消費者が受け取るレシートの特定の情報だけをＯＣＲ処理して自動入力する用途にも利用できる。 (1) For example, in the case of the above-mentioned embodiment 1, the target character is searched for in the character image data corresponding to each line from the bottom end of the payment receipt toward the top end, but the target character may also be searched for from the top end of the payment receipt toward the bottom end.
(2) In the above-described embodiments, the payment receipt is the target of reading. However, the present invention can also be used to OCR-process and automatically input only specific information on a receipt received by a consumer.

（３）前述の実施の形態においては、精算レシートに印字される可能性がある登録単語と対応付けられるグループの単語との関係を登録単語テーブル２０２Ｂとして用意する場合について説明したが、精算レシートに印字される売上に関連する単語や各単語とグループとの対応関係をディープラーニングにより学習してもよい。ディープラーニングにより生成された辞書を用いることにより、グループとして登録されている単語に紐付けられるべき未知の登録単語が認識された場合にも、対象文字として検出し、当該単語を含む行の全体をＯＣＲ処理の対象に設定することができる。また、この未知の単語を、グループとして登録されている単語に紐付けることもできる。 (3) In the above-described embodiments, the relationship between registered words that may be printed on the payment receipt and the associated group words is prepared as the registered word table 202B. However, sales-related words printed on payment receipts and the correspondence between each word and its group may instead be learned by deep learning. By using a dictionary generated by deep learning, even when an unknown word that should be linked to a word registered as a group is recognized, the word can be detected as target characters and the entire line containing it can be set as a target for OCR processing. This unknown word can also be linked to the word registered as a group.

（４）前述の実施の形態の場合、精算レシートの画像データから売上金等の数値を認識する機能をＯＣＲサーバ２００（図１参照）で実行しているが、図４に示す機能の全部又は一部をユーザ端末１００（図１参照）で実行してもよい。この場合のユーザ端末１００は、情報処理装置の一例である。
（５）前述の実施の形態の場合、１台のＯＣＲサーバ２００が図４に示す機能の全てを実行しているが、これらの機能の一部をクラウドネットワーク３００に接続される他のサーバや端末との協働により実行してもよい。 (4) In the above embodiments, the OCR server 200 (see FIG. 1) executes the function of recognizing numerical values such as sales proceeds from the image data of the payment receipt, but all or part of the functions shown in FIG. 4 may be executed by the user terminal 100 (see FIG. 1). The user terminal 100 in this case is an example of an information processing device.
(5) In the above embodiments, one OCR server 200 executes all the functions shown in FIG. 4, but some of these functions may be executed in cooperation with other servers or terminals connected to the cloud network 300.

１００…ユーザ端末、１０１…制御部、１０２…カメラ、１０３…表示部、１０４…操作入力部、１０５…記憶部、１０５Ａ…画像データ、１０６…通信部、１１０…精算レシートの画像、１２０…シャッターボタン、１２１…キャンセルボタン、１２２…登録ボタン、２００…ＯＣＲサーバ、２０１…制御部、２０１Ａ…画像データ取得部、２０１Ｂ…文字切出処理部、２０１Ｃ…ＯＣＲ処理部、２０１Ｄ…マッチング部、２０１Ｅ…値選択部、２０１Ｆ…グループ別結果選択部、２０２…記憶部、２０２Ａ…画像データ、２０２Ｂ…登録単語テーブル、２０３…通信部、３００…クラウドネットワーク 100... User terminal, 101... Control unit, 102... Camera, 103... Display unit, 104... Operation input unit, 105... Storage unit, 105A... Image data, 106... Communication unit, 110... Image of payment receipt, 120... Shutter Button, 121...Cancel button, 122...Register button, 200...OCR server, 201...Control unit, 201A...Image data acquisition unit, 201B...Character extraction processing unit, 201C...OCR processing unit, 201D...Matching unit, 201E... Value selection section, 201F...Group result selection section, 202...Storage section, 202A...Image data, 202B...Registered word table, 203...Communication section, 300...Cloud network

Claims

制御部と、
撮像部が撮像した精算レシートの画像データを、当該撮像部から取得する取得部と、
前記画像データのうち文字が含まれる各行の先頭側から固定文字数の部分画像データを１行ずつ順番に切り出す切出処理部と、
切り出された前記部分画像データから文字情報を認識する文字認識部と
を有し、
前記制御部は、
前記文字認識部から出力される前記文字情報に対応する登録単語が存在する場合、対応する行全体を文字認識の対象に決定し、
前記文字認識部から出力される前記文字情報に対応する登録単語が存在しない場合、対応する行全体を文字認識の対象から除外する、
情報処理装置。 a control unit;
an acquisition unit that acquires image data of the payment receipt captured by the imaging unit from the imaging unit;
a cutout processing unit that sequentially cuts out partial image data of a fixed number of characters one line at a time from the beginning of each line containing characters in the image data;
a character recognition unit that recognizes character information from the cut out partial image data;
The control unit includes:
If there is a registered word corresponding to the character information output from the character recognition unit, determining the entire corresponding line as a target for character recognition,
If there is no registered word corresponding to the character information output from the character recognition unit, excluding the entire corresponding line from character recognition targets;
Information processing device.

前記制御部は、前記文字認識部から出力される前記文字情報に対応する金額情報を決定する、請求項１に記載の情報処理装置。 The information processing device according to claim 1, wherein the control unit determines amount information corresponding to the character information output from the character recognition unit.

１つの前記部分画像データについて、前記文字認識部が前記文字情報の認識を開始する行よりも、当該文字情報の認識を終了する行がレシートの上方に位置する、請求項１又は２に記載の情報処理装置。 The information processing device according to claim 1 or 2, wherein, for one piece of the partial image data, the line at which the character recognition unit finishes recognizing the character information is located higher on the receipt than the line at which it starts recognizing the character information.

制御部と、
精算レシートを撮像して画像データを出力する撮像部と、
前記撮像部から前記画像データを取得する取得部と、
前記画像データのうち文字が含まれる各行の先頭側から固定文字数の部分画像データを１行ずつ順番に切り出す切出処理部と、
切り出された前記部分画像データから文字情報を認識する文字認識部と
を有し、
前記制御部は、前記文字認識部から出力される前記文字情報に対応する登録単語が存在する場合、対応する行全体を文字認識の対象に決定し、
前記文字認識部から出力される前記文字情報に対応する登録単語が存在しない場合、対応する行全体を文字認識の対象から除外する、
情報処理システム。 A control unit;
an imaging unit that captures an image of a payment receipt and outputs image data;
an acquisition unit that acquires the image data from the imaging unit;
a cutout processing unit that cuts out partial image data of a fixed number of characters from the beginning of each line containing characters in the image data, one line at a time, in order;
a character recognition unit that recognizes character information from the cut-out partial image data,
wherein the control unit, when a registered word corresponding to the character information output from the character recognition unit is present, determines the entire corresponding line as a target for character recognition, and
when no registered word corresponding to the character information output from the character recognition unit is present, excludes the entire corresponding line from the targets of character recognition.
Information processing system.

精算レシートの画像データを取得する処理と、
前記画像データのうち文字が含まれる各行の先頭側から固定文字数の部分画像データを１行ずつ順番に切り出す処理と、
切り出された前記部分画像データから文字情報を認識する処理と、
認識された前記文字情報に対応する登録単語が存在する場合、対応する行全体を文字認識の対象に決定する処理と、
文字認識部から出力される前記文字情報に対応する登録単語が存在しない場合、対応する行全体を文字認識の対象から除外する処理と、
を有する情報処理方法。 Processing to obtain image data of payment receipt,
a process of sequentially cutting out partial image data of a fixed number of characters one line at a time from the beginning of each line containing characters from the image data;
a process of recognizing character information from the cut out partial image data;
If there is a registered word corresponding to the recognized character information, a process of determining the entire corresponding line as a target for character recognition;
If there is no registered word corresponding to the character information output from the character recognition unit, excluding the entire corresponding line from character recognition;
An information processing method having the following.