JP2020046736A - Image processing device and program

Info

Publication number: JP2020046736A
Application number: JP2018172620A
Authority: JP
Inventors: Kunikazu Ueno (上野 邦和)
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2018-09-14
Filing date: 2018-09-14
Publication date: 2020-03-26
Anticipated expiration: 2038-09-14
Also published as: JP7251078B2

Abstract

PROBLEM TO BE SOLVED: To provide a technique that can correctly determine the region of a degraded code image, which cannot be correctly determined by methods that, for example, determine a code region with reference to a prescribed mark. SOLUTION: An input image 1010 containing a degraded image code is input to a generator 302 of a GAN. The generator 302 is a neural network that generates a generated image 1030 indicating the estimated region of the QR code in the input image 1010. Either a correct image 1020, which indicates the correct region of the QR code in the input image 1010, or the generated image 1030 produced by the generator 302 is input to a discriminator 304. The discriminator 304 identifies whether the image it received is the correct image 1020 or the generated image 1030. Many input images 1010 and correct images 1020 are supplied, and the generator 302 and the discriminator 304 are trained so that the discriminator 304 can identify the images correctly. The trained generator 302 is then used as the means for finding the region of an image code in an input image. SELECTED DRAWING: Figure 2

Description

The present invention relates to an image processing apparatus and a program.

Image codes, including one-dimensional image codes such as barcodes and two-dimensional image codes such as QR codes (registered trademark), are widely used. Some devices that recognize image codes can detect and recognize an image code located at an unknown position in a captured image.

In the mechanism disclosed in Patent Literature 1, a two-dimensional code extraction unit applies predetermined blurring to a binarized image obtained by binarizing the input image data, detects pixel sets in which dark pixels are distributed at or above a predetermined ratio, and treats them as two-dimensional code candidates. It then uses the binary image to determine whether each candidate satisfies the characteristics of a two-dimensional code, and thereby identifies the two-dimensional code. A two-dimensional code recognition unit detects a reference image pattern from the binary image corresponding to the two-dimensional code, assigns provisional modules based on the reference image pattern to the binarized image of the two-dimensional code, and performs decoding.

In the method disclosed in Patent Literature 2, a pattern matching template is formed based on the characteristics of two finder patterns detected first, and at least one candidate region close to the detected finder patterns is determined. By correlating the content of this candidate region with the pattern matching template, a third finder pattern of the QR code that was not previously detected is detected in at least one candidate region. The QR code is decoded using the identified third finder pattern and each of the two finder patterns detected first.

Patent Literature 1: Japanese Patent No. 5507134 (JP 2011-14012 A)
Patent Literature 2: JP 2010-170539 A

A code image may be severely degraded by, for example, repeated facsimile transmission or copying, overlapping handwriting made with a writing instrument, overlapping background patterns such as watermarks or steganography, or a mixture of these. As a result of such degradation, a mark indicating the extent of the code image may be crushed, or the margin of the code image may turn dark so that a dark region spreads widely from the code image into its surroundings. In such cases, the extent of the code image cannot be correctly determined by a method that determines or estimates the extent of the code image with reference to a predetermined mark, or by a method that determines a region of predetermined shape with a high proportion of dark color to be the extent of the code image.

The present invention provides a technique that can correctly determine the region of a degraded code image that cannot be correctly determined by a method that determines or estimates the region of a code image with reference to a predetermined mark, or by a method that determines a region of predetermined shape with a high proportion of dark color to be the region of a code image.

The invention according to claim 1 is an image processing apparatus comprising: a learned model trained, using combinations of a sample image containing a degraded code image and a correct image indicating the region of the code image in the sample image, to determine the region of a code image in an input image; and extraction means for extracting the region of a code image from an input image using the learned model.

The invention according to claim 2 is the image processing apparatus according to claim 1, wherein the learned model is generated by training a GAN (generative adversarial network).

The invention according to claim 3 is the image processing apparatus according to claim 2, wherein the learned model is the generator of the GAN after training in which the sample image is input to a generator in the GAN, and the estimated image of the correct region generated by the generator in response and the correct image corresponding to the sample image are input to a discriminator in the GAN, which is trained to distinguish between the two.

The invention according to claim 4 is the image processing apparatus according to any one of claims 1 to 3, further comprising: a learned restoration model trained, using combinations of a degraded code image and the corresponding code image before degradation, to generate from a degraded input code image a code image in which the degradation has been repaired; and restoration means for restoring, using the learned restoration model, the image of the region extracted from the input image by the extraction means, and supplying the restored image to decoding means.

The invention according to claim 5 is the image processing apparatus according to claim 4, wherein the learned restoration model is generated by training a second GAN.

The invention according to claim 6 is the image processing apparatus according to claim 5, wherein the learned restoration model is the generator of the second GAN after training in which the degraded code image is input to a generator in the second GAN, and the estimated restored code image generated by that generator in response and the pre-degradation code image corresponding to the sample image are input to a discriminator in the second GAN, which is trained to distinguish between the two.

The invention according to claim 7 is a program for causing a computer to function as: a learned model trained, using combinations of a sample image containing a degraded code image and a correct region image indicating the region of the code image in the sample image, to determine the region of a code image in an input image; and extraction means for extracting the region of a code image from an input image using the learned model.

According to the invention of claim 1 or 7, it is possible to provide a technique that can correctly determine the region of a degraded code image that cannot be correctly determined by a method that determines or estimates the region of a code image with reference to a predetermined mark, or by a method that determines a region of predetermined shape with a high proportion of dark color to be the region of a code image.

According to the invention of claim 2 or 3, the region of the code image can be obtained more accurately than with a method that does not use a GAN.

According to the invention of claim 4, a degraded code image can be restored before being supplied to the decoding means.

According to the invention of claim 5 or 6, a degraded code image can be restored more accurately than with a method that does not use a GAN.

FIG. 1 is a diagram illustrating the functional configuration of an embodiment of the image processing apparatus.
FIG. 2 is a diagram illustrating an example in which the learning processing unit of the image processing apparatus in FIG. 1 is configured as a GAN.
FIG. 3 is a diagram illustrating an example of an input image and a correct image.
FIG. 4 is a diagram illustrating another example of an input image and a correct image.
FIG. 5 is a diagram illustrating the GAN-based learning mechanism used when the restoration unit is built by learning.

The functional configuration of an embodiment of the image processing apparatus will be described with reference to FIG. 1.

This image processing apparatus recognizes the content represented by an image code, such as a barcode, contained in an input image. For this recognition, the apparatus has a function for finding an image code whose position in the input image is unknown.

The following describes the case of recognizing a QR code (registered trademark) as an example of an image code. The QR code is only an example, however, and the technique of this embodiment is also applicable to recognizing image codes other than QR codes.

In the image processing apparatus shown in FIG. 1, the learned model 10 is a model that has been trained to extract, from an input image, the regions occupied by the one or more QR codes it contains. The learned model 10 is, for example, a model that defines a trained neural network, and is expressed, for example, as a set of the connection weights (strengths) between the nodes (neurons) that make up the neural network.

The learned model 10 is generated by the learning process of the learning processing unit 30. The learning processing unit 30 performs the learning process using a large number of pairs, each consisting of an input image containing a QR code degraded by background noise, distortion, and the like, and a correct image indicating the region of the QR code in that input image. The learning process performed by the learning processing unit 30 is described in detail later.

The image input unit 12 accepts an input image containing one or more QR codes.

The candidate region extraction unit 14 extracts from the input image the regions that are highly likely to be QR codes (called candidate regions). A candidate region has the same shape and size as a single QR code. To perform this extraction, the candidate region extraction unit 14 processes the input image using the learned model 10. The output of the candidate region extraction unit 14 is an image indicating each candidate region in the input image. One example is an image in which pixels inside a candidate region and pixels elsewhere are given different values (for example, pixel value 1 inside a candidate region and pixel value 0 elsewhere).
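As an illustration of the 1/0 candidate-region image described above, the following minimal sketch binarizes a generator output. It assumes the learned model emits a per-pixel score in [0, 1] and that a threshold of 0.5 is appropriate; the source specifies neither, and the function name `to_candidate_mask` is hypothetical.

```python
def to_candidate_mask(generated, threshold=0.5):
    """Binarize a generator output (rows of floats in [0, 1]) into a 0/1 mask.

    Assumption: scores at or above `threshold` mark candidate-region pixels.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in generated]


# Toy 3x4 generator output with one bright 2x2 block.
generated = [
    [0.02, 0.10, 0.05, 0.01],
    [0.08, 0.91, 0.87, 0.03],
    [0.04, 0.95, 0.89, 0.06],
]
mask = to_candidate_mask(generated)
```

The resulting mask uses pixel value 1 inside the candidate region and 0 elsewhere, matching the example encoding given in the text.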

The candidate region designation unit 16 obtains the position information of each candidate region from the image output by the candidate region extraction unit 14, and assigns a unique label (identification information) to each candidate region. The position information of a candidate region is a set of coordinates of one or more feature points that define the region. When the candidate region is a rectangle whose sides run along the x-axis and y-axis of the input image, its position information is the pair of coordinates of two diagonally opposite vertices of the rectangle. The candidate region designation unit 16 then passes the position information and label of each candidate region to the feature extraction unit 18.
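One way to derive labels and diagonal-vertex coordinates from a 1/0 candidate image is connected-component labeling. The sketch below is an assumption about how this step could be realized, not the method stated in the source; it uses 4-connectivity and a breadth-first flood fill, and `label_candidate_regions` is a hypothetical name.

```python
from collections import deque


def label_candidate_regions(mask):
    """Return {label: ((x_min, y_min), (x_max, y_max))} per 4-connected region of 1s."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = {}
    next_label = 1
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # Flood-fill one region while tracking its bounding rectangle.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions[next_label] = ((x_min, y_min), (x_max, y_max))
            next_label += 1
    return regions


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
regions = label_candidate_regions(mask)
```

Each entry pairs a unique integer label with the top-left and bottom-right vertices of the region's bounding rectangle, which corresponds to the two diagonal vertices mentioned above for axis-aligned rectangles.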

The feature extraction unit 18 uses the position information designated by the candidate region designation unit 16 to extract the features of the image of each candidate region in the input image. The extracted features include the density ratio of white pixels to black pixels in the image of the candidate region and the aspect ratio of the candidate region. The density ratio of white to black pixels can be obtained, for example, by binarizing the image of the candidate region with a predetermined threshold and taking the ratio of the number of white pixels to the number of black pixels in the resulting binary image. These features are used to determine whether the image of the candidate region is an image of a QR code. This determination is a safeguard against the possibility that a candidate region obtained by the learned model 10 contains an image other than a QR code. The feature extraction unit 18 passes the extracted features of each candidate region, associated with the region's label, to the image code extraction unit 20.
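The two features named above can be sketched as follows. The grayscale representation (0 to 255), the binarization threshold of 128, and the function name `extract_features` are assumptions made for illustration; the source only says that a predetermined threshold is used.

```python
def extract_features(gray, top_left, bottom_right, threshold=128):
    """Compute the white/black pixel ratio and aspect ratio of one candidate region.

    `gray` is a list of rows of 0-255 intensities; the region is given by two
    diagonal vertices with inclusive coordinates.
    """
    (x0, y0), (x1, y1) = top_left, bottom_right
    width, height = x1 - x0 + 1, y1 - y0 + 1
    white = black = 0
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            # Binarize each pixel with the assumed threshold.
            if gray[y][x] >= threshold:
                white += 1
            else:
                black += 1
    return {
        "white_black_ratio": white / black if black else float("inf"),
        "aspect_ratio": width / height,
    }


gray = [
    [255, 255, 255, 255],
    [255,   0, 200,  10],
    [255, 220,  30, 240],
]
features = extract_features(gray, (1, 1), (2, 2))
```

For the 2x2 region in this toy image, two pixels fall on each side of the threshold and the region is square, so both features evaluate to 1.0.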

For the candidate region corresponding to each label, the image code extraction unit 20 determines that the candidate region is a QR code if its features fall within the range of features of a QR code, and otherwise determines that it is not a QR code. It then extracts from the input image the image of each candidate region determined to be a QR code, and passes the extracted image (called a QR code image), associated with its label, to the restoration unit 22.
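The range check described above can be sketched as a simple filter over labeled feature sets. The numeric ranges below are purely hypothetical placeholders, since the source gives no values, and `QR_FEATURE_RANGES` and `is_qr_code` are illustrative names.

```python
# Hypothetical acceptance ranges; the source does not state numeric limits.
QR_FEATURE_RANGES = {
    "white_black_ratio": (0.5, 2.0),
    "aspect_ratio": (0.9, 1.1),
}


def is_qr_code(features, ranges=QR_FEATURE_RANGES):
    """Accept a candidate only if every feature lies inside its allowed range."""
    return all(low <= features[name] <= high for name, (low, high) in ranges.items())


# Features keyed by candidate-region label.
candidates = {
    1: {"white_black_ratio": 1.1, "aspect_ratio": 1.0},
    2: {"white_black_ratio": 6.0, "aspect_ratio": 2.5},
}
accepted = [label for label, feats in candidates.items() if is_qr_code(feats)]
```

Only the labels in `accepted` would have their images cropped from the input image and passed on for restoration.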

The restoration unit 22 restores the QR code image corresponding to each label using a known method or the characteristic restoration method described later. The restoration repairs a QR code image degraded by distortion, loss, dirt, noise, and the like caused by copying, facsimile transmission, background patterns, handwriting, and so on, so that it can be decoded by the decoding unit 24, which performs standard QR code recognition.

The decoding unit 24 decodes each QR code image restored by the restoration unit 22 by applying a known QR code recognition process to it.

Next, the configuration of the learning processing unit 30 is illustrated in FIG. 2. The learning processing unit 30 includes a generator 302 and a discriminator 304 that together constitute a GAN (generative adversarial network).

The learning processing unit 30 also holds a large number of pairs of an input image 1010 and a correct image 1020 as training data. As shown in FIG. 3(a), the input image 1010 is an image containing a QR code 1012 that has undergone various forms of degradation such as distortion, loss, dirt, and noise caused by copying, facsimile transmission, background patterns, handwriting, and the like. The correct image 1020 is an image showing the region occupied by the QR code 1012 in the input image 1010 (called the code region 1022). That is, the correct image 1020 is an image in which pixels inside the region of the QR code are given values distinguishable from those of pixels outside the region. For example, as shown in FIG. 3(b), a binary image in which the pixels in the region occupied by the QR code 1012 in the input image 1010 are black (value 1) and the pixels outside it are white (value 0) may be used as the correct image 1020.

As another example of the input image 1010, a single image containing a plurality of QR codes 1012 may be used, as illustrated in FIG. 4(a). In this case, the corresponding correct image 1020 is, for example, an image in which the location of each QR code 1012 in the input image 1010 is black and all other locations are white, as shown in FIG. 4(b).
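A correct image of the kind shown in FIG. 3(b) and FIG. 4(b) could be built from known QR code positions as in the sketch below. It assumes each code region is an axis-aligned rectangle given by two diagonal vertices with inclusive coordinates, and `make_correct_image` is a hypothetical helper name.

```python
def make_correct_image(width, height, code_boxes):
    """Build a binary correct image: 1 (black) inside each code region, 0 (white) elsewhere."""
    image = [[0] * width for _ in range(height)]
    for (x0, y0), (x1, y1) in code_boxes:
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                image[y][x] = 1
    return image


# Two QR code regions in a 6x3 image, as in the multi-code case.
correct = make_correct_image(6, 3, [((0, 0), (1, 1)), ((4, 1), (5, 2))])
```

The same helper covers the single-code case by passing a list with one box.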

The generator 302 is a neural network that generates a generated image 1030 from the input image 1010. The generated image 1030 is an estimate of the correct image 1020 corresponding to the input image 1010. For example, like the correct image 1020 in FIG. 3(b), the generated image 1030 is an image in which pixels inside the region of the QR code are given pixel values distinguishable from those of pixels outside the region. By processing a large number of input images 1010, the generator 302 learns to estimate the code region 1022 more accurately.

The discriminator 304 is a neural network that identifies whether an image it receives is the correct image 1020 corresponding to the input image 1010 or the generated image 1030 that the generator 302 generated from the input image 1010. The learning processing unit 30 inputs to the discriminator 304 either the correct image 1020 (together with the corresponding input image 1010) or the generated image 1030 (together with the corresponding input image 1010). In response, the discriminator 304 identifies whether the image it received is the correct image 1020 (true) or the generated image 1030 (fake: false), and outputs a signal indicating the result.

The learning processing unit 30 compares whether the image input to the discriminator 304 was actually a correct image or a fake with the output signal from the discriminator 304, and feeds a loss signal based on the comparison back to the connection weight parameters between the nodes of the neural networks of both the generator 302 and the discriminator 304. In this way, the generator 302 and the discriminator 304 are trained.
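The comparison between the true identity of the discriminator's input and its output signal is commonly expressed as a binary cross-entropy loss. The sketch below assumes the discriminator outputs a probability that its input is a correct image, and uses the widely used non-saturating form for the generator; the source does not state which loss formulation is used, and all function names are hypothetical.

```python
import math


def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy between one predicted probability and a 0/1 target."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_on_correct, d_on_generated):
    """The discriminator should output 1 for correct images and 0 for generated ones."""
    return bce(d_on_correct, 1.0) + bce(d_on_generated, 0.0)


def generator_loss(d_on_generated):
    """The generator is rewarded when the discriminator mistakes its output for a correct image."""
    return bce(d_on_generated, 1.0)


# Discriminator is fairly confident on both inputs, so the generator's loss is the larger one.
d_loss = discriminator_loss(d_on_correct=0.9, d_on_generated=0.2)
g_loss = generator_loss(d_on_generated=0.2)
```

In an actual training loop these scalar losses would be backpropagated to update the connection weights of each network; that step is omitted here because it depends on the chosen framework.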

The generator 302 and the discriminator 304 that make up the GAN advance their learning by competing with each other, so to speak: the former tries to generate fakes (generated images 1030) that are as close as possible to the teacher data (correct images 1020), and the latter tries to identify those fakes correctly.

The learning processing unit 30 may use, for example, a scheme similar to the algorithm called "pix2pix" (see the paper "Image-to-Image Translation with Conditional Adversarial Networks" by Phillip Isola et al., Berkeley AI Research (BAIR) Laboratory, UC Berkeley). In this case, to train the generator 302, the difference between the correct image 1020 and the generated image 1030 is fed back in addition to the loss signal of the discriminator 304.
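For reference, the objective used in the cited pix2pix paper can be written as below, with x standing for the input image 1010, y for the correct image 1020, G for the generator 302, and D for the discriminator 304. The noise input of the original formulation is omitted for brevity, and the weight λ is a hyperparameter whose value the source does not give. The source speaks only of "the difference" between the two images; the L1 distance shown here is the choice made in the pix2pix paper and is an assumption as far as this apparatus is concerned.

```latex
\mathcal{L}_{\mathrm{cGAN}}(G, D) =
  \mathbb{E}_{x,y}\bigl[\log D(x, y)\bigr]
  + \mathbb{E}_{x}\bigl[\log\bigl(1 - D(x, G(x))\bigr)\bigr]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y}\bigl[\lVert y - G(x) \rVert_{1}\bigr]

G^{*} = \arg\min_{G}\max_{D}\; \mathcal{L}_{\mathrm{cGAN}}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
```

The first term corresponds to the loss signal from the discriminator, and the second to the fed-back difference between the correct image and the generated image.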

As another example, a GAN known as CycleGAN may be used for the learning processing unit 30. With CycleGAN, learning is possible even when correct images have not been prepared for all of the input images.

In the image processing apparatus of this embodiment, the trained generator 302 produced by the techniques exemplified above is used as the learned model 10. The learned model 10 generates an image indicating the regions (candidate regions) of QR codes in the input image.

In the above, the learned model 10 of a neural network is used to generate an image indicating the regions of QR codes in the input image. A similar technique may be applied to the restoration unit 22. That is, a neural network is trained to generate an image of the QR code before degradation from an image of the degraded QR code, and the trained neural network is used as the restoration unit 22.

As above, a GAN is used as the neural network in this case. FIG. 5 shows an example of a learning processing unit that uses a GAN to generate the learned model for the restoration unit 22.

This learning processing unit holds a large number of pairs of an input image 602 and a correct image 604 as training data. The input image 602 is an image of a QR code that has undergone various forms of degradation such as distortion, loss, dirt, and noise caused by copying, facsimile transmission, background patterns, handwriting, and the like. The correct image 604 is an image of the normal QR code before it underwent such degradation.

The generator 402 generates, from the input image 602 (a degraded QR code), a generated image 606 that approximates the image of the normal QR code before degradation.

Either the correct image 604 or the generated image 606 produced by the generator 402 is input to the discriminator 404. The discriminator 404 identifies whether the image it received is the correct image 604 or the generated image 606 (fake), and outputs a signal indicating the result.

The learning processing unit compares whether the image input to the discriminator 404 was actually a correct image or a fake with the output signal from the discriminator 404, and feeds a loss signal based on the comparison back to the connection weight parameters between the nodes of the neural networks of both the generator 402 and the discriminator 404. In this way, the generator 402 and the discriminator 404 are trained in the same manner as in the learning processing unit 30 described above. The trained generator 402 is then used as the restoration unit 22.

The image processing apparatus exemplified above is realized, for example, by causing a computer to execute a program that expresses the functions described above. Here, the computer has, as hardware, a circuit configuration in which, for example, a microprocessor such as a CPU, memory (primary storage) such as random access memory (RAM) and read-only memory (ROM), a controller that controls fixed storage devices such as flash memory, an SSD (solid state drive), or an HDD (hard disk drive), various I/O (input/output) interfaces, and a network interface that controls connection to a network such as a local area network are connected via, for example, a bus. A program describing the processing of each of these functions is saved to a fixed storage device such as flash memory via a network or the like and is installed on the computer. The functional modules exemplified above are realized when the program stored in the fixed storage device is read into RAM and executed by the microprocessor such as the CPU.

Part of the image processing apparatus, for example a neural network such as the learned model 10, may also be configured as a hardware circuit.

Reference Signs List: 10 learned model, 12 image input unit, 14 candidate region extraction unit, 16 candidate region designation unit, 18 feature extraction unit, 20 image code extraction unit, 22 restoration unit, 24 decoding unit, 30 learning processing unit, 302 generator, 304 discriminator, 1010 input image, 1020 correct image, 1030 generated image.

Claims

An image processing apparatus comprising:
a learned model trained, using combinations of a sample image containing a degraded code image and a correct image indicating the region of the code image in the sample image, to determine the region of a code image in an input image; and
extraction means for extracting the region of a code image from an input image using the learned model.

The image processing apparatus according to claim 1, wherein the learned model is generated by training a GAN (generative adversarial network).

The image processing apparatus according to claim 2, wherein the learned model is the generator of the GAN after training in which the sample image is input to a generator in the GAN, and the estimated image of the correct region generated by the generator in response and the correct image corresponding to the sample image are input to a discriminator in the GAN, which is trained to distinguish between the two.

The image processing apparatus according to any one of claims 1 to 3, further comprising:
a learned restoration model trained, using combinations of a degraded code image and the corresponding code image before degradation, to generate from a degraded input code image a code image in which the degradation has been repaired; and
restoration means for restoring, using the learned restoration model, the image of the region extracted from the input image by the extraction means, and supplying the restored image to decoding means.

The image processing apparatus according to claim 4, wherein the learned restoration model is generated by training a second GAN.

The image processing apparatus according to claim 5, wherein the learned restoration model is the generator of the second GAN after training in which the degraded code image is input to a generator in the second GAN, and the estimated restored code image generated by that generator in response and the pre-degradation code image corresponding to the sample image are input to a discriminator in the second GAN, which is trained to distinguish between the two.

A program for causing a computer to function as:
a learned model trained, using combinations of a sample image containing a degraded code image and a correct region image indicating the region of the code image in the sample image, to determine the region of a code image in an input image; and
extraction means for extracting the region of a code image from an input image using the learned model.