JP7437918B2

JP7437918B2 - Information processing device, information processing method, and program

Info

Publication number: JP7437918B2
Application number: JP2019209823A
Authority: JP
Inventors: 泰弘奥野
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-11-20
Filing date: 2019-11-20
Publication date: 2024-02-26
Anticipated expiration: 2039-11-20
Also published as: JP2021082068A

Description

The present invention relates to an information processing device, an information processing method, and a program, and in particular to machine learning.

Training a recognizer that performs recognition processing uses learning data whose characteristics are equivalent to those of the recognition target data input to the recognizer. For example, when recognition processing is performed on captured images that are taken by a camera and intended for human viewing, captured images of similar image quality can be used for training. Training a recognizer usually requires a large amount of learning data, however, and preparing such data is not easy. Non-Patent Document 1 discloses using, as learning data for training a recognizer that performs gaze estimation on real photographs of eyes, data obtained by having a converter transform CG images of eyes so that they look like real photographs. Because it is easy to generate large numbers of CG eye images together with ground-truth data indicating the gaze direction in each CG image, the approach of Non-Patent Document 1 makes it easy to generate learning data that contains realistic-looking eye images obtained by conversion together with the ground-truth data. Non-Patent Document 1 also discloses training such a converter using real photographs of eyes and CG images of eyes.

A. Shrivastava et al. "Learning from Simulated and Unsupervised Images through Adversarial Training." arXiv preprint arXiv:1612.07828, 2016.

Recognition target data may be generated specifically for recognition processing and may therefore have special characteristics. For example, using images of lower quality than viewing images (images intended for human viewing) as recognition images makes it possible to speed up processing or save memory when the recognition images are generated. Compared with viewing images, such recognition images may have more noise, stronger blur, or unnatural color tones. The characteristics of recognition images may also differ from camera to camera. For example, an individual camera may produce images whose left half is dark or whose lower half is noisy. In such cases it is desirable to train the recognizer using learning data obtained by the device that generates the recognition target data, but preparing a large amount of learning data for every device is not easy.

The inventor of the present application considered using a converter to convert viewing images so that they have characteristics equivalent to those of the recognition target data, and using the converted data as learning data. With this configuration, a data set obtained by assigning GT (ground truth, or teacher values) to viewing images can be reused to train recognizers for various devices. The converter itself can be trained using two kinds of data generated by the device that generates the recognition target data: data having the same characteristics as the original data (for example, viewing images) and data having the same characteristics as the recognition target data (for example, images for recognition processing). In other words, the converter can be trained so that the image obtained by converting a viewing image matches the corresponding recognition image.

In such a configuration, however, the conversion accuracy of the converter affects the recognition accuracy of the trained recognizer.

An object of the present invention is to improve the recognition accuracy of a recognizer in a configuration in which the recognizer is trained using learning data obtained through characteristic conversion by a converter.

To achieve the object of the present invention, an information processing device according to the present invention has, for example, the following configuration. That is, the information processing device comprises:
recognizer learning means for training a recognizer that performs recognition processing on data having a second characteristic, using learning data obtained by a converter that uses parameters to convert data having a first characteristic into data having the second characteristic; and
control means for updating the parameters used by the converter based on a result of recognition processing performed by the recognizer on verification data having the second characteristic,
wherein the control means estimates, based on the result of the recognition processing performed by the recognizer on the verification data having the second characteristic, the degree of influence that each parameter used by the converter has on the recognition error of the recognizer, and updates the parameters according to the degree of influence.

In a configuration in which a recognizer is trained using learning data obtained through characteristic conversion by a converter, the recognition accuracy of the recognizer can be improved.

FIG. 1 is a diagram showing a configuration example of an information processing device according to an embodiment.
FIG. 2 is a flowchart showing the flow of an information processing method according to an embodiment.
FIG. 3 is a diagram showing the data flow in converter learning.
FIG. 4 is a diagram explaining the network connections used when calculating the degree of influence.
FIG. 5 is a flowchart showing the flow of the influence calculation processing.
FIG. 6 is a diagram explaining relearning of the converter in an embodiment.
FIG. 7 is a flowchart showing the flow of the converter relearning processing in an embodiment.

Embodiments are described in detail below with reference to the accompanying drawings. The following embodiments do not limit the claimed invention. Although the embodiments describe a plurality of features, not all of these features are essential to the invention, and the features may be combined arbitrarily. In the accompanying drawings, identical or similar components are given the same reference numerals, and duplicate descriptions are omitted.

According to an embodiment of the present invention, a recognizer that performs recognition processing on data having the second characteristic (for example, recognition images) is trained. This training uses learning data obtained by having a converter convert data having the first characteristic (for example, viewing images) into data having the second characteristic. The parameters used by the converter are themselves obtained by learning. According to an embodiment of the present invention, the parameters used by the converter are additionally updated based on the result of recognition processing performed by the recognizer on verification data having the second characteristic. For example, the degree of influence that each parameter used by the converter has on the recognition accuracy of the recognizer can be estimated, and the parameters used by the converter can be relearned based on that degree of influence. With this configuration, the converter can be trained while taking into account which elements of the second-characteristic data input to the recognizer are important for recognition.

For example, if the learning images used to train the recognizer are more bluish than the recognition images, recognition accuracy may not be affected much, whereas if they are more reddish, recognition accuracy may be affected significantly. In that case, when the learning error is evaluated during training of the converter, errors in the red component of the color tone can be weighted more heavily than errors in the blue component. Compared with training the converter only on the error between the converted viewing image and the recognition image, this configuration is expected to yield a converter better suited to recognition processing, and the recognition accuracy of the recognizer is expected to improve as well.
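The idea of weighting the red-component error more heavily than the blue-component error can be sketched as a weighted squared error. This is a minimal illustration only: the function name `weighted_color_error`, the pixel representation as `(r, g, b)` tuples, and the weight values are assumptions, not details taken from the patent.

```python
def weighted_color_error(converted, target, weights=(2.0, 1.0, 0.5)):
    """Mean weighted squared error between two images.

    Each image is a list of (r, g, b) pixels. `weights` holds hypothetical
    per-channel importance values: a larger weight makes errors in that
    channel count more in the converter's learning error.
    """
    if len(converted) != len(target):
        raise ValueError("images must have the same number of pixels")
    total = 0.0
    for pixel, reference in zip(converted, target):
        for weight, value, expected in zip(weights, pixel, reference):
            total += weight * (value - expected) ** 2
    return total / len(converted)


target = [(0.5, 0.5, 0.5), (0.2, 0.4, 0.6)]
too_red = [(0.6, 0.5, 0.5), (0.3, 0.4, 0.6)]    # red channel off by 0.1
too_blue = [(0.5, 0.5, 0.6), (0.2, 0.4, 0.7)]   # blue channel off by 0.1

# The same absolute deviation is penalized more in the red channel.
print(weighted_color_error(too_red, target) > weighted_color_error(too_blue, target))
```

With these weights, a converter trained against this error would be pushed harder to reproduce the red component accurately than the blue component.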

The following embodiments describe the case where the data to be recognized is image data. However, the data to be recognized may be another type of data, such as audio data. For example, when there is a difference in characteristics between audio data collected by a device that records audio for listening and audio data for recognition processing collected by a device that performs recognition, the same effect as in the following embodiments can be expected. The data to be recognized also need not be raw data recorded by an image sensor, a microphone, or the like; it may be feature data extracted from the raw data. For example, the data to be recognized may be an edge image obtained by edge extraction from an image, or a feature vector obtained by extracting a histogram of edge directions.

The following embodiments describe the case of face recognition, but the type of recognition task is not particularly limited. For example, the recognition task may be a detection task that detects a person region or a face region in an image, a classification task that identifies the type of object shown in an image, or a regression task that obtains the number of cars shown in an image by regression. The recognition target is likewise not particularly limited and may be a face, a person, an automobile, or the like.

[Embodiment 1]
Embodiment 1 of the present invention is described below. FIG. 1 shows an example of the hardware configuration of the information processing device according to Embodiment 1. The information processing device according to this embodiment can be realized by a computer that includes a processor and memory. That is, the function of each unit can be realized by a processor such as the CPU 101 executing programs stored in the memories 105 and 106, such as RAM, ROM, or an HDD. FIG. 1 shows that the memory 105 stores the programs that realize each unit. Such programs can also be recorded on a storage medium. FIG. 1 also shows that the memory 106 stores each item of data. The memories 105 and 106 may, however, be integrated. As shown in FIG. 1, the computer has an input device 103 such as a keyboard and an output device 104 such as a display. The bus 102 is a computer bus that connects the components. The information processing device may also be configured from a plurality of computers connected, for example, via a network.

The first data 120 and the second data 121 in the memory 106 are described here. The first data 120 is data having a first characteristic, and the second data 121 is data having a second characteristic. In this specification, a characteristic means a property of the data that is independent of the object the data represents. For example, the first characteristic may be the characteristic of viewing images that a certain camera produces for human viewing, and the second characteristic may be the characteristic of recognition images that are input to the recognizer. A viewing image having the first characteristic is generated so as to give a natural impression when a person looks at it. For example, a viewing image may have the characteristic of being captured by a high-resolution image sensor and subjected to noise removal processing, color processing for obtaining natural colors, or the like. A recognition image having the second characteristic may have the characteristic of being captured by, for example, a low-resolution image sensor without noise removal processing, color processing, or other processing that requires a long processing time.

Data having the first characteristic and data having the second characteristic may represent the same object; for example, they may be captured images of the same object. In this embodiment, an image that a camera generates as a viewing image by capturing a scene is the first data 120 having the first characteristic. An image that the same camera generates as a recognition image by capturing the same scene is the second data 121 having the second characteristic. That is, the first data 120 and the second data 121 may be image data obtained by imaging the same subject. In the following example, the camera performs recognition processing using recognition images. The camera that performs the recognition processing captures a viewing image and a recognition image of the same scene, associates them with each other, and stores them in the memory 106 as the first data 120 and the second data 121.

The first data 120 and the second data 121 each contain a plurality of image data items obtained by imaging various scenes. As described above, each viewing image contained in the first data 120 corresponds to one of the recognition images contained in the second data 121. The first data 120 and the second data 121 are thus obtained by imaging the same scenes but differ in their image quality characteristics. These data do not need ground-truth data about the recognition target. In this embodiment the first data 120 and the second data 121 are image data, but they may instead be feature data obtained by extracting features from images.

The configuration of each unit of the information processing device according to this embodiment, shown in FIG. 1, and each process of the information processing method according to this embodiment are described below with reference to FIG. 2, which is a flowchart showing the flow of the information processing method.

In step S201, the converter learning unit 110 trains the converter. The converter 122 performs image conversion from data having the first characteristic to the corresponding data having the second characteristic. The converter learning unit 110 can train the converter using the first data 120 and the second data 121 in the memory 106 as learning data. The converter learning unit 110 can then store the parameters of the converter 122 obtained by the training in the memory 106. Details of the converter learning unit 110 are described later.
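As a rough illustration of learning converter parameters from paired data, the sketch below fits a toy converter of the form `second = gain * first + offset` by least squares. The linear model, the function name `fit_gain_offset`, and the use of flat lists of pixel values are all assumptions made for illustration; the converter actually used in this embodiment is described later and is not specified at this point in the text.

```python
def fit_gain_offset(first_pixels, second_pixels):
    """Least-squares fit of second ≈ gain * first + offset.

    `first_pixels` and `second_pixels` are equally long lists of pixel
    values taken from corresponding positions of paired images.
    """
    if len(first_pixels) != len(second_pixels) or not first_pixels:
        raise ValueError("need two non-empty lists of equal length")
    n = len(first_pixels)
    mean_x = sum(first_pixels) / n
    mean_y = sum(second_pixels) / n
    var_x = sum((x - mean_x) ** 2 for x in first_pixels)
    if var_x == 0:
        raise ValueError("first_pixels must not all be identical")
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(first_pixels, second_pixels))
    gain = cov_xy / var_x
    offset = mean_y - gain * mean_x
    return gain, offset


# Paired samples: values from a first-characteristic image and the
# corresponding values from the second-characteristic image.
first = [0.0, 0.5, 1.0]
second = [0.1, 0.5, 0.9]
gain, offset = fit_gain_offset(first, second)
print(gain, offset)
```

The fitted `gain` and `offset` play the role of the converter parameters that would be stored and later updated.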

In step S202, the data conversion unit 111 performs image conversion on the learning data 124 in the memory 106 and stores the resulting converted learning data 125 in the memory 106. The learning data 124 is data for learning the object to be recognized. For example, when the recognizer performs face recognition, the learning data 124 contains images and, associated with each image, a ground-truth value such as a label indicating whether a face is present or information on the position of the face. The images in the learning data 124 have the first characteristic. In this embodiment, the learning data 124 is created by assigning ground-truth values to viewing images.

The data conversion unit 111 uses the converter 122 to convert the learning data 124 into the converted learning data 125. The converted learning data 125 contains the images obtained by converting the images in the learning data 124 so that they have the second characteristic, together with the same ground-truth values as the learning data 124. In this embodiment, the images in the converted learning data 125 have the same second characteristic as the recognition images. As described above, the converter 122 uses the parameters obtained by learning to convert the learning data 124 having the first characteristic into the converted learning data 125 having the second characteristic.
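The conversion step changes only the images and carries the ground-truth values over unchanged. A minimal sketch, assuming the learning data is a list of `(image, label)` pairs and using a hypothetical `darken` function as a stand-in for the learned converter:

```python
def convert_learning_data(learning_data, converter):
    """Apply `converter` to every image and keep each label unchanged."""
    return [(converter(image), label) for image, label in learning_data]


def darken(image):
    """Hypothetical stand-in for a learned converter: halve every pixel."""
    return [0.5 * value for value in image]


learning_data = [([0.2, 0.8], "face"), ([0.4, 0.6], "no_face")]
converted_learning_data = convert_learning_data(learning_data, darken)

# Labels are identical before and after conversion; only the images differ.
print([label for _, label in converted_learning_data])
```

Because the labels survive conversion, one labeled set of first-characteristic images can be reused for any device whose converter is available.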

In step S203, the recognizer learning unit 112 trains the recognizer. That is, the recognizer learning unit 112 stores in the memory 106 the parameters of the recognizer 123 obtained by learning the object to be recognized. The object to be recognized is, for example, a human face, and the recognizer learning unit 112 can train a recognizer that recognizes, for example, whether a human face is present in an image. The recognizer learning unit 112 can train the recognizer using the converted learning data 125, which has the second characteristic. The recognizer 123 obtained by this training can perform recognition processing on recognition images having the second characteristic.

The method of training the recognizer is not particularly limited. Any method suitable for training a recognizer from images having the second characteristic and the ground-truth values for those images can be adopted. The following describes the case where the recognizer uses a neural network, such as a multilayer neural network, to perform recognition processing on image data having the second characteristic. In this case, the images contained in the converted learning data 125 are input to the neural network, and the difference between the output value of the neural network and the ground-truth value is backpropagated through the network to update the weight parameters of the neural network.
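The weight update driven by the difference between the network output and the ground-truth value can be illustrated with a single sigmoid unit, which is the simplest case of backpropagation. This is a deliberately reduced sketch: the text describes a multilayer network, and the names `predict` and `train_step`, the learning rate, and the two-pixel "images" are assumptions.

```python
import math


def predict(weights, bias, image):
    """Output of one sigmoid unit for a flat list of pixel values."""
    z = sum(w * x for w, x in zip(weights, image)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_step(weights, bias, image, teacher, learning_rate=0.5):
    """One gradient step on the cross-entropy loss for a single sample."""
    output = predict(weights, bias, image)
    delta = output - teacher  # loss gradient with respect to the pre-activation
    new_weights = [w - learning_rate * delta * x for w, x in zip(weights, image)]
    new_bias = bias - learning_rate * delta
    return new_weights, new_bias


# Toy converted learning data: (image, ground-truth value).
samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
weights, bias = [0.0, 0.0], 0.0
for _ in range(200):
    for image, teacher in samples:
        weights, bias = train_step(weights, bias, image, teacher)

print(predict(weights, bias, [1.0, 0.0]), predict(weights, bias, [0.0, 1.0]))
```

In a multilayer network the same output-minus-teacher term is propagated backward layer by layer to obtain the gradient for every weight.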

In step S204, the influence calculation unit 113 verifies the recognition accuracy of the recognizer 123. The influence calculation unit 113 can verify the recognition accuracy using the verification data 126 in the memory 106. The verification data 126 contains images having the second characteristic and, associated with each image, a ground-truth value indicating the correct recognition result. The images contained in the verification data 126 are developed images that the above-mentioned camera, which performs the recognition processing, generated as recognition images. The number of images contained in the verification data 126 may be smaller than the number of images contained in the learning data 124. If the recognition accuracy of the recognizer 123 on the verification data 126 is equal to or higher than a predetermined threshold, the influence calculation unit 113 determines that the recognition accuracy is sufficient and ends the processing. If the recognition accuracy is below the predetermined threshold, the influence calculation unit 113 determines that the recognition accuracy is insufficient, and the processing proceeds to step S205. The influence calculation unit 113 may also decide to end training of the recognizer, and end the processing, when the number of times the converter relearning described later has been performed reaches or exceeds a threshold.
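The verification in step S204 amounts to measuring accuracy on labeled second-characteristic data and checking two stopping conditions. A minimal sketch follows; the function names, the threshold values, and the toy brightness-based recognizer are assumptions for illustration.

```python
def recognition_accuracy(recognizer, verification_data):
    """Fraction of verification samples whose prediction matches the label."""
    if not verification_data:
        raise ValueError("verification data must not be empty")
    correct = sum(1 for image, teacher in verification_data
                  if recognizer(image) == teacher)
    return correct / len(verification_data)


def should_stop(accuracy, relearn_count, accuracy_threshold=0.9, max_relearn=5):
    """Stop when accuracy is sufficient or the relearning budget is used up."""
    return accuracy >= accuracy_threshold or relearn_count >= max_relearn


def bright_means_face(image):
    """Hypothetical recognizer: report a face when the image is bright."""
    return sum(image) > 1.0


verification_data = [([0.9, 0.8], True), ([0.1, 0.2], False),
                     ([0.7, 0.6], True), ([0.6, 0.6], False)]
accuracy = recognition_accuracy(bright_means_face, verification_data)
print(accuracy, should_stop(accuracy, relearn_count=0))
```

When `should_stop` returns `False`, processing would continue with the influence estimation of step S205.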

In step S205, the influence calculation unit 113 estimates the degree of influence that each parameter used by the converter 122 has on the recognition accuracy of the recognizer 123. The influence calculation unit 113 can store the calculated degrees of influence in the memory 106 as influence data 127. Details of this processing are described later.
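One generic way to gauge how strongly each parameter affects an error value is a central finite difference. The sketch below uses that technique purely as a stand-in; it is not the procedure of this embodiment, whose details are described later. The names `estimate_influence` and `toy_error` are hypothetical, and `toy_error` is an invented function standing in for "recognition error on the verification data as a function of the converter parameters".

```python
def estimate_influence(error_fn, params, eps=1e-3):
    """Approximate |d error / d param| for each parameter by central differences."""
    influences = []
    for i in range(len(params)):
        raised = list(params)
        lowered = list(params)
        raised[i] += eps
        lowered[i] -= eps
        slope = (error_fn(raised) - error_fn(lowered)) / (2 * eps)
        influences.append(abs(slope))
    return influences


def toy_error(params):
    """Invented error surface: sensitive to gain, barely sensitive to offset."""
    gain, offset = params
    return (gain - 1.0) ** 2 + 0.1 * offset ** 2


influence = estimate_influence(toy_error, [1.5, 0.5])
print(influence)  # the first entry is much larger than the second
```

Parameters with a larger influence value are the ones whose errors matter most to the recognizer, so they are natural candidates for heavier weighting during relearning.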

In step S206, the influence calculation unit 113 performs control to update the parameters used by the converter 122 based on the result of the recognition processing performed by the recognizer 123 on the verification data having the second characteristic. In this embodiment, the influence calculation unit 113 can update the parameters used by the converter 122 based on the degree of influence estimated in step S205. Specifically, the influence calculation unit 113 can cause the converter learning unit 110 to relearn each parameter used by the converter 122 based on this degree of influence. In one embodiment, the influence calculation unit 113 causes the converter learning unit 110 to perform relearning using an error evaluation criterion based on the degree of influence. By retraining the converter in this way while changing the error calculation criterion according to the learning data, the parameters of the converter 122 in the memory 106 can be updated. Details of this processing are described later.

After step S206, the processing returns to step S202. In step S202, the data conversion unit 111 performs image conversion on the learning data 124 in the memory 106 using the converter 122 as relearned in step S206. The above processing is then repeated until the processing ends.
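The loop formed by steps S201 to S206 can be sketched as a control function that takes each step as a callable. This is a structural outline under stated assumptions: the function name, the argument names, and the stopping parameters are hypothetical, and each callable is a placeholder for the corresponding unit described above.

```python
def train_with_converter_feedback(train_converter, convert, train_recognizer,
                                  evaluate, estimate_influence, retrain_converter,
                                  accuracy_threshold=0.9, max_relearn=3):
    """Run the S201-S206 loop and return the final state."""
    converter = train_converter()                               # S201
    relearn_count = 0
    while True:
        converted = convert(converter)                          # S202
        recognizer = train_recognizer(converted)                # S203
        accuracy = evaluate(recognizer)                         # S204
        if accuracy >= accuracy_threshold or relearn_count >= max_relearn:
            return converter, recognizer, accuracy, relearn_count
        influence = estimate_influence(converter, recognizer)   # S205
        converter = retrain_converter(converter, influence)     # S206
        relearn_count += 1


# Toy stand-ins: the "converter" is a single quality score that each
# relearning round improves by a fixed amount.
result = train_with_converter_feedback(
    train_converter=lambda: 0.5,
    convert=lambda converter: converter,
    train_recognizer=lambda converted: converted,
    evaluate=lambda recognizer: recognizer,
    estimate_influence=lambda converter, recognizer: 0.25,
    retrain_converter=lambda converter, influence: converter + influence,
)
print(result)
```

Note that the recognizer is retrained from freshly converted data on every pass, because the converter changes in step S206.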

The information processing method shown in FIG. 2 makes it possible to train the recognizer 123. The recognition unit 116 can then use the recognizer obtained by this training to perform recognition processing on data to be recognized and output the recognition result. The data to be recognized is, for example, an unknown image, and is data having the second characteristic. For example, the recognition unit 116 can input unknown image data specified by the user via the input device 103 or the like to the recognizer 123, and cause the output device 104 to display the recognition result that the recognizer 123 outputs for that image data.

The processing of the information processing device according to this embodiment is described in detail below with reference to FIG. 3, which shows an example of the data processed by the information processing device. As shown in FIG. 3, the camera 301 captures the learning data used for converter learning. By imaging one scene, the camera 301 acquires viewing RAW data 302, recognition RAW data 303, and supplementary information 304. The supplementary information 304 is information other than the images that is acquired at the time of capture. The supplementary information 304 can include, for example, imaging position information obtained by a GPS or the like provided in the camera, time information, or focus position information indicating the position on which autofocus is focused. The supplementary information 304 can also include, for example, lens information such as the F-number of the lens, exposure information, white balance information, or scene type information obtained by a scene recognizer provided in the camera.

The camera 301 further generates a viewing image having the first characteristic by developing the viewing RAW data 302 with the viewing development processing 305, and stores this viewing image as the first characteristic image 306. Development processing here means processing that generates an image from the RAW data output by the image sensor. The viewing development parameters 307 are the processing parameters for the viewing development processing 305. The camera 301 also generates a recognition image having the second characteristic by developing the recognition RAW data 303 with the recognition development processing 308, and stores this recognition image as the second characteristic image 309. The recognition development parameters 310 are the development parameters for the recognition development processing 308. The recognition development parameters 310 can be set to values different from the viewing development parameters 307. In this way, the camera 301 can create many sets of a first characteristic image 306 and a second characteristic image 309 for the same scene and store each first characteristic image 306 in association with its second characteristic image 309.

The first characteristic image 306 corresponds to an image included in first data 120 in the memory 106, and the second characteristic image 309 corresponds to an image included in second data 121 in the memory 106. The first data 120 or the second data 121 may include, as metadata, the supplementary information 304 described above that relates to the first characteristic image 306 or the second characteristic image 309. This metadata may also include the development parameters used to develop the first characteristic image 306 or the second characteristic image 309.

The camera 301 can have an image sensor for generating viewing images and an image sensor for generating recognition images. In that case, the image sensor for viewing images generates the viewing RAW data 302 and the image sensor for recognition images generates the recognition RAW data 303, separately. Thus, the first characteristic image 306 and the second characteristic image 309 may be image data obtained by imaging the same scene with different sensors. Alternatively, the camera 301 may have a single image sensor, in which case the viewing RAW data 302 and the recognition RAW data 303 may be identical. Even when the same RAW data is used, the camera can generate the first characteristic image 306 having the first characteristic and the second characteristic image 309 having the second characteristic through the viewing development process 305 and the recognition development process 308. Thus, the first characteristic image 306 and the second characteristic image 309 may also be image data obtained by imaging the same scene and applying different development processes.

Next, the processing performed by the converter learning unit 110 in step S201 is described in detail. The converter learning unit 110 learns the parameters used by the converter 122 so as to reduce the error between the second data and the data having the second characteristic that the converter 122 obtains by converting the first data. The following describes the case where the converter 122 uses a neural network 311 to convert data having the first characteristic into data having the second characteristic. That is, the converter learning unit 110 inputs the first data 120 to the neural network and trains the converter so as to reduce the error between the image output by the neural network and the second data 121 corresponding to the input first data 120.

The converter learning unit 110 can learn the parameters used by the converter 122 using pairs of first data having the first characteristic and second data that has the second characteristic and represents the same object as the first data. In this embodiment, the converter learning unit 110 trains the neural network 311 used by the converter 122 using the pairs of a first characteristic image 306 and a second characteristic image 309 of the same scene prepared as described above. The neural network 311 applies, to the input first characteristic image 306, a data conversion determined by the weight parameters in the neural network 311, and outputs an estimated image 312 having the second characteristic. Metadata such as the supplementary information 304 may additionally be input to the neural network 311 as learning data. That is, the converter learning unit 110 can also use such metadata to learn the parameters used by the converter 122.

Let θc denote the weight parameters of the neural network 311. The converter learning unit 110 calculates, as a learning error E (313), the difference between the estimated image 312 output by the neural network 311 and the second characteristic image 309, which serves as the correct value. Following the error backpropagation method, the converter learning unit 110 then updates the weight parameters θc of the neural network 311 using the error gradient ∂E/∂θc obtained by partially differentiating the learning error E (313) with respect to θc. In this learning, the learning error is the difference in pixel values between the estimated image 312 and the second characteristic image 309. The converter learning unit 110 can repeat this procedure until the learning error converges. When the learning error converges, the converter learning unit 110 ends the learning and stores the learned parameters of the neural network 311 in the memory 106. These parameters include each weight parameter and bias value. The trained converter 122 is thus obtained. When image data having the first characteristic is input to this converter 122, image data having the second characteristic is output.
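
The training loop described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the neural network 311 is replaced by a per-pixel affine map with two parameters `a` and `b` (standing in for θc), the error E is taken to be the mean squared pixel difference, and the learning rate, tolerance, and sample pairs are assumed values.

```python
def train_converter(pairs, lr=0.1, tol=1e-12, max_iters=10000):
    """Fit second ~ a * first + b by gradient descent on the pixel error E."""
    a, b = 1.0, 0.0                                   # initial parameters (theta_c)
    xs = [x for first, _ in pairs for x in first]     # first characteristic pixels
    ys = [y for _, second in pairs for y in second]   # second characteristic pixels
    n = len(xs)
    prev_error = float("inf")
    for _ in range(max_iters):
        residuals = [a * x + b - y for x, y in zip(xs, ys)]
        error = sum(r * r for r in residuals) / n     # learning error E
        if abs(prev_error - error) < tol:             # treat as converged
            break
        grad_a = sum(2 * r * x for r, x in zip(residuals, xs)) / n   # dE/da
        grad_b = sum(2 * r for r in residuals) / n                   # dE/db
        a -= lr * grad_a                              # step against the gradient
        b -= lr * grad_b
        prev_error = error
    return a, b, error

# Paired images of the same scenes; here the second image is 2 * first + 0.1.
pairs = [([0.0, 0.2, 0.4], [0.1, 0.5, 0.9]),
         ([0.6, 0.8, 1.0], [1.3, 1.7, 2.1])]
a, b, error = train_converter(pairs)
```

With a real network, the two hand-written gradients would be replaced by backpropagation through all layers; the loop structure (forward pass, error, gradient, update, convergence check) stays the same.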

Next, the influence calculation unit 113 is described. FIG. 4 shows the connections of the neural networks used by the influence calculation unit 113. In FIG. 4, images 401 are the images included in the verification data 126 in the memory 106, and teacher values 404 are the teacher values included in the verification data 126. A recognizer 402 is the recognizer 123 in the memory 106 obtained by the learning in step S203; when one of the images 401 is input, it outputs a corresponding recognition result 403. A converter 405 is the converter 122 in the memory 106. The output layer of the converter 405, which outputs data having the second characteristic, is connected to the input layer of the recognizer 402, which receives data having the second characteristic. Because the converter 122 is a neural network that outputs an image, its output layer can be connected to the input layer of the recognizer 123, which takes an image as input.

The processing of step S205 performed by the influence calculation unit 113 is described in detail below with reference to FIG. 4 and to the flowchart of FIG. 5, which shows the flow of processing in step S205. As described above, the influence calculation unit 113 can calculate the degree of influence that each parameter used by the converter 122 has on the recognition error of the recognizer 123. This degree of influence can indicate, for example, how much the recognition error of the recognizer 123 changes in response to a change in a parameter used by the converter 122.

In step S501, the influence calculation unit 113 selects one image from the images 401 and inputs it to the recognizer 402 to obtain the corresponding recognition result 403. In step S502, the influence calculation unit 113 calculates a learning error L by comparing the recognition result 403 with the teacher value 404 corresponding to the selected image.

In step S503, the influence calculation unit 113 back-propagates the error L through the neural network used by the recognizer 402 in accordance with the error backpropagation method. In doing so, it propagates the error toward the input without updating the weight parameters of the recognizer 402. The error L is thus back-propagated through the recognizer 402 to the converter 405.

In step S504, the influence calculation unit 113 calculates the partial derivative ∂L/∂θc of the error L back-propagated to the converter 405 with respect to the parameters θc of the converter 122, as the error gradient of the back-propagated error L with respect to θc. This error gradient corresponds to a degree of influence indicating how strongly the parameters θc of the converter 122 affect the recognition error L. Because the influence calculation unit 113 can calculate an error gradient for each parameter of the converter 405, the degree of influence calculated here has the same structure as the parameters of the neural network of the converter 122. The influence calculation unit 113 stores the error gradient thus obtained for each item of verification data in the memory.

In step S505, the influence calculation unit 113 determines whether processing has been performed using all of the images 401 included in the verification data. If not, the processing returns to step S501 and is performed using the next image. If so, the processing proceeds to step S506.

In step S506, the influence calculation unit 113 averages the error gradients obtained for each item of verification data in the preceding steps and stores the result as influence data 127. Because the influence calculation unit 113 can calculate an average error gradient for each parameter of the converter 405, the influence data 127 has the same structure as the parameters of the neural network of the converter 122. The influence calculation unit 113 can therefore store the influence data 127 in the memory in the same format as the neural network parameters. The influence data 127 represents the average error gradient of the recognition error with respect to θc over the verification data 126, and can be regarded as the degree of change in the recognition error of the recognizer 123 in response to a change in the parameters θc used by the converter 122.
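
Steps S501 to S506 can be sketched as below. Several things here are assumptions for illustration only: the converter is a per-pixel affine map with parameters `a` and `b`, the recognizer is a fixed linear scorer, the error is L = (result - teacher)^2, and each verification image is passed forward through the connected converter and recognizer so that ∂L/∂θc is defined.

```python
def influence(theta_c, recognizer_w, verification_data):
    """Average dL/dtheta_c over the verification data (steps S501 to S506)."""
    totals = {"a": 0.0, "b": 0.0}
    for image, teacher in verification_data:
        # Forward pass through the converter, then the recognizer (S501).
        converted = [theta_c["a"] * x + theta_c["b"] for x in image]
        result = sum(w * y for w, y in zip(recognizer_w, converted))
        # Derivative of L = (result - teacher)^2 with respect to the result (S502).
        d_result = 2.0 * (result - teacher)
        # Back-propagate through the recognizer; its weights are not updated (S503).
        d_converted = [d_result * w for w in recognizer_w]
        # Gradient with respect to the converter parameters (S504).
        totals["a"] += sum(g * x for g, x in zip(d_converted, image))
        totals["b"] += sum(d_converted)
    n = len(verification_data)
    # Average over all verification samples (S506); same keys as theta_c.
    return {name: total / n for name, total in totals.items()}

theta_c = {"a": 2.0, "b": 0.1}
recognizer_w = [0.5, -0.25]
verification_data = [([0.2, 0.4], 0.0), ([1.0, 0.0], 1.0)]
influence_data = influence(theta_c, recognizer_w, verification_data)
```

The returned dictionary has one entry per converter parameter, which mirrors the statement that the influence data has the same structure as the converter's parameters.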

Next, the relearning processing in step S206, performed by the influence calculation unit 113 and the converter learning unit 110, is described in detail with reference to FIG. 6. The processing in step S206 is similar to the learning processing in step S201, except that the influence data 127 is used in addition to the image conversion error during learning. That is, the influence calculation unit 113 can cause the converter learning unit 110 to retrain the neural network 602 used by the converter 122 while referring to the influence data 127. In one embodiment, the converter learning unit 110 uses the image conversion error as the error evaluation criterion in step S201, but uses both the influence data 127 and the image conversion error as the error evaluation criterion in step S206. Thus, different error evaluation criteria can be used in steps S201 and S206.

For example, the converter learning unit 110 can train the converter 122 in the same way as in step S201, except for the definition of the error gradient used to update the parameters θc of the neural network 602 used by the converter 122. In step S201, for example, the converter learning unit 110 can update the weight parameters θc of the neural network 311 using the error gradient ∂E/∂θc obtained by partially differentiating the learning error E (313) of the converter 122 with respect to θc. This error gradient ∂E/∂θc can be regarded as the degree of influence that each parameter used by the converter 122 has on the error in converting the first characteristic image 306 into image data having the second characteristic.

In step S206, by contrast, the converter 122 is trained based on both the error gradient ∂E/∂θc and the influence data 127. For example, the degree of influence ∂L/∂θc of the parameters θc on the recognition error L, given by the influence data 127, can be weighted by an arbitrary weight parameter α and added to the error gradient. The converter learning unit 110 can update the parameters θc of the neural network 602 using the resulting gradient ∂E/∂θc + α(∂L/∂θc). The first characteristic image 601 and the second characteristic image 605 used in step S206 may be the same as the first characteristic image 306 and the second characteristic image 309. The estimated image 603 is the output of the neural network 602 for the first characteristic image 601, and the learning error E (604) is the difference in pixel values between the estimated image 603 and the second characteristic image 605, which serves as the correct value. A comparison of FIG. 6 with FIG. 3 shows that only the definition of the gradient used to update θc differs. When the learning error converges, the converter learning unit 110 ends the learning and stores the learned parameters of the neural network 602 in the memory 106.
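
A single relearning update with the combined gradient ∂E/∂θc + α(∂L/∂θc) might look like the sketch below. The function name `relearn_step`, the dictionary representation of θc, the learning rate, and the sample gradient values are all hypothetical, and the update is assumed to step against the combined gradient as in ordinary gradient descent.

```python
def relearn_step(theta_c, grad_E, influence_data, alpha, lr):
    """One update of theta_c using dE/dtheta_c + alpha * dL/dtheta_c."""
    return {
        name: value - lr * (grad_E[name] + alpha * influence_data[name])
        for name, value in theta_c.items()
    }

theta_c = {"a": 2.0, "b": 0.1}
grad_E = {"a": 0.4, "b": -0.2}                 # gradient of the image conversion error
influence_data = {"a": 0.0125, "b": 0.0125}    # stored average of dL/dtheta_c
theta_c_updated = relearn_step(theta_c, grad_E, influence_data, alpha=2.0, lr=0.1)
```

Setting `alpha` to 0 reduces this to the update of step S201, which matches the observation that only the definition of the gradient differs between the two steps.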

According to Embodiment 1 described above, even when the recognition data does not have common characteristics, the recognizer can be trained by reusing learning data that has common characteristics. Furthermore, recognition accuracy is expected to improve because, instead of converting the characteristics of the data by looking only at data similarity, the conversion increases data similarity while giving weight to the elements that strongly affect the accuracy of the recognition processing.

In Embodiment 1, for the training of the converter, the training of the recognizer, and the retraining of the converter, each stage was performed after the preceding one had converged; however, the stages may be performed in sequence without waiting for convergence. Also, although the converter and the recognizer performed their processing using neural networks, their configurations are not limited to this. For example, the converter may be defined using some other model, in which case the parameters of the model can be learned by a gradient method or the like. Furthermore, the degree of influence of each converter parameter on recognition accuracy can be calculated, for example, by slightly varying each parameter in turn and observing how the recognition accuracy for the image data changes.
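
The perturbation-based alternative mentioned above can be sketched with central differences. The `recognition_error` function below is a hypothetical stand-in for running the converter and recognizer on the verification data, it measures error rather than accuracy, and the step size `eps` is an assumed value.

```python
def finite_difference_influence(theta_c, recognition_error, eps=1e-6):
    """Estimate d(error)/d(parameter) for each parameter by central differences."""
    influence = {}
    for name in theta_c:
        plus = dict(theta_c)
        plus[name] += eps          # vary only this parameter upward
        minus = dict(theta_c)
        minus[name] -= eps         # and downward
        influence[name] = (recognition_error(plus) - recognition_error(minus)) / (2 * eps)
    return influence

def recognition_error(theta):
    # Hypothetical error surface, used only to exercise the estimator.
    return (theta["a"] - 1.0) ** 2 + 3.0 * theta["b"]

estimate = finite_difference_influence({"a": 2.0, "b": 0.5}, recognition_error)
```

This needs two evaluations per parameter, so it suits converters with few parameters or models for which backpropagation is not available.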

[Embodiment 2]
As described in Embodiment 1, the second data having the second characteristic can be obtained by applying a conversion process, such as a development process, to original data such as RAW image data. In Embodiment 2, the parameters of this conversion process are learned. For example, when retraining the converter, the information processing apparatus according to Embodiment 2 can also learn the development parameters used to develop recognition images. In an apparatus that performs recognition processing, the parameters of the development process can be set. The development process can include noise removal, white balance adjustment, exposure adjustment, and the like. The development parameters can include, for each of these processes, a parameter that sets the degree of that process. Changing such parameters changes the characteristics of the image obtained by developing the RAW data. The viewing development parameters 307 are difficult to change because they are parameters for developing images that look natural to humans. The recognition image input to the recognizer, on the other hand, may differ from the viewing image and may have image characteristics that look unnatural or low in quality to humans. The recognition development parameters 310 can therefore be set freely so as to obtain a recognition image with characteristics suited to the recognition processing. Learning the development parameters in this way is expected to improve recognition accuracy compared with the case where the characteristics of the recognition image are fixed.

The information processing apparatus according to Embodiment 2 can have the same configuration as that of Embodiment 1, except that the processing in step S206 differs. The processing of step S206 according to this embodiment is described below with reference to FIG. 7, which is a flowchart of the processing.

First, in step S701, the converter learning unit 110 adjusts the parameters used by the converter 122 in accordance with their degree of influence on the recognition error of the recognizer 123. For example, the converter learning unit 110 can use the influence data 127 to update the parameters θc of the converter 122 to θc_new. Because the influence data 127 is the error gradient of the recognition error L with respect to θc, the converter learning unit 110 can change θc along this gradient direction. For example, it can determine θc_new by adding the product of ∂L/∂θc and a learning rate to θc. Updating θc according to the error gradient in this way is expected to improve the recognition error of the recognizer 123. On the other hand, because θc is a parameter of the converter 122, changing θc may increase the error E (313) between the estimated image 312 obtained by converting the first characteristic image 306 and the second characteristic image 309, which is the teacher value. This could in turn lower the recognition accuracy of a recognizer 123 trained using the converted learning data 125 that the converter 122 obtains by converting the learning data 124.

Therefore, in step S702, the converter learning unit 110 learns the recognition development parameters θd (310 in FIG. 3) while keeping the converter parameters fixed at θc_new. Through this learning, the converter learning unit 110 can update θd so that the error E (313) decreases. For example, in step S702 the converter learning unit 110 can obtain the estimated image 312 having the second characteristic that results when the converter 122 converts the first characteristic image 306 using the parameters θc_new adjusted in step S701. It can then update the development parameters θd, which are the parameters of the conversion process, so as to reduce the error between the estimated image 312 and the second characteristic image 309 obtained by applying the conversion process to the recognition RAW data 303, which is the original data.

As a specific example, the converter learning unit 110 can calculate the error E between the estimated image 312 obtained from the first characteristic image 306 according to θc_new and the second characteristic image 309 obtained from the recognition RAW data 303 according to the development parameters θd. It can then update the recognition development parameters θd (310) according to the partial derivative of the error E with respect to θd, that is, ∂E/∂θd. For example, it can update θd by adding the product of ∂E/∂θd and a learning rate to θd. If the error E cannot be partially differentiated with respect to θd and a gradient method therefore cannot be used, the converter learning unit 110 can instead vary the value of θd slightly, search for a direction in which E decreases, and update θd so that E decreases.
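
The update of θd with the converter held fixed can be sketched as follows. This is an assumption-laden toy: the development process is reduced to a single gain `theta_d`, the error E is the mean squared pixel difference, the gradient is estimated by slightly varying θd (the fallback described above), and the update is assumed to move in the direction that lowers E.

```python
def conversion_error(theta_d, raw, estimated):
    """E: mean squared difference between the estimated image and the developed RAW."""
    developed = [theta_d * r for r in raw]         # toy development: a single gain
    return sum((e - d) ** 2 for e, d in zip(estimated, developed)) / len(raw)

def update_theta_d(theta_d, raw, estimated, lr=0.5, eps=1e-4):
    """One step on theta_d using a finite-difference estimate of dE/dtheta_d."""
    e_plus = conversion_error(theta_d + eps, raw, estimated)
    e_minus = conversion_error(theta_d - eps, raw, estimated)
    grad = (e_plus - e_minus) / (2 * eps)
    return theta_d - lr * grad                     # move in the direction that lowers E

raw = [0.1, 0.2, 0.4]                # recognition RAW data (toy values)
estimated = [0.3, 0.6, 1.2]          # converter output with theta_c_new held fixed
theta_d = 1.0
for _ in range(200):
    theta_d = update_theta_d(theta_d, raw, estimated)
```

In this toy setup the estimated image equals three times the RAW data, so the loop drives `theta_d` toward a gain of 3, the value at which the developed image matches what the adjusted converter produces.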

In step S703, the converter learning unit 110 determines whether the learning of θd has converged. If not, the processing of step S702 is repeated. If it has converged, the processing proceeds to step S704. The converter learning unit 110 can determine that the learning has converged when the error E falls below a threshold or when the range of variation of E falls below a threshold. In step S704, the converter learning unit 110 fixes the recognition development parameters θd at the values obtained through the learning up to step S703. It then converts the recognition RAW data 303 into the second characteristic image 309 using the recognition development parameters θd fixed in step S704.
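
The convergence test of step S703 could be written as below. The function name and both thresholds are hypothetical, and the "range of variation" of E is assumed here to mean the change between the two most recent error values.

```python
def has_converged(errors, error_threshold, change_threshold):
    """True if the latest error, or its change from the previous one, is below threshold."""
    if not errors:
        return False
    if errors[-1] < error_threshold:
        return True
    return len(errors) >= 2 and abs(errors[-1] - errors[-2]) < change_threshold
```

A caller would append E to a list after each pass of step S702 and leave the loop once this function returns true.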

In step S705, the converter learning unit 110 retrains the converter 122 by updating its parameters starting from θc_new, using the same technique as in step S201. That is, the converter learning unit 110 can obtain the second characteristic image 309 from the recognition RAW data 303 according to the development parameters θd updated in step S702. It can then update the parameters used by the converter 122 so as to reduce the error E (313) between the estimated image 312, obtained when the converter 122 converts the first characteristic image 306, and the second characteristic image 309. In step S706, the converter learning unit 110 determines whether the learning of the parameters θc of the converter 122 has converged. If not, the processing of step S705 is repeated. If it has converged, the processing proceeds to step S707, where the converter learning unit 110 stores the learned parameters θc of the converter 122 in the memory 106. The converter 122 (or the neural network used by the converter 122) can be trained in this way.

According to Embodiment 2, both the parameters of the conversion process applied to the original data and the converter can be trained so as to reduce the error of the recognition processing.

[Embodiment 3]
The information processing apparatus according to Embodiment 3 can indicate the image regions that strongly affect the recognition accuracy of recognition processing using the converter 122 and the recognizer 123. It is the same as in Embodiment 1 or 2, except that the influence display unit 115 performs the following processing between steps S504 and S505.

In this embodiment, the influence display unit 115 can output, for each element of the verification data, information indicating its degree of influence on the recognition error of the recognizer. For example, the influence display unit 115 can display, for each item of verification data, the image regions that had a large degree of influence. In this case, the influence display unit 115 can back-propagate the error L to the input layer of the converter 122, as shown in FIG. 4. Because an image is input to the input layer of the converter 122, the result of back-propagating the error to the input layer can be treated as an image with the same structure as the image input to the converter 122. The influence display unit 115 can cause the output device 104 to display the image thus obtained. For example, it can superimpose this image on the image of the verification data 126.
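
Back-propagating the error to the input pixels and overlaying the result can be sketched as follows. The per-pixel affine converter, the linear recognizer, the squared-error form of L, the normalization to [0, 1], and the blending `opacity` are all assumptions for illustration; the source only states that the back-propagated result is treated as an image and superimposed.

```python
def input_influence_map(image, teacher, theta_c, recognizer_w):
    """Back-propagate L = (result - teacher)^2 to the converter's input pixels."""
    converted = [theta_c["a"] * x + theta_c["b"] for x in image]
    result = sum(w * y for w, y in zip(recognizer_w, converted))
    d_result = 2.0 * (result - teacher)
    # Chain rule: recognizer weight, then the converter's per-pixel slope.
    grads = [d_result * w * theta_c["a"] for w in recognizer_w]
    magnitudes = [abs(g) for g in grads]
    peak = max(magnitudes)
    return [m / peak if peak > 0 else 0.0 for m in magnitudes]

def overlay(image, heatmap, opacity=0.5):
    """Blend the normalized influence map over the image for display."""
    return [(1 - opacity) * p + opacity * h for p, h in zip(image, heatmap)]

image = [0.2, 0.4, 0.6]
heatmap = input_influence_map(image, teacher=1.0,
                              theta_c={"a": 2.0, "b": 0.1},
                              recognizer_w=[0.5, -0.25, 0.0])
blended = overlay(image, heatmap)
```

The map has one value per input pixel, so it can be displayed with the same layout as the verification image; pixels with larger values are those to which the recognition error is most sensitive.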

This processing is not essential for training the converter and the recognizer. According to Embodiment 3, however, the user can check which image regions strongly affect recognition accuracy. The influence display unit 115 may perform the above processing on only part of the verification data 126.

(Other Embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

The invention is not limited to the embodiments described above, and various changes and modifications can be made without departing from the spirit and scope of the invention. Accordingly, the following claims are appended to make the scope of the invention public.

110: Converter learning unit, 111: Data conversion unit, 112: Recognizer learning unit, 113: Influence degree calculation unit, 115: Influence degree display unit, 116: Recognition unit

Claims

An information processing device comprising:
recognizer learning means for training a recognizer that performs recognition processing on data having a second characteristic, using learning data obtained by a converter converting, with the use of parameters, data having a first characteristic into data having the second characteristic; and
control means for updating the parameters used by the converter based on a result of recognition processing performed by the recognizer on verification data having the second characteristic,
wherein the control means estimates, based on the result of the recognition processing performed by the recognizer on the verification data having the second characteristic, a degree of influence that each parameter used by the converter has on a recognition error of the recognizer, and updates the parameters according to the degree of influence.

The information processing device according to claim 1, wherein the control means obtains, as the degree of influence that each parameter used by the converter has on the recognition error of the recognizer, a degree of change in the recognition error with respect to a change in the parameter.

The information processing device according to claim 1 or 2, wherein
the converter converts data having the first characteristic into data having the second characteristic using a neural network,
the recognizer performs recognition processing on data having the second characteristic using a neural network, and
the control means obtains the degree of influence by back-propagating an error of the recognition processing on the verification data through the recognizer to the converter.

The information processing device according to any one of claims 1 to 3, further comprising converter learning means for learning the parameters used by the converter, using a set of first data having the first characteristic and second data that has the second characteristic and represents the same object as the first data.

The information processing device according to claim 4, wherein the converter learning means learns the parameters used by the converter so as to reduce an error between the second data and data having the second characteristic obtained by the converter converting the first data.

The information processing device according to claim 4 or 5, wherein the converter learning means learns the parameters used by the converter further using metadata related to the first data or the second data.

The information processing device according to claim 6, wherein the metadata includes one or more of imaging position information, time information, focus position information, white balance information, lens information, exposure information, scene type information, and development parameters.

The information processing device according to any one of claims 4 to 7, wherein the control means causes the converter learning means to perform relearning based on the degree of influence that each parameter used by the converter has on the recognition error of the recognizer.

The information processing device according to claim 8, wherein the converter learning means relearns the parameters used by the converter by updating the parameters based on both a degree of influence of each parameter used by the converter on a conversion error in converting the first data into data having the second characteristic, and the degree of influence of each parameter on the recognition error of the recognizer.

The information processing device according to any one of claims 4 to 8, wherein
the second data is data obtained by conversion processing on original data, and
the converter learning means further learns parameters of the conversion processing.

The information processing device according to claim 10, wherein the converter learning means:
adjusts the parameters used by the converter according to the degree of influence on the recognition error of the recognizer;
updates the parameters of the conversion processing so as to reduce an error between data having the second characteristic, obtained by the converter converting the first data using the adjusted parameters, and the second data obtained by the conversion processing on the original data; and
updates the parameters used by the converter so as to reduce an error between data having the second characteristic, obtained by the converter converting the first data, and the second data obtained by applying to the original data the conversion processing whose parameters have been updated.

The information processing device according to claim 10 or 11, wherein the original data is RAW image data, and the conversion processing is development processing on the RAW image data.

The information processing device according to any one of claims 4 to 12, wherein the first data and the second data are image data obtained by imaging the same scene using mutually different sensors, or image data obtained by imaging the same scene and applying mutually different development processing.

The information processing device according to any one of claims 4 to 13, wherein
the information processing device is a camera, and
the camera generates the first data and the second data, which are image data, by imaging the same scene.

The information processing device according to any one of claims 1 to 14, further comprising recognition means for performing recognition processing on data having the second characteristic using the recognizer obtained by the learning.

The information processing device according to any one of claims 1 to 15, further comprising output means for outputting information indicating the degree of influence that each element of the verification data has on the recognition error of the recognizer.

An information processing method performed by an information processing device, the method comprising:
a step of training a recognizer that performs recognition processing on data having a second characteristic, using learning data obtained by a converter converting, with the use of parameters, data having a first characteristic into data having the second characteristic; and
a step of updating the parameters used by the converter based on a result of recognition processing performed by the recognizer on verification data having the second characteristic,
wherein, in the updating step, a degree of influence that each parameter used by the converter has on a recognition error of the recognizer is estimated based on the result of the recognition processing performed by the recognizer on the verification data having the second characteristic, and the parameters are updated according to the degree of influence.

A program for causing a computer to function as each means of the information processing device according to any one of claims 1 to 16.