US9552536B2 - Image processing device, information storage device, and image processing method - Google Patents

Image processing device, information storage device, and image processing method

Info

Publication number
US9552536B2
Authority
US
United States
Prior art keywords
image
processing target
target image
processing
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
US14/878,210
Other languages
English (en)
Other versions
US20160026900A1 (en)
Inventor
Jun Ando
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Olympus Corp
Original Assignee
Olympus Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Olympus Corp filed Critical Olympus Corp
Assigned to OLYMPUS CORPORATION reassignment OLYMPUS CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ANDO, JUN
Publication of US20160026900A1 publication Critical patent/US20160026900A1/en
Assigned to OLYMPUS CORPORATION reassignment OLYMPUS CORPORATION CHANGE OF ADDRESS Assignors: OLYMPUS CORPORATION
Application granted granted Critical
Publication of US9552536B2 publication Critical patent/US9552536B2/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06K9/6262
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques
    • G06K9/4642
    • G06K9/4671
    • G06K9/6256
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • G06K9/4676
    • G06K9/72
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/768Arrangements for image or video recognition or understanding using pattern recognition or machine learning using context analysis, e.g. recognition aided by known co-occurring patterns

Definitions

  • the present invention relates to an image processing device, an information storage device, an image processing method, and the like.
  • JP-A-2008-282267 discloses a method that classifies an image based on the feature quantity of part of the target image.
  • the discrimination accuracy (classification accuracy) achieved using the generated discriminator (classifier) normally increases as the amount of training data (i.e., the number of pieces of training data) used for the learning process increases, and it becomes possible to automatically assign a correct label to unlabeled data.
  • the correct answer label is normally manually assigned to the training data. Therefore, it may be difficult to provide a large amount of training data, or the cost required to generate the training data may increase.
  • Semi-supervised learning has been developed from supervised learning.
  • Semi-supervised learning utilizes unlabeled data as the training data in addition to data to which the correct answer label is assigned.
  • Generative learning has been proposed as one type of semi-supervised learning in which image data is mainly used as the learning-discrimination target, and a new image is generated from an image to which the correct answer label is assigned, and used for the learning process.
  • the image to which the correct answer label is assigned can be changed only on condition that the correct answer label does not change when generating a new image, so it is impossible to generate a large number of new images. Therefore, it is impossible to sufficiently increase the number of pieces of training data, and sufficiently improve the discrimination accuracy achieved using the discriminator.
  • a new image may be generated from an image to which the correct answer label is assigned using a method that segments the original image (to which the correct answer label is assigned) into a plurality of images, manually assigns the correct answer label to the generated image group, and uses the resulting image group for the learning process as new training data.
  • the cost required to label the training data increases to a large extent.
  • an image processing device comprising:
  • an input reception section that performs a process that receives a learning image and a correct answer label assigned to the learning image
  • a processing section that performs a process that generates classifier data that is used to classify an image, and a processing target image that is used to generate the classifier data
  • the processing section generating the processing target image that is the entirety or part of the learning image, calculating a feature quantity of the processing target image, generating the classifier data based on training data that is a set of the feature quantity and the correct answer label that is assigned to the learning image that corresponds to the feature quantity, generating an image group based on the learning image or the processing target image, classifying each image of the generated image group using the classifier data to calculate a classification score of each image of the image group, and regenerating the processing target image based on the calculated classification score and the image group.
  • an information storage device storing a program that causes a computer to perform steps of:
  • generating classifier data that is used to classify an image, and a processing target image that is used to generate the classifier data, while generating the processing target image that is the entirety or part of the learning image, calculating a feature quantity of the processing target image, generating the classifier data based on training data that is a set of the feature quantity and the correct answer label that is assigned to the learning image that corresponds to the feature quantity, generating an image group based on the learning image or the processing target image, classifying each image of the generated image group using the classifier data to calculate a classification score of each image of the image group, and regenerating the processing target image based on the calculated classification score and the image group.
  • an image processing method comprising:
  • generating a processing target image that is used to generate classifier data that is used to classify an image, the processing target image being the entirety or part of the learning image;
  • FIG. 1 illustrates a system configuration example according to an embodiment of the invention.
  • FIG. 2 is a flowchart illustrating the flow of a process according to one embodiment of the invention.
  • FIG. 3 is a view illustrating a process that generates an image group from a learning image or a processing target image.
  • FIGS. 4A and 4B are views illustrating a process that generates a processing target image from an image group.
  • FIGS. 5A to 5C are views illustrating a processing target image generated by a process that generates a processing target image.
  • FIGS. 6A and 6B are views illustrating an example in which the area of a processing target image increases.
  • FIGS. 7A to 7D are views illustrating a classification score threshold value.
  • FIGS. 8A and 8B are views illustrating a process that generates an image group.
  • FIGS. 9A to 9C are views illustrating a segmentation size.
  • FIG. 10 is a flowchart illustrating the flow of a process when correcting a processing target image group.
  • FIG. 11 is a view illustrating a process that displays a list of a processing target image group.
  • Several embodiments of the invention may provide an image processing device, an information storage device, an image processing method, and the like that make it possible to detect the position of an object by merely labeling a learning image when performing a learning process using a learning image that partially includes an object or the like that is represented by the correct answer label.
  • Several embodiments of the invention may provide an image processing device, an information storage device, an image processing method, and the like that make it possible to detect the position of an object or the like, and improve the classification accuracy achieved using the resulting classifier by merely labeling a learning image when performing a learning process using a learning image that partially includes an object or the like that is represented by the correct answer label.
  • a new processing target image is automatically generated by merely assigning the correct answer label to the learning image, and learning is implemented while correcting the training data or increasing the number of pieces of training data.
  • a new processing target image is generated based on the classification result and the classification score of each image of the image group, and the correct answer label that corresponds to the contents of the processing target image to be generated is automatically assigned. Specifically, it is possible to automatically generate the processing target image that more accurately represents the contents represented by the correct answer label. Therefore, it is possible to reduce the cost required to generate the training data, and use a large amount of training data for learning.
  • the processing section may regenerate the classifier data based on new training data that is a set of the regenerated processing target image and the correct answer label that is assigned to an image selected from the image group when regenerating the processing target image.
  • the processing section may compare the classification score of each image of the image group with a given threshold value, may select an image for which the classification score has been calculated to be equal to or higher than the given threshold value from the image group, and may regenerate the processing target image based on a selected image group.
  • the processing section may change the given threshold value that is compared with the classification score each time the processing target image has been regenerated.
  • the processing section may set the entirety of the learning image to be a first processing target image.
  • the processing section may generate the processing target image that is smaller in number of pixels or area than a preceding processing target image.
  • the processing section may generate the image group based on the learning image, and may generate the processing target image that is larger in number of pixels or area than a preceding processing target image.
  • the processing section may repeat the process that generates the classifier data and the processing target image a given number of times.
  • the processing section may stop repeating the process that generates the classifier data and the processing target image when a difference in area or number of pixels between a preceding processing target image and a current processing target image is less than a given threshold value.
  • the processing section may segment the learning image or the processing target image into a plurality of areas to generate the image group that is a set of images that respectively correspond to the plurality of areas.
  • the processing section may overlap-segment the learning image or the processing target image into a plurality of areas to generate the image group that is a set of images that respectively correspond to the plurality of areas.
  • the processing section may over-segment the learning image or the processing target image into a plurality of areas to generate the image group that is a set of images that respectively correspond to the plurality of areas.
  • the processing section may change a segmentation size of the learning image or the processing target image each time the process that generates the classifier data and the processing target image has been performed.
  • the processing section may display a processing target image group on a display section, may acquire correction instruction information that instructs to correct the processing target image group, and may perform a correction process on the processing target image group based on the correction instruction information.
  • the processing section may display a list of the processing target image group on the display section, may acquire designation information that designates an unnecessary processing target image from the processing target image group as the correction instruction information, and may delete the processing target image designated by the designation information from the processing target image group.
  • the processing section may calculate the feature quantity of the processing target image using a bag of features.
  • the processing section may perform an object detection process based on the classification score of each image of the image group.
  • supervised learning has been studied in the field of machine learning.
  • the term “supervised learning” used herein refers to a method that performs a learning process using data to which the correct answer label is assigned to generate a discriminator (classifier). After completion of the learning process, the contents of unlabeled data to which the correct answer label is not assigned are discriminated using the generated discriminator to label the unlabeled data.
  • Such supervised learning has been used for a search engine, a log analysis system, and the like for which it is necessary to automatically discriminate the contents of a large amount of data.
  • the contents of an image may be classified using the classifier generated as described above when it is desired to detect the position of an object within an image.
  • An image that partially includes the detection target object or the like may be classified using the classifier.
  • the position of an object within an image may be detected, or the contents of an image may be classified using the method disclosed in JP-A-2008-282267, for example. In such a case, it is necessary to provide data that represents the position, the shape, and the like of an object or the like (within an image) that is represented by the correct answer label as training data in order to perform the learning process using an image that partially includes an object or the like.
  • the discrimination accuracy (classification accuracy) achieved using the resulting discriminator normally increases as the amount of training data used for the learning process increases, and it becomes possible to automatically assign the correct label to unlabeled data.
  • the correct answer label is normally manually assigned to the training data, it may be difficult to provide a large amount of training data, or the cost required to generate the training data may increase.
  • Semi-supervised learning has been developed from supervised learning.
  • Semi-supervised learning utilizes unlabeled data as the training data in addition to data to which the correct answer label is assigned.
  • Generative learning has been proposed as one type of semi-supervised learning in which image data is mainly used as the learning-discrimination target, and a new image is generated from an image to which the correct answer label is assigned, and used for the learning process.
  • Known generative learning is designed on the assumption that, when a new image is generated from an image to which the correct answer label is assigned, the correct answer label assigned to the new image is the same as that assigned to the original image.
  • small noise may be added to the image to which the correct answer label is assigned, or the brightness of the image to which the correct answer label is assigned may be slightly changed, for example.
  • a new image may be generated from an image to which the correct answer label is assigned using a method that segments the original image (to which the correct answer label is assigned) into a plurality of images, for example.
  • the correct answer label assigned to the generated image is not necessarily the same as the correct answer label assigned to the original image.
  • the original image includes a flower and the sky
  • the correct answer label “flower” is assigned to the original image
  • an image that includes only a flower, an image that includes only the sky, and the like may be generated as a result of segmenting the original image (to which the correct answer label is assigned), and it is rare that the correct answer label “flower” is assigned to each of the generated images. Therefore, the generated images (unlabeled images) cannot be used directly for the learning process.
  • the correct answer label may be manually assigned to an image group generated by segmenting an image to which the correct answer label is assigned, and the image group may be used for the learning process as new training data.
  • the cost required to label the training data increases to a large extent.
  • the embodiments of the invention propose an image processing device, an information storage device, an image processing method, and the like that make it possible to detect the position of an object or the like, and improve the classification accuracy achieved using the resulting classifier by merely labeling a learning image when performing a learning process using a learning image that partially includes an object or the like that is represented by the correct answer label.
  • FIG. 1 illustrates a configuration example of an image processing device according to one embodiment of the invention.
  • the image processing device includes an input reception section 110 , a processing section 120 , and a storage section 130 .
  • the input reception section 110 is connected to the processing section 120
  • the processing section 120 is bidirectionally connected to the storage section 130 .
  • the configuration of the image processing device is not limited to the configuration illustrated in FIG. 1 .
  • Various modifications may be made, such as omitting some of the elements illustrated in FIG. 1 , or adding other elements.
  • Some or all of the functions of the image processing device may be implemented by a server that is connected to a network, or may be implemented by a terminal device that includes a display section and the like.
  • the input reception section 110 performs a process that receives a learning image and a correct answer label assigned to the learning image.
  • the input reception section 110 may be a communication section that communicates with an external server or an external storage section through a network that includes at least one of a cable network and a wireless network, or may be an interface that allows the user to input the correct answer label and the like, and includes a keyboard, a mouse, and the like.
  • the processing section 120 performs a process that generates classifier data that is used to classify an image, and the processing target image that is used to generate the classifier data.
  • the function of the processing section 120 may be implemented by hardware such as a processor (e.g., CPU) or an ASIC (e.g., gate array), a program, or the like. The details of the process performed by the processing section 120 are described later.
  • the storage section 130 stores the generated classifier data and the like, and serves as a work area for the processing section 120 and the like.
  • the function of the storage section 130 may be implemented by a memory (e.g., RAM), an HDD, or the like.
  • the input reception section 110 receives a learning image group used for the learning process, and an attribute class label (correct answer label) that is assigned to each learning image (S 101 ).
  • the processing section 120 sets the entirety of each learning image to be the processing target image (S 102 ). It is unnecessary to assign information about the position and the shape of the detection target within each learning image by setting the entirety of the learning image to be the first processing target image.
  • the processing section 120 calculates the feature quantity of each processing target image (S103). Note that the feature quantity is calculated using a bag of features (BoF). When the processing target images differ in size, it is necessary to normalize the frequency of the BoF histogram corresponding to the size of each processing target image.
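  • As a rough illustration of this step, the following Python sketch computes a size-normalized BoF histogram. It assumes a precomputed visual-word codebook (e.g., k-means centers) and local descriptors already extracted from one processing target image; the function name and array layouts are hypothetical, not part of the patent.

```python
import numpy as np

def bof_histogram(descriptors, codebook):
    """Size-normalized bag-of-features histogram (sketch of S103).

    descriptors: (n, d) local feature quantities from one processing
    target image; codebook: (k, d) visual words."""
    # Assign each local descriptor to its nearest visual word.
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :],
                           axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    # Normalizing by the descriptor count makes histograms comparable
    # across processing target images of different sizes.
    return hist / max(hist.sum(), 1.0)
```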
  • the processing section 120 performs a learning process using a set of the calculated feature quantity and the attribute class as training data to generate classifier data (S 104 ).
  • a support vector machine (SVM) is used to generate the classifier data. Note that another learning algorithm, such as kernel discriminant analysis (KDA), may also be used.
  • the processing section 120 then segments (overlap-segments or over-segments) each learning image to generate an image group (S 105 ).
  • the learning image is segmented into a plurality of blocks that may overlap each other to generate an image group that is a set of these images.
  • the image may be over-segmented using a JSEG segmentation method or the like. In this case, it is possible to determine a more accurate boundary between the detection target area and the background based on contour and color information.
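  • The following sketch illustrates block-based overlap-segmentation under the assumptions above (the block and stride values are arbitrary). For over-segmentation, a superpixel method such as skimage.segmentation.slic can stand in for JSEG, for which common libraries offer no implementation.

```python
def overlap_segment(image, block=64, stride=32):
    """Sketch of S105: cut the image into blocks that overlap their
    neighbors by (block - stride) pixels, forming the image group."""
    h, w = image.shape[:2]
    group = []
    for y in range(0, max(h - block, 0) + 1, stride):
        for x in range(0, max(w - block, 0) + 1, stride):
            group.append(image[y:y + block, x:x + block])
    return group
```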
  • the processing section 120 then classifies each image of the generated image group using the classifier data to calculate the classification score (S 106 ).
  • the processing section 120 determines whether or not a termination condition has been satisfied (S 107 ). When it has been determined that the termination condition has been satisfied, the processing section 120 terminates the process.
  • when it has been determined that the termination condition has not been satisfied, the processing section 120 generates the sum of the images of the generated image group for which the classification score is higher than a given threshold value to be a new processing target image (S108). The processing section 120 performs the steps S103 to S108 until the termination condition is satisfied.
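  • Putting the steps S101 to S108 together, a minimal Python sketch of the loop might look as follows. The helpers extract_bof, segment, and combine are hypothetical stand-ins for the feature calculation, segmentation, and image-sum operations described above, and LinearSVC stands in for the SVM; this illustrates the flow only, not the patented implementation itself.

```python
import numpy as np
from sklearn.svm import LinearSVC

def learn_with_regeneration(learning_images, labels, extract_bof,
                            segment, combine, threshold=0.0, n_iters=4):
    # S102: the first processing target image is the entire learning image.
    targets = list(learning_images)
    for _ in range(n_iters):  # S107: terminate after a fixed number of rounds.
        # S103: one feature quantity (e.g., a BoF histogram) per target.
        X = np.array([extract_bof(t) for t in targets])
        # S104: generate classifier data from (feature, label) training pairs.
        clf = LinearSVC().fit(X, labels)
        new_targets = []
        for img in learning_images:
            group = segment(img)  # S105: generate the image group.
            # S106: classification score of each image of the group.
            scores = clf.decision_function(
                np.array([extract_bof(g) for g in group]))
            # S108: the sum of the images scoring above the threshold
            # becomes the new processing target image.
            kept = [g for g, s in zip(group, scores) if s > threshold]
            new_targets.append(combine(img, kept) if kept else img)
        targets = new_targets
    return clf
```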
  • the image processing device includes the input reception section 110 that performs the process that receives the learning image and the correct answer label assigned to the learning image, the processing section 120 that performs the process that generates the classifier data that is used to classify an image, and the processing target image that is used to generate the classifier data, and the storage section 130 that stores the generated classifier data.
  • the processing section 120 generates the processing target image that is the entirety or part of the learning image.
  • the processing section 120 calculates the feature quantity of the processing target image, and generates the classifier data based on the training data that is a set of the feature quantity and the correct answer label that is assigned to the learning image that corresponds to the feature quantity.
  • the processing section 120 generates an image group based on the learning image or the processing target image, classifies each image of the generated image group using the classifier data to calculate the classification score of each image, and regenerates the processing target image based on the calculated classification score and the image group.
  • the term “learning image” used herein refers to an image that is used for the learning process.
  • the term “learning image” used herein refers to the original image of the processing target image and the image group (described later).
  • the learning image may be an image to which the correct answer label is assigned in advance, or may be an image to which the correct answer label is not assigned.
  • the term “label” (class) used herein refers to a word, a symbol, and the like that represent the contents of data (image data).
  • the term “correct answer label” used herein refers to a label that accurately (correctly) represents the contents of data.
  • the correct answer label is a label that corresponds to the class (attribute class) to which data is attributed. For example, when two labels “car” and “other than car” are provided, and an image is classified based on whether or not a car is included in the image, the label “car” is the correct answer label assigned to the learning image LIM 1 illustrated in FIG. 3 that includes two cars and the sky (clouds).
  • the term “classifier” (also referred to as discriminator, learning discriminator, classification model, or discrimination model) used herein refers to a standard, a rule, and the like that are used to determine the label that should be assigned to test data.
  • the classifier is the learning result (classifier data) obtained by performing the learning process using a learning algorithm (e.g., support vector machine (SVM)) and the training data.
  • the term “processing target image” used herein refers to an image that is used directly to generate the classifier data, and is the entirety or part of the learning image. A specific processing target image generation method is described in detail later.
  • the term “training data” used herein refers to data that is input directly to the learning algorithm.
  • the training data is a set of the feature quantity of the processing target image and the correct answer label that is assigned to the learning image that corresponds to the feature quantity.
  • the learning image that corresponds to the feature quantity refers to the original learning image of the processing target image that has the feature quantity.
  • a set of the processing target image and the correct answer label that is assigned to the original learning image of the processing target image is used as the first training data.
  • the training data is not limited thereto. For example, a correct answer label that differs from the correct answer label assigned to the original learning image may be used depending on the processing target image.
  • the image group that is generated from the learning image or the processing target image is a set of images used to regenerate the processing target image.
  • four images IM 1 to IM 4 (image group) are generated from the learning image LIM 1 .
  • the images IM 1 to IM 4 differ in contents from each other, and the correct answer label assigned to each image of the image group is not necessarily the same as the correct answer label assigned to the learning image LIM 1 . Therefore, the correct answer label has not been assigned to each image of the image group immediately after generation, and each image is labeled using the classifier data.
  • the number of images included in the image group is not limited to four (see FIG. 3 ). An arbitrary number of images may be included in the image group.
  • the term “classification score” used herein refers to the degree of likelihood of the classification result (discrimination result).
  • for example, the classification score is the distance from the classification boundary when using an SVM as the classification method (discrimination method), and is the difference in (Mahalanobis) distance from the cluster centers when using a discriminant analysis method.
  • the classification score is the likelihood when using a statistical (Bayesian) discriminator as the classification method, and is the sum of the weighted votes of weak classifiers when using boosting.
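  • Since each classifier type exposes its own notion of a score, a small dispatch function is one way to obtain the classification score uniformly. The sketch below is hypothetical and assumes scikit-learn conventions: the distance from the boundary for margin classifiers, the predicted-class likelihood for probabilistic ones.

```python
def classification_score(clf, X):
    # Margin classifiers (e.g., an SVM) report the signed distance from
    # the classification boundary; probabilistic (Bayesian) classifiers
    # report the likelihood of the predicted class.
    if hasattr(clf, "decision_function"):
        return clf.decision_function(X)
    return clf.predict_proba(X).max(axis=1)
```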
  • the same correct answer label as that of the original learning image is basically assigned to the processing target image to be generated. Note that the configuration is not limited thereto.
  • the processing section 120 may regenerate the classifier data based on new training data that is a set of the regenerated processing target image and the correct answer label that is assigned to the image selected from the image group when regenerating the processing target image.
  • the images IM 1 to IM 4 of the image group generated from the learning image LIM 1 illustrated in FIG. 3 are classified based on the classifier data.
  • the label “other than car” is assigned to the images IM 1 and IM 2
  • the label “car” is assigned to the images IM 3 and IM 4 (see FIG. 4A ).
  • the label “car” is assigned to the processing target image LIM 2 .
  • the label “other than car” is assigned to the processing target image generated using the images IM 1 and IM 2 to which the label “other than car” is assigned. Specifically, the correct answer label that represents the contents of the processing target image is automatically assigned to the processing target image.
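  • As a minimal sketch of this automatic labeling, the regenerated processing target image can inherit the label that the classifier assigned to the selected images; the majority vote below is a hypothetical tie-breaking rule for the case where the selection is mixed.

```python
from collections import Counter

def label_for_target(selected_labels):
    # The most frequent label among the selected images becomes the
    # correct answer label of the regenerated processing target image.
    return Counter(selected_labels).most_common(1)[0][0]
```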
  • it is desirable that the images (selected images) used to generate the processing target image include an identical object. It is determined that the classification result obtained using the classifier data is more likely to be correct as the classification score increases. Specifically, it is likely that an identical object is included in images for which the classification result is identical and the classification score is high.
  • the processing section 120 may compare the classification score of each image of the image group with a given threshold value, select an image for which the classification score has been calculated to be equal to or higher than the given threshold value from the image group, and regenerate the processing target image based on the selected image group.
  • the images IM 3 and IM 4 illustrated in FIG. 4A are selected.
  • the number of images to be selected is not particularly limited, and the processing target image need not necessarily be generated by combining a plurality of selected images (see FIG. 4B ).
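  • A sketch of this selection and regeneration, assuming the image group came from block segmentation with known bounding boxes (the box format and the zero fill outside the selected union are illustrative choices, not specified by the patent):

```python
import numpy as np

def regenerate_target(image, boxes, scores, threshold):
    """Keep only pixels covered by at least one region whose
    classification score is at or above the given threshold."""
    mask = np.zeros(image.shape[:2], dtype=bool)
    for (y0, y1, x0, x1), s in zip(boxes, scores):
        if s >= threshold:
            mask[y0:y1, x0:x1] = True
    target = image.copy()
    target[~mask] = 0  # blank out everything outside the selected union
    return target, mask
```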
  • the processing section 120 may set the entirety of the learning image to be the first processing target image.
  • the processing section 120 may generate the processing target image that is smaller in number of pixels or area than the preceding processing target image.
  • FIGS. 5A to 5C illustrate a specific example.
  • the learning image LIM 1 is segmented into a plurality of areas indicated by the dotted lines.
  • the selected images are selected from the images (image group) that respectively correspond to the plurality of areas, and combined to generate the processing target image LIM 3 illustrated in FIG. 5B .
  • the processing target image LIM 3 is smaller in number of pixels or area than the first processing target image since the areas other than the areas that include the cars are removed.
  • the processing target image LIM 4 illustrated in FIG. 5C can be generated by performing the process while reducing the segmentation size (as described later).
  • the processing target image LIM 4 is an image obtained by trimming the learning image along the contour of the car, and is smaller in number of pixels or area than the processing target image LIM 3 .
  • the processing section 120 may generate the image group based on the learning image, and generate the processing target image that is larger in number of pixels or area than the preceding processing target image.
  • the number of pixels or the area of the processing target image to be generated gradually decreases (or does not change).
  • the number of pixels or the area of the processing target image may increase when generating the image group from the original learning image.
  • FIGS. 6A and 6B illustrate a specific example.
  • the processing target image LIM 5 illustrated in FIG. 6A is an image of a car in which the tires are removed. Specifically, the tires were removed when generating the processing target image since the images of the tires were determined to be other than a car, or the classification score was lower than a given threshold value although the images of the tires were determined to be a car.
  • the processing target image LIM 6 illustrated in FIG. 6B in which the tires TY are recovered can be generated by generating the image group from the original learning image when the classification accuracy achieved using the classification model has been improved by repeated learning.
  • the processing target image in which part of the area represented by the correct answer label is deleted may be generated when the classification score threshold value for selecting an image is set to be large even though learning has not sufficiently advanced, and the classification accuracy achieved using the classifier data is low.
  • the processing section 120 may change the given threshold value that is compared with the classification score each time the processing target image has been regenerated.
  • FIGS. 7A to 7D illustrate a specific example.
  • the process that generates the classifier data and the processing target image is performed four times.
  • FIG. 7A illustrates a graph that represents the relationship between the number of times that the process that generates the classifier data and the processing target image has been performed (horizontal axis), and the threshold value (vertical axis).
  • FIG. 7A illustrates a change in the threshold value corresponding to Case 1 (CS 1 ) and Case 2 (CS 2 ).
  • in Case 1, the threshold value is increased and the segmentation size of the learning image is decreased (as described later) along with an increase in the number of times that the process that generates the classifier data and the processing target image has been performed (see the straight line CS1). In Case 2, the threshold value is fixed at TH4, and the segmentation size of the learning image is fixed at the minimum value (see the straight line CS2).
  • in Case 1, the processing target image LIM7 (that is not formed along the contour of the car, and includes an area other than the car) is generated when the process that generates the classifier data and the processing target image has been performed for the first time, since the small threshold value TH1 is used and the segmentation size is large.
  • the processing target image LIM 7 includes an area other than the car since the threshold value is small, and an image for which the classification score is low is also used to generate the processing target image.
  • in Case 2, the processing target image LIM8 (that is formed along the contour of the car) is generated when the process that generates the classifier data and the processing target image has been performed for the first time, since the threshold value TH4 is used and the segmentation size is a minimum.
  • the tires are removed in the processing target image LIM8, however. If it is not learned that the tires are part of the car based on another piece of training data, it is not likely that the tires are recovered even when the process that generates the classifier data and the processing target image has been performed for the fourth time.
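  • A Case 1 style schedule can be sketched as below; the endpoint threshold and block-size values are hypothetical, and only the monotone behavior (threshold up, segmentation size down per iteration) reflects the description above. Case 2 would simply return the fixed pair corresponding to TH4 and the minimum size.

```python
def case1_schedule(k, n_iters=4, th=(0.0, 1.5), size=(128, 32)):
    """Iteration k = 0 .. n_iters-1: linearly raise the classification
    score threshold and shrink the segmentation size (straight line CS1)."""
    t = k / max(n_iters - 1, 1)
    threshold = th[0] + t * (th[1] - th[0])
    block = int(size[0] + t * (size[1] - size[0]))
    return threshold, block
```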
  • the processing section 120 may calculate the feature quantity of the processing target image using a bag of features (also referred to as a bag of visual words (BoVW)).
  • a value used as the feature quantity (local feature quantity) of the image (pixel) may be a color feature quantity (e.g., hue, saturation, and value (HSV)), a gradient feature quantity (e.g., scale-invariant feature transform (SIFT) or histograms of oriented gradients (HOG)), or a texture feature quantity (e.g., local binary pattern (LBP)).
  • camera setting information (e.g., imaging conditions and focal distance) may also be used as the feature quantity.
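  • The following sketch combines the three kinds of local feature quantity named above into one vector per patch using scikit-image; the patch size, the histogram bin count, and the uint8 RGB input are assumptions made for illustration.

```python
import numpy as np
from skimage.color import rgb2hsv
from skimage.feature import hog, local_binary_pattern

def local_features(patch_rgb):
    """One local feature vector per patch: color (HSV), gradient (HOG),
    and texture (LBP histogram)."""
    hsv = rgb2hsv(patch_rgb).reshape(-1, 3).mean(axis=0)   # color feature
    gray = patch_rgb.mean(axis=2).astype(np.uint8)
    grad = hog(gray, pixels_per_cell=(8, 8),
               cells_per_block=(1, 1))                     # gradient feature
    lbp = local_binary_pattern(gray, P=8, R=1.0)           # texture feature
    tex, _ = np.histogram(lbp, bins=16, range=(0, 256))
    return np.concatenate([hsv, grad, tex.astype(float)])
```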
  • the processing section 120 may segment the learning image or the processing target image into a plurality of areas to generate the image group that is a set of the images that respectively correspond to the plurality of areas (see FIG. 3 ).
  • the processing section 120 may overlap-segment the learning image or the processing target image into a plurality of areas to generate the image group that is a set of the images that respectively correspond to the plurality of areas.
  • in FIG. 8A, the learning image LIM (or the processing target image) is segmented into areas CAR1 to CAR6 that overlap each other to generate the image group.
  • a small area (CAR5, CAR6) may be set over a large area (CAR1 to CAR4).
  • the image that corresponds to the area CAR5 and the image that corresponds to the area CAR6 are used when generating the processing target image that includes the car.
  • the processing section 120 may over-segment the learning image or the processing target image into a plurality of areas to generate the image group that is a set of the images that respectively correspond to the plurality of areas.
  • the term “over-segmentation” used herein refers to segmenting an image into a plurality of areas at the boundary between objects and within each object.
  • in FIG. 8B, the learning image LIM (or the processing target image) is over-segmented into areas CAR5 to CAR15 to generate the image group.
  • the car is segmented into the areas CAR11 to CAR15.
  • the processing section 120 may change the segmentation size of the learning image or the processing target image each time the process that generates the classifier data and the processing target image has been performed.
  • the classification accuracy achieved using the generated classifier data is improved as the process that generates the classifier data is repeated a larger number of times. It is considered that an improvement in classification accuracy gradually decreases as the classifier data is generated after the process that generates the classifier data has been repeated a given number of times. In this case, an improvement in classification accuracy with respect to an identical learning time decreases as the process that generates the classifier data is repeated a larger number of times. Specifically, the cost efficiency of the process that generates the classifier data deteriorates as the process that generates the classifier data is repeated a larger number of times.
  • the processing section 120 may repeat the process that generates the classifier data and the processing target image a given number of times.
  • An improvement in classification accuracy achieved using the classifier data decreases since the processing target image used for learning differs to only a small extent from the preceding processing target image as the process that generates the classifier data and the processing target image is repeated a larger number of times.
  • the processing section 120 may stop repeating the process that generates the classifier data and the processing target image when the difference in area or number of pixels between the preceding processing target image and the current processing target image is less than a given threshold value.
  • FIG. 9A illustrates a graph that represents the relationship between the number of times that the process that generates the classifier data and the processing target image has been performed (horizontal axis), and the area or the number of pixels of the generated processing target image (vertical axis). In the graph, the area of the processing target image generated when the process has been performed for the first time is AR1, the area of the processing target image generated when the process has been performed for the second time is AR2, and the area of the processing target image generated when the process has been performed for the third time is AR3.
  • the area AR0 shown when the process that generates the classifier data and the processing target image has not yet been performed represents the area of the learning image.
  • the difference in area (or number of pixels) between the learning image and the processing target image generated when the process has been performed for the first time is ΔAR01; likewise, the difference between the first and second processing target images is ΔAR12, and the difference between the second and third processing target images is ΔAR23.
  • in the example illustrated in FIGS. 9A to 9C, the process that generates the classifier data and the processing target image is terminated after it has been performed for the fourth time, since the difference ΔAR34 in area between the processing target image LIM10 (see FIG. 9B) generated the third time and the processing target image LIM11 (see FIG. 9C) generated the fourth time is less than the given threshold value TH.
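  • The ΔAR check reduces to a one-line predicate; the sketch below assumes the processing target image is tracked as a boolean pixel mask (as in the regeneration sketch earlier) and that TH is given in pixels.

```python
def should_stop(prev_mask, cur_mask, th_pixels):
    # Stop repeating the generation process when the change in selected
    # area between the preceding and current processing target images
    # falls below the threshold TH.
    return abs(int(cur_mask.sum()) - int(prev_mask.sum())) < th_pixels
```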
  • the processing section 120 may display a processing target image group on a display section, acquire correction instruction information that instructs to correct the processing target image group, and perform a correction process on the processing target image group based on the correction instruction information.
  • correction instruction information refers to information that instructs the details of the correction process performed on the processing target image group, and is input by the user.
  • the processing section 120 may display a list of the processing target image group on the display section, acquire designation information that designates an unnecessary processing target image from the processing target image group as the correction instruction information, and delete the processing target image designated by the designation information from the processing target image group.
  • the processing section 120 performs the correction process on the processing target image group after the step S 108 (S 201 ).
  • a list of the processing target image group is displayed on a display section DS (see FIG. 11 ), and an image that has been determined by the user to be inappropriate as the processing target image (i.e., the image illustrated in FIG. 11 that is enclosed by the cursor CS) is edited or deleted.
  • the processing section 120 may regenerate the processing target image so that the processing target image includes the object or the like that is represented by the correct answer label in an appropriate manner (e.g., at the center of the image).
  • the area of the learning image occupied by the processing target image converges on the area that includes the car (see FIGS. 5A to 5C , for example). Specifically, it is possible to detect the presence and the position of the car within the original learning image.
  • the processing section 120 may perform an object detection process based on the classification score of each image of the image group.
  • the method according to the embodiments of the invention may also be applied to multimedia recognition (e.g., document recognition and voice recognition).
  • part or most of the process performed by the image processing device and the like according to the embodiments of the invention may be implemented by a program or a computer-readable information storage device storing the program.
  • the image processing device and the like according to the embodiments of the invention are implemented by causing a processor (e.g., CPU) to execute a program.
  • a program stored in an information storage device is read, and executed by a processor (e.g., CPU).
  • the information storage device (computer-readable device) stores a program, data, and the like.
  • the function of the information storage device may be implemented by an optical disk (e.g., DVD or CD), a hard disk drive (HDD), a memory (e.g., memory card or ROM), or the like.
  • the processor (e.g., CPU) performs various processes according to the embodiments of the invention based on a program (data) stored in the information storage device.
  • a program that causes a computer (i.e., a device including an operation section, a processing section, a storage section, and an output section) to execute the process implemented by each section is stored in the information storage device.
  • the image processing device and the like according to the embodiments of the invention may include a processor and a memory.
  • the processor may be a central processing unit (CPU), for example. Note that the processor is not limited to a CPU. Various types of processors such as a graphics processing unit (GPU) and a digital signal processor (DSP) may also be used.
  • the processor may be a hardware circuit such as an application specific integrated circuit (ASIC).
  • the memory stores a computer-readable instruction. Each section of the imaging device and the like according to the embodiments of the invention is implemented by causing the processor to execute the instruction.
  • the memory may be a semiconductor memory (e.g., Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM)), a register, a hard disk, or the like.
  • the instruction may be an instruction included in an instruction set of a program, or may be an instruction that causes a hardware circuit of the processor to operate.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
US14/878,210 2013-04-26 2015-10-08 Image processing device, information storage device, and image processing method Active US9552536B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2013093344A JP6188400B2 (ja) 2013-04-26 2013-04-26 Image processing device, program, and image processing method
JP2013-093344 2013-04-26
PCT/JP2014/056886 WO2014174932A1 (ja) 2013-04-26 2014-03-14 Image processing device, program, and image processing method

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/056886 Continuation WO2014174932A1 (ja) 2013-04-26 2014-03-14 Image processing device, program, and image processing method

Publications (2)

Publication Number Publication Date
US20160026900A1 (en) 2016-01-28
US9552536B2 (en) 2017-01-24

Family

ID=51791517

Family Applications (1)

Application Number Title Priority Date Filing Date
US14/878,210 Active US9552536B2 (en) 2013-04-26 2015-10-08 Image processing device, information storage device, and image processing method

Country Status (4)

Country Link
US (1) US9552536B2 (en)
JP (1) JP6188400B2 (ja)
CN (1) CN105144239B (zh)
WO (1) WO2014174932A1 (ja)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210272288A1 (en) * 2018-08-06 2021-09-02 Shimadzu Corporation Training Label Image Correction Method, Trained Model Creation Method, and Image Analysis Device
US11281912B2 (en) * 2019-01-09 2022-03-22 International Business Machines Corporation Attribute classifiers for image classification
US20220172460A1 (en) * 2019-03-14 2022-06-02 Nec Corporation Generation method, training data generation device and program
US11544563B2 (en) 2017-12-19 2023-01-03 Olympus Corporation Data processing method and data processing device
US12026935B2 (en) 2019-11-29 2024-07-02 Olympus Corporation Image processing method, training device, and image processing device

Families Citing this family (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2016315938B2 (en) * 2015-08-31 2022-02-24 Cape Analytics, Inc. Systems and methods for analyzing remote sensing imagery
JP6116650B1 (ja) * 2015-11-17 2017-04-19 NTT Comware Corporation Learning support system, learning support method, learning support device, and learning support program
JP6639523B2 (ja) * 2015-12-22 2020-02-05 Olympus Corporation Automatic learning image selection device, automatic learning image selection method, and automatic learning image selection program
US10963676B2 (en) 2016-12-23 2021-03-30 Samsung Electronics Co., Ltd. Image processing method and apparatus
TWI653885B (zh) * 2017-03-03 2019-03-11 Acer Inc. Image output method and image capture device
JP6542824B2 (ja) 2017-03-13 2019-07-10 Fanuc Corporation Image processing device and image processing method for calculating likelihood of image of object detected from input image
US10963739B2 (en) 2017-03-23 2021-03-30 Nec Corporation Learning device, learning method, and learning program
JP6798619B2 (ja) * 2017-07-31 2020-12-09 Fujitsu Limited Information processing device, information processing program, and information processing method
JP6853159B2 (ja) * 2017-10-31 2021-03-31 Toyota Motor Corporation State estimation device
JP6936957B2 (ja) * 2017-11-07 2021-09-22 Omron Corporation Inspection device, data generation device, data generation method, and data generation program
CN107864333B (zh) * 2017-11-08 2020-04-21 Oppo Guangdong Mobile Communications Co., Ltd. Image processing method and device, terminal, and storage medium
JP6928876B2 (ja) * 2017-12-15 2021-09-01 Kyocera Document Solutions Inc. Form type learning system and image processing device
CN108304848B (zh) * 2018-01-10 2020-04-28 Beike Zhaofang (Beijing) Technology Co., Ltd. Automatic extraction method, system, electronic device, and storage medium for floor plan features
JP6933164B2 (ja) * 2018-03-08 2021-09-08 JVCKenwood Corporation Learning data creation device, learning model creation system, learning data creation method, and program
WO2019203924A1 (en) * 2018-04-16 2019-10-24 Exxonmobil Research And Engineering Company Automation of visual machine part ratings
CN108985214A (zh) * 2018-07-09 2018-12-11 Shanghai Feixun Data Communication Technology Co., Ltd. Method and device for labeling image data
JP6542445B1 (ja) 2018-07-31 2019-07-10 Information System Engineering Inc. Information providing system and information providing method
JP7196529B2 (ja) * 2018-10-18 2022-12-27 Konica Minolta Inc. Information processing device and program
EP3644275A1 (en) * 2018-10-22 2020-04-29 Koninklijke Philips N.V. Predicting correctness of algorithmic segmentation
CN109523518B (zh) * 2018-10-24 2020-11-10 Zhejiang University of Technology Tire X-ray defect detection method
JP2022043364A (ja) * 2018-11-16 2022-03-16 Preferred Networks, Inc. Learning device, object detection device, learning method, and program
KR102119136B1 (ko) * 2018-12-26 2020-06-05 Incheon National University Industry-Academic Cooperation Foundation Intelligent image classification method
US10956487B2 (en) * 2018-12-26 2021-03-23 Industrial Technology Research Institute Method for establishing and processing cross-language information and cross-language information system
JP7075056B2 (ja) * 2018-12-27 2022-05-25 Omron Corporation Image determination device, image determination method, and image determination program
CN109785313B (zh) * 2019-01-21 2023-03-14 Shandong Women's University LBP-based tire qualification detection method
JP7111088B2 (ja) * 2019-01-24 2022-08-02 Casio Computer Co., Ltd. Image search device, learning method, and program
JP7374453B2 (ja) * 2019-03-28 2023-11-07 Ishida Co., Ltd. Trained model generation method, trained model generation device, product discrimination method, product discrimination device, product discrimination system, and weighing device
JP6607590B1 (ja) * 2019-03-29 2019-11-20 Information System Engineering Inc. Information providing system and information providing method
JP6607589B1 (ja) 2019-03-29 2019-11-20 Information System Engineering Inc. Information providing system and information providing method
JP6651189B1 (ja) 2019-03-29 2020-02-19 Information System Engineering Inc. Data structure for machine learning, learning method, and information providing system
US11555701B2 (en) * 2019-05-02 2023-01-17 Corelogic Solutions, Llc Use of a convolutional neural network to auto-determine a floor height and floor height elevation of a building
CN110490237B (zh) * 2019-08-02 2022-05-17 Oppo Guangdong Mobile Communications Co., Ltd. Data processing method and device, storage medium, and electronic device
US11508173B2 (en) * 2019-10-30 2022-11-22 Adobe Inc. Machine learning prediction and document rendering improvement based on content order
CN111104881B (zh) * 2019-12-09 2023-12-01 iFlytek Co., Ltd. Image processing method and related device
WO2022082007A1 (en) 2020-10-15 2022-04-21 Cape Analytics, Inc. Method and system for automated debris detection
WO2023283231A1 (en) 2021-07-06 2023-01-12 Cape Analytics, Inc. System and method for property condition analysis
US11676298B1 (en) 2021-12-16 2023-06-13 Cape Analytics, Inc. System and method for change analysis
AU2023208758A1 (en) 2022-01-19 2024-06-20 Cape Analytics, Inc. System and method for object analysis

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5862259A (en) * 1996-03-27 1999-01-19 Caere Corporation Pattern recognition employing arbitrary segmentation and compound probabilistic evaluation
US20030037036A1 (en) * 2001-08-20 2003-02-20 Microsoft Corporation System and methods for providing adaptive media property classification
US20030174889A1 (en) * 2002-01-08 2003-09-18 Dorin Comaniciu Image segmentation using statistical clustering with saddle point detection
US20080279460A1 (en) 2007-05-11 2008-11-13 Seiko Epson Corporation Scene Classification Apparatus and Scene Classification Method
US7869648B2 (en) * 2003-10-24 2011-01-11 Adobe Systems Incorporated Object extraction based on color and visual texture
US20110170769A1 (en) 2010-01-13 2011-07-14 Hitachi, Ltd. Classifier learning image production program, method, and system
US20110176725A1 (en) 2010-01-21 2011-07-21 Sony Corporation Learning apparatus, learning method and program
US8484139B2 (en) * 2007-08-08 2013-07-09 Hitachi, Ltd. Data classification method and apparatus
US20140029839A1 (en) * 2012-07-30 2014-01-30 Xerox Corporation Metric learning for nearest class mean classifiers

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853400B (zh) * 2010-05-20 2012-09-26 Wuhan University Multi-class image classification method based on active learning and semi-supervised learning
CN101937510B (zh) * 2010-09-14 2015-05-20 Nanjing University of Information Science and Technology Fast incremental learning method based on Haar-like features and an AdaBoost classifier
CN102208037B (zh) * 2011-06-10 2012-10-24 Xidian University Hyperspectral image classification method based on a Gaussian process classifier co-training algorithm
CN102436583B (zh) * 2011-09-26 2013-10-30 Harbin Engineering University Image segmentation method based on learning from labeled images
CN102508859B (zh) * 2011-09-29 2014-10-29 Beijing IZP Network Technology Co., Ltd. Advertisement classification method and device based on web page features
CN102542295B (zh) * 2012-01-08 2013-10-16 Northwestern Polytechnical University Landslide detection method for remote sensing images using image classification
CN103049760B (zh) * 2012-12-27 2016-05-18 Beijing Normal University Sparse representation object recognition method based on image blocking and position weighting

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5862259A (en) * 1996-03-27 1999-01-19 Caere Corporation Pattern recognition employing arbitrary segmentation and compound probabilistic evaluation
US20030037036A1 (en) * 2001-08-20 2003-02-20 Microsoft Corporation System and methods for providing adaptive media property classification
US20030174889A1 (en) * 2002-01-08 2003-09-18 Dorin Comaniciu Image segmentation using statistical clustering with saddle point detection
US7869648B2 (en) * 2003-10-24 2011-01-11 Adobe Systems Incorporated Object extraction based on color and visual texture
US20080279460A1 (en) 2007-05-11 2008-11-13 Seiko Epson Corporation Scene Classification Apparatus and Scene Classification Method
JP2008282267A (ja) 2007-05-11 2008-11-20 Seiko Epson Corp Scene identification device and scene identification method
US8484139B2 (en) * 2007-08-08 2013-07-09 Hitachi, Ltd. Data classification method and apparatus
US20110170769A1 (en) 2010-01-13 2011-07-14 Hitachi, Ltd. Classifier learning image production program, method, and system
JP2011145791A (ja) 2010-01-13 2011-07-28 Hitachi Ltd Classifier learning image production program, method, and system
US20110176725A1 (en) 2010-01-21 2011-07-21 Sony Corporation Learning apparatus, learning method and program
JP2011150541A (ja) 2010-01-21 2011-08-04 Sony Corp Learning apparatus, learning method, and program
US20140029839A1 (en) * 2012-07-30 2014-01-30 Xerox Corporation Metric learning for nearest class mean classifiers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
International Search Report (ISR) dated Jun. 3, 2014 issued in International Application No. PCT/JP2014/056886.

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11544563B2 (en) 2017-12-19 2023-01-03 Olympus Corporation Data processing method and data processing device
US20210272288A1 (en) * 2018-08-06 2021-09-02 Shimadzu Corporation Training Label Image Correction Method, Trained Model Creation Method, and Image Analysis Device
US11830195B2 (en) * 2018-08-06 2023-11-28 Shimadzu Corporation Training label image correction method, trained model creation method, and image analysis device
US11281912B2 (en) * 2019-01-09 2022-03-22 International Business Machines Corporation Attribute classifiers for image classification
US20220172460A1 (en) * 2019-03-14 2022-06-02 Nec Corporation Generation method, training data generation device and program
US11935277B2 (en) * 2019-03-14 2024-03-19 Nec Corporation Generation method, training data generation device and program
US12026935B2 (en) 2019-11-29 2024-07-02 Olympus Corporation Image processing method, training device, and image processing device

Also Published As

Publication number Publication date
CN105144239A (zh) 2015-12-09
CN105144239B (zh) 2018-11-20
US20160026900A1 (en) 2016-01-28
JP2014215852A (ja) 2014-11-17
WO2014174932A1 (ja) 2014-10-30
JP6188400B2 (ja) 2017-08-30

Similar Documents

Publication Publication Date Title
US9552536B2 (en) Image processing device, information storage device, and image processing method
TWI774659B (zh) 圖像文字的識別方法和裝置
US8897575B2 (en) Multi-scale, perspective context, and cascade features for object detection
US11055571B2 (en) Information processing device, recording medium recording information processing program, and information processing method
US20180165551A1 (en) Technologies for improved object detection accuracy with multi-scale representation and training
US9665962B2 (en) Image distractor detection and processng
JP5775225B2 (ja) Text detection using multi-layer connected components with histograms
WO2018103608A1 (zh) Text detection method, device, and storage medium
JP5591360B2 (ja) Method and device for classification and object detection, imaging device, and image processing device
US9165369B1 (en) Multi-object detection and recognition using exclusive non-maximum suppression (eNMS) and classification in cluttered scenes
EP3203417B1 (en) Method for detecting texts included in an image and apparatus using the same
US9202126B2 (en) Object detection apparatus and control method thereof, and storage medium
US9418440B2 (en) Image segmenting apparatus and method
US10169673B2 (en) Region-of-interest detection apparatus, region-of-interest detection method, and recording medium
CN107330027B (zh) Weakly supervised deep TV logo detection method
US20150221097A1 (en) Harmless frame filter, harmful image blocking apparatus having the same, and method for filtering harmless frames
US8254690B2 (en) Information processing apparatus, information processing method, and program
CN114998595B (zh) Weakly supervised semantic segmentation method, semantic segmentation method, and readable storage medium
CN116670687A (zh) Method and system for adapting a trained object detection model to domain shift
JP5796107B2 (ja) Text detection method and device
JP2008165496A (ja) Image normalization device, object detection device, object detection system, and program
JP2006285959A (ja) Learning method for face discrimination device, face discrimination method and device, and program
Jaiswal et al. Automatic image cropping using saliency map
JP6513311B2 (ja) Character recognition device and character recognition method
KR20230006939A (ko) Method and device for determining forged/altered images using noise addition

Legal Events

Date Code Title Description
AS Assignment

Owner name: OLYMPUS CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ANDO, JUN;REEL/FRAME:036756/0868

Effective date: 20150909

AS Assignment

Owner name: OLYMPUS CORPORATION, JAPAN

Free format text: CHANGE OF ADDRESS;ASSIGNOR:OLYMPUS CORPORATION;REEL/FRAME:040578/0441

Effective date: 20160401

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4