WO2010128646A1 - Image processing apparatus and method, and program - Google Patents
Image processing apparatus and method, and program
- Publication number
- WO2010128646A1 (PCT/JP2010/057648)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- information
- map
- difference
- subject
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/251—Fusion techniques of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/803—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/162—Detection; Localisation; Normalisation using pixel segmentation or colour matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/167—Detection; Localisation; Normalisation using comparisons between temporally consecutive images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20016—Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Definitions
- the present invention relates to an image processing apparatus, method, and program, and more particularly, to an image processing apparatus, method, and program that can more easily identify a region of a subject on an image.
- Conventionally, a technique called visual attention is known as a technique for specifying the region of a subject on an image (see, for example, Non-Patent Documents 1 and 2).
- In visual attention, luminance information, color information, and edge information are extracted from the input image, and based on the extracted information, an information map indicating the likelihood of the subject in each region of the input image is generated for each piece of extracted information.
- Each information map is normalized by filter processing using a DOG (Difference of Gaussians) filter or by linear normalization, the normalized maps are added together and normalized again, and the information obtained as a result is used as a subject map.
- This subject map is information indicating the likelihood of the subject in each region of the input image, and by using the subject map, it is possible to specify which region on the input image contains the subject.
- More specifically, the average value of the R (red), G (green), and B (blue) components of the pixels of the input image is extracted as the luminance information, and as the color information of the pixels of the input image, the difference between the R and G components and the difference between the B and Y (yellow) components are extracted. In addition, edge strengths in the directions of 0 degrees, 45 degrees, 90 degrees, and 135 degrees are extracted from the input image using a Gabor filter.
- However, the above-described technique requires a large amount of processing, and generating a subject map takes a long time. For example, the filter processing using a Gabor filter requires exponential operations, so its processing amount is large, and since a DOG filter has a large number of taps, the processing amount of the filter processing using a DOG filter also increases.
- If linear normalization is used to suppress the processing amount of normalization, it becomes difficult to remove noise from the information maps at normalization time, and the subject detection accuracy is reduced; that is, a noise region may be erroneously detected as the subject region.
- Furthermore, when the generation of the subject map is implemented as hardware, the circuit scale becomes large due to the exponential operations of the Gabor filter and the number of taps of the DOG filter.
- the present invention has been made in view of such a situation, and makes it possible to obtain information for specifying a subject area on an image more easily and quickly.
- An image processing apparatus according to one aspect of the present invention includes: extracted-information-image generating means for generating a plurality of extracted information images having different resolutions based on an extracted information image composed of predetermined information extracted from each region of an input image; difference-image generating means for generating difference images by obtaining the differences between predetermined pairs of the plurality of extracted information images; information-map generating means for generating, by weighted addition of the plurality of difference images, an information map indicating the feature amount of a feature possessed by the region of the subject on the input image; normalizing means for normalizing the information map by subtracting the average of the values of the regions of the information map from the value of each region of the information map; and subject-map generating means for generating, by weighted addition of a plurality of the normalized information maps, a subject map indicating the subject-region likeness of each region of the input image.
- The image processing apparatus can further include edge-image generating means for generating, as the extracted information image, an image indicating the edge strength of each region of the input image by weighted addition of the pixel values of some of the pixels of the input image using predetermined coefficients.
- The extracted-information-image generating means can generate the plurality of extracted information images having different resolutions by setting the average value of the pixel values of mutually adjacent pixels of one extracted information image as the pixel value of a pixel of another extracted information image.
- An image processing method or program according to one aspect of the present invention includes the steps of: generating a plurality of extracted information images having different resolutions based on an extracted information image composed of predetermined information extracted from each region of an input image; generating difference images by obtaining the differences between predetermined pairs of the extracted information images; generating, by weighted addition of the plurality of difference images, an information map indicating the feature amount of a feature possessed by the region of the subject on the input image; normalizing the information map by subtracting the average of the values of the regions of the information map from the value of each region of the information map; and generating, by weighted addition of a plurality of the normalized information maps, a subject map indicating the likelihood of the subject in each region of the input image.
- In one aspect of the present invention, a plurality of extracted information images having different resolutions are generated based on an extracted information image composed of predetermined information extracted from each region of an input image; difference images are generated by obtaining the differences between predetermined pairs of the extracted information images; an information map indicating the feature amount of a feature possessed by the region of the subject on the input image is generated by weighted addition of the plurality of difference images; the information map is normalized by subtracting the average of the values of its regions from the value of each region; and a subject map indicating the likelihood of the subject in each region of the input image is generated by weighted addition of the plurality of normalized information maps.
- information for specifying a subject area on an image can be obtained more easily and quickly.
- FIG. 1 is a diagram showing a configuration example of an embodiment of an image processing apparatus to which the present invention is applied. FIG. 2 is a diagram showing a configuration example of a luminance information extraction unit.
- FIG. 1 is a diagram showing a configuration example of an embodiment of an image processing apparatus to which the present invention is applied.
- The image processing apparatus 11 includes a luminance information extraction unit 21, a color information extraction unit 22, an edge information extraction unit 23, a face information extraction unit 24, a motion information extraction unit 25, a subject map generation unit 26, and a subject region specifying unit 27.
- The image processing apparatus 11 includes an imaging device that captures an input image including a subject, and the input image obtained by imaging is supplied to the luminance information extraction unit 21 through the motion information extraction unit 25 and to the subject region specifying unit 27.
- This input image is a video signal composed of a Y (luminance) component, a Cr (color difference) component, and a Cb (color difference) component.
- The luminance information extraction unit 21 through the motion information extraction unit 25 extract predetermined information from the supplied input image and, based on extracted information images composed of the extracted information, generate information maps indicating the likelihood of the subject in each region of the input image.
- The information contained in these information maps indicates the feature amounts of features that appear more strongly in regions containing the subject, and an information map is that information arranged in correspondence with each region of the input image. That is, an information map can be said to be information indicating the feature amount in each region of the input image.
- the subject means an object on the input image that is estimated to be noticed by the user when the user glances at the input image, that is, an object that is estimated to be looked at by the user. Therefore, the subject is not necessarily limited to a person.
- the luminance information extraction unit 21 to the motion information extraction unit 25 generate a luminance information map, a color information map, an edge information map, a face information map, and a motion information map as information maps.
- the luminance information extraction unit 21 generates a luminance information map using the luminance image composed of the Y (luminance) component of the supplied input image as an extraction information image, and supplies the luminance information map to the subject map generation unit 26.
- the color information extraction unit 22 generates a color information map using the Cr image composed of the Cr component and the Cb image composed of the Cb component of the supplied input image as an extraction information image, and supplies the color information map to the subject map generation unit 26.
- the edge information extraction unit 23 generates an edge information map using an edge image composed of the edge intensity of each region of the supplied input image as an extraction information image, and supplies the generated edge information map to the subject map generation unit 26.
- the face information extraction unit 24 generates a face information map using an image including information about a human face as a subject in each region of the supplied input image as an extraction information image, and supplies the face information map to the subject map generation unit 26.
- the motion information extraction unit 25 generates a motion information map using an image including information regarding motion in each region of the supplied input image as an extraction information image, and supplies the motion information map to the subject map generation unit 26.
- the subject map generating unit 26 adds the information maps supplied from the luminance information extracting unit 21 to the motion information extracting unit 25 to generate a subject map, and supplies the subject map to the subject region specifying unit 27.
- This subject map is information for specifying a region including a subject in the input image.
- the subject region specifying unit 27 uses the subject map from the subject map generating unit 26 to specify the region of the subject on the supplied input image and outputs the specification result.
- FIG. 2 is a block diagram illustrating a configuration example of the luminance information extraction unit 21.
- the luminance information extraction unit 21 includes a pyramid image generation unit 51, a difference calculation unit 52, a weighted addition unit 53, and a normalization unit 54.
- The pyramid image generation unit 51 uses the image composed of the Y component of the supplied input image as a luminance image, generates a plurality of luminance images having different resolutions from it, and supplies these luminance images to the difference calculation unit 52 as luminance pyramid images. Note that the pixel value of each pixel of the luminance image generated from the input image is the value of the Y component of the pixel of the input image at the same position.
- pyramid images L1 to L7 having seven resolution layers from level L1 to level L7 are generated.
- the pyramid image L1 at the level L1 has the highest resolution, and the resolution of the pyramid image is sequentially decreased from the level L1 to the level L7.
- a luminance image having the same resolution (number of pixels) as the input image, which is composed of the Y component of the input image, is set as the pyramid image L1 of level L1.
- That is, the pyramid image L(i+1) of level L(i+1) is generated from the pyramid image Li of level Li (where 1 ≤ i ≤ 6).
- First, the pyramid image Li is down-converted so as to have half the number of pixels in the horizontal direction in the figure, and the resulting image is taken as the image Li′. For example, the average of the pixel values of pixels g1 and g2, adjacent in the horizontal direction in the pyramid image Li, is used as the pixel value of pixel g3 of the image Li′.
- Next, the image Li′ is down-converted so as to have half the number of pixels in the vertical direction in the figure, and the resulting image is taken as the pyramid image L(i+1) of level L(i+1). For example, the average of the pixel values of pixels g3 and g4, adjacent in the vertical direction in the image Li′, is used as the pixel value of pixel g5 of the pyramid image L(i+1).
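- A minimal sketch of this down-conversion in Python with NumPy follows (an illustration, not the patent's implementation; the function names are placeholders, and the image dimensions are assumed divisible so that the pairwise averaging works out):

```python
import numpy as np

def downconvert(image: np.ndarray) -> np.ndarray:
    """Halve the resolution by averaging adjacent pixels: first horizontally
    (g1, g2 -> g3), then vertically (g3, g4 -> g5)."""
    # average horizontally adjacent pixel pairs
    img_prime = (image[:, 0::2] + image[:, 1::2]) / 2.0
    # average vertically adjacent pixel pairs
    return (img_prime[0::2, :] + img_prime[1::2, :]) / 2.0

def build_pyramid(luminance: np.ndarray, levels: int = 7) -> list:
    """Level L1 is the luminance image itself; L2..L7 are successive down-conversions.
    Assumes the image dimensions are divisible by 2**(levels - 1)."""
    pyramid = [luminance.astype(np.float64)]
    for _ in range(levels - 1):
        pyramid.append(downconvert(pyramid[-1]))
    return pyramid
```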
- The difference calculation unit 52 selects two pyramid images of different layers from among the pyramid images of the respective layers supplied from the pyramid image generation unit 51, obtains the difference between the selected pyramid images, and generates a luminance difference image.
- Since the pyramid images of the respective layers have different sizes (numbers of pixels), when generating a difference image, the smaller pyramid image is up-converted to match the size of the larger one.
- When the difference calculation unit 52 has generated a predetermined number of luminance difference images, it supplies the generated difference images to the weighted addition unit 53.
- the weighted addition unit 53 generates a luminance information map by weighted addition of the difference image supplied from the difference calculation unit 52 and supplies the luminance information map to the normalization unit 54.
- the normalization unit 54 normalizes the luminance information map from the weighted addition unit 53 and supplies it to the subject map generation unit 26.
- FIG. 5 is a block diagram illustrating a configuration example of the color information extraction unit 22.
- The color information extraction unit 22 includes a pyramid image generation unit 81, a pyramid image generation unit 82, a difference calculation unit 83, a difference calculation unit 84, a weighted addition unit 85, a weighted addition unit 86, a normalization unit 87, and a normalization unit 88.
- The pyramid image generation unit 81 uses the image composed of the Cr component of the supplied input image as a Cr image, and the pyramid image generation unit 82 uses the image composed of the Cb component of the supplied input image as a Cb image.
- the pixel values of the pixels of the Cr image and the Cb image are the values of the Cr component and the Cb component of the pixel of the input image at the same position as the pixel.
- the pyramid image generation unit 81 and the pyramid image generation unit 82 generate a plurality of Cr images and Cb images having different resolutions using the Cr image and the Cb image. Then, the pyramid image generation unit 81 and the pyramid image generation unit 82 supply the generated Cr image and Cb image to the difference calculation unit 83 and the difference calculation unit 84 as a Cr pyramid image and a Cb pyramid image.
- pyramid images of seven resolution layers from level L1 to level L7 are generated.
- The difference calculation unit 83 and the difference calculation unit 84 select two pyramid images of different layers from among the plurality of pyramid images supplied from the pyramid image generation unit 81 and the pyramid image generation unit 82, obtain the difference between the selected pyramid images, and thereby generate Cr difference images and Cb difference images. At this time, since the pyramid images of the respective layers have different sizes, the smaller pyramid image is up-converted to the same size as the larger one.
- When the difference calculation unit 83 and the difference calculation unit 84 have generated a predetermined number of Cr difference images and Cb difference images, they supply the generated difference images to the weighted addition unit 85 and the weighted addition unit 86.
- The weighted addition unit 85 and the weighted addition unit 86 weight-add the difference images supplied from the difference calculation unit 83 and the difference calculation unit 84 to generate a Cr color information map and a Cb color information map, which are supplied to the normalization unit 87 and the normalization unit 88.
- the normalization unit 87 and the normalization unit 88 normalize the color information maps from the weighted addition unit 85 and the weighted addition unit 86 and supply them to the subject map generation unit 26.
- FIG. 6 is a block diagram illustrating a configuration example of the edge information extraction unit 23.
- The edge information extraction unit 23 includes an edge image generation unit 111 through an edge image generation unit 114, a pyramid image generation unit 115 through a pyramid image generation unit 118, a difference calculation unit 119 through a difference calculation unit 122, a weighted addition unit 123 through a weighted addition unit 126, and a normalization unit 127 through a normalization unit 130.
- The edge image generation unit 111 through the edge image generation unit 114 perform filter processing on the supplied input image and generate, as extracted information images, edge images whose pixel values are the edge strengths in the directions of, for example, 0 degrees, 45 degrees, 90 degrees, and 135 degrees.
- the pixel value of the pixel of the edge image generated by the edge image generation unit 111 indicates the edge strength in the direction of 0 degree in the pixel of the input image at the same position as the pixel.
- the direction of each edge refers to a direction determined based on a predetermined direction on the input image.
- the edge image generation unit 111 to the edge image generation unit 114 supply the generated edge image to the pyramid image generation unit 115 to the pyramid image generation unit 118.
- the pyramid image generation unit 115 through the pyramid image generation unit 118 generate a plurality of edge images having different resolutions using the edge images supplied from the edge image generation unit 111 through the edge image generation unit 114. Then, the pyramid image generation unit 115 to the pyramid image generation unit 118 supply the generated edge images in the respective directions to the difference calculation unit 119 to the difference calculation unit 122 as pyramid images in the respective directions of the edges.
- The difference calculation unit 119 through the difference calculation unit 122 select two pyramid images of different layers from among the plurality of pyramid images supplied from the pyramid image generation unit 115 through the pyramid image generation unit 118, obtain the difference between the selected pyramid images, and thereby generate difference images for each edge direction.
- the smaller pyramid image is up-converted to have the same size as the larger pyramid image.
- When the difference calculation unit 119 through the difference calculation unit 122 have generated a predetermined number of difference images for each edge direction, they supply the generated difference images to the weighted addition unit 123 through the weighted addition unit 126.
- The weighted addition unit 123 through the weighted addition unit 126 weight-add the difference images supplied from the difference calculation unit 119 through the difference calculation unit 122 to generate an edge information map for each direction, and supply the maps to the normalization unit 127 through the normalization unit 130.
- the normalization unit 127 to normalization unit 130 normalize the edge information maps from the weighted addition unit 123 to the weighted addition unit 126 and supply them to the subject map generation unit 26.
- FIG. 7 is a block diagram illustrating a configuration example of the face information extraction unit 24.
- the face information extraction unit 24 includes a face detection unit 161, a face information map generation unit 162, and a normalization unit 163.
- the face detection unit 161 detects a human face region as a subject from the supplied input image, and supplies the detection result to the face information map generation unit 162 as an extracted information image.
- the face information map generation unit 162 generates a face information map based on the detection result from the face detection unit 161 and supplies the face information map to the normalization unit 163.
- the normalization unit 163 normalizes the face information map supplied from the face information map generation unit 162 and supplies the normalized face information map to the subject map generation unit 26.
- FIG. 8 is a block diagram illustrating a configuration example of the motion information extraction unit 25.
- the motion information extraction unit 25 includes a local motion vector extraction unit 191, a global motion vector extraction unit 192, a difference calculation unit 193, and a normalization unit 194.
- The local motion vector extraction unit 191 detects, as a local motion vector, the motion vector of each pixel of the input image using the supplied input image and another input image captured at a different time, and supplies the result to the difference calculation unit 193.
- the global motion vector extraction unit 192 detects a global motion vector using the supplied input image and another input image whose imaging time is different from the input image, and supplies the global motion vector to the difference calculation unit 193.
- This global motion vector indicates the direction of motion of the entire input image, and is, for example, an average value of motion vectors of each pixel of the input image.
- The difference calculation unit 193 obtains the absolute value of the difference between the local motion vector from the local motion vector extraction unit 191 and the global motion vector from the global motion vector extraction unit 192 to generate a motion difference image, and supplies it to the normalization unit 194.
- the pixel value of the pixel in the motion difference image is the absolute value of the difference between the local motion vector of the pixel of the input image at the same position as the pixel and the global motion vector of the entire input image. Therefore, the pixel value of the pixel of the motion difference image indicates the relative amount of motion of the object (or background) displayed on the pixel of the input image with respect to the entire input image, that is, the background.
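- The following is a hedged sketch of this computation (it assumes the per-pixel local motion vectors are already available as a NumPy array and, as described above, takes the global motion vector to be the average of the local ones):

```python
import numpy as np

def motion_difference_image(local_mv: np.ndarray) -> np.ndarray:
    """local_mv: array of shape (H, W, 2) holding a (dx, dy) motion vector per pixel.
    The global motion vector is the mean of all local vectors; each output pixel is
    the magnitude of (local - global), i.e. motion relative to the image as a whole."""
    global_mv = local_mv.reshape(-1, 2).mean(axis=0)   # direction of motion of the whole image
    relative = local_mv - global_mv                    # per-pixel motion relative to the whole
    return np.linalg.norm(relative, axis=2)            # absolute value of the difference
```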
- the normalization unit 194 generates a motion information map by normalizing the motion difference image from the difference calculation unit 193 and supplies the motion information map to the subject map generation unit 26.
- Note that the motion information map is generated when input images captured consecutively in time are supplied, for example when the input images are continuously shot still images or frames of a moving image.
- In step S11, the luminance information extraction unit 21 performs luminance information extraction processing, generates a luminance information map based on the supplied input image, and supplies it to the subject map generation unit 26.
- In step S12, the color information extraction unit 22 performs color information extraction processing, generates a color information map based on the supplied input image, and supplies it to the subject map generation unit 26.
- In step S13, the edge information extraction unit 23 performs edge information extraction processing, generates an edge information map based on the supplied input image, and supplies it to the subject map generation unit 26.
- In step S14, the face information extraction unit 24 performs face information extraction processing, generates a face information map based on the supplied input image, and supplies it to the subject map generation unit 26.
- In step S15, the motion information extraction unit 25 performs motion information extraction processing, generates a motion information map based on the supplied input image, and supplies it to the subject map generation unit 26.
- the motion information extraction process is not performed when an input image captured continuously in time is not supplied to the motion information extraction unit 25.
- In step S16, the subject map generation unit 26 generates a subject map by weighted addition of the luminance information map through the motion information map supplied from the luminance information extraction unit 21 through the motion information extraction unit 25, and supplies the subject map to the subject region specifying unit 27.
- For example, the subject map generation unit 26 linearly combines the information maps using the information weight Wb, a weight obtained in advance for each information map. That is, when a given pixel of the linearly combined information map is taken as the pixel of interest, its pixel value is the sum, over all information maps, of the pixel value at the same position in each information map multiplied by the information weight Wb for that map.
- the subject map generation unit 26 performs arithmetic processing using a sigmoid function on the pixel value of each pixel of the information map obtained by linear combination (hereinafter also referred to as linear combination information map).
- the subject map generator 26 holds in advance a conversion table obtained by tabulating the sigmoid function.
- This conversion table takes a predetermined value as input and returns the output value obtained by substituting that value into the sigmoid function. Converting the linearly combined information map with the conversion table yields an information map similar to the one obtained by converting it directly with the sigmoid function.
- Here, the sigmoid function is, for example, the hyperbolic tangent function shown in the following equation (1): f(x) = a × tanh(x × b) … (1)
- In equation (1), a and b represent predetermined constants, and x is the pixel value of a pixel of the linearly combined information map to be converted.
- For example, the conversion table is obtained by limiting the range of the input value x to −2 to 2 and discretizing the input value x in units of 1/128.
- In such a conversion table, an input value x smaller than −2 is treated as −2, and an input value x larger than 2 is treated as 2. Further, in the conversion table, the output value f(x) increases as the input value x increases.
- The subject map generation unit 26 converts the linearly combined information map by changing the pixel value of each of its pixels from the pixel value x (input value x) to the corresponding output value f(x). That is, the subject map generation unit 26 treats the linearly combined information map converted with the conversion table as the linearly combined information map subjected to arithmetic processing by the sigmoid function.
- the subject map generation unit 26 multiplies the pixel value of each pixel of the linear combination information map converted by the conversion table by a subject weight Wc, which is a weight obtained in advance for each pixel, to obtain a subject map.
- That is, when a given pixel of the subject map is taken as the pixel of interest, its pixel value is the value obtained by multiplying the pixel value of the pixel at the same position in the converted linearly combined information map by the subject weight Wc.
- More specifically, the Cr color information map and the Cb color information map are used as the color information maps for generating the subject map, and the edge information maps in each of the directions of 0 degrees, 45 degrees, 90 degrees, and 135 degrees are used as the edge information maps. The information weight Wb and the subject weight Wc are obtained in advance by learning.
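- The subject map computation of step S16 can be sketched as follows (an assumption-laden illustration: the constants a and b of equation (1) are placeholders, since the patent only specifies that the table spans −2 to 2 in 1/128 steps and that Wb and Wc are learned in advance):

```python
import numpy as np

# Hypothetical constants a and b of equation (1); the patent does not give their values.
A, B = 1.0, 1.0
STEP = 1.0 / 128.0
TABLE_X = np.arange(-2.0, 2.0 + STEP, STEP)   # input values x, discretized in 1/128 units
TABLE_FX = A * np.tanh(TABLE_X * B)           # precomputed outputs of f(x) = a * tanh(x * b)

def apply_conversion_table(linear_map: np.ndarray) -> np.ndarray:
    """Sigmoid via table lookup: inputs below -2 are treated as -2, above 2 as 2."""
    x = np.clip(linear_map, -2.0, 2.0)
    idx = np.round((x + 2.0) / STEP).astype(int)
    return TABLE_FX[idx]

def generate_subject_map(info_maps, Wb, Wc: np.ndarray) -> np.ndarray:
    """info_maps: normalized information maps of equal shape; Wb: one learned weight
    per information map; Wc: learned per-pixel subject weight of the same shape."""
    linear = sum(w * m for w, m in zip(Wb, info_maps))   # linear combination with weights Wb
    converted = apply_conversion_table(linear)           # arithmetic processing by the sigmoid
    return converted * Wc                                # multiply by the subject weight Wc
```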
- the subject map is supplied from the subject map generating unit 26 to the subject region specifying unit 27, and the process proceeds to step S17.
- In step S17, the subject region specifying unit 27 uses the subject map supplied from the subject map generation unit 26 to specify the region of the subject on the supplied input image.
- For example, the subject region specifying unit 27 detects a region on the subject map whose pixel values are equal to or greater than a predetermined threshold and whose area (number of pixels) is equal to or greater than a predetermined size, and takes the region on the input image corresponding to the detected region as the region including the subject.
- When the subject region specifying unit 27 has detected the region including the subject on the input image, it outputs the detection result to the subsequent stage, and the subject region specifying process ends.
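- A sketch of this region detection follows (hedged: the patent does not prescribe how connected regions are found; SciPy's connected-component labeling stands in for that step here):

```python
import numpy as np
from scipy import ndimage

def specify_subject_region(subject_map: np.ndarray, threshold: float, min_area: int) -> np.ndarray:
    """Boolean mask of subject-map regions with pixel values >= threshold and
    area (pixel count) >= min_area; the matching input-image region contains the subject."""
    binary = subject_map >= threshold
    labels, n_regions = ndimage.label(binary)        # group pixels above threshold into regions
    mask = np.zeros(subject_map.shape, dtype=bool)
    for region_id in range(1, n_regions + 1):
        region = labels == region_id
        if np.count_nonzero(region) >= min_area:     # keep only sufficiently large regions
            mask |= region
    return mask
```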
- the detection result of the subject area obtained in this way is used for various processes such as performing predetermined image processing on the subject area of the input image.
- the subject area specifying result may be used for, for example, image processing in which the subject area of the input image is displayed at the center of the screen when the input image is displayed as a slide show.
- the subject area specifying unit 27 may perform predetermined processing on the input image using the detection result of the subject area and output the result.
- the image processing apparatus 11 generates a subject map from the input image, and specifies the region of the subject in the input image using the subject map.
- In step S41, the pyramid image generation unit 51 generates pyramid images of the respective layers of levels L1 through L7 based on the luminance image composed of the Y (luminance) component of the supplied input image, and supplies them to the difference calculation unit 52.
- Conventionally, pyramid images of eight layers are generated, whereas the image processing apparatus 11 generates pyramid images of seven layers, so the number of generated pyramid images is reduced by one. Therefore, the luminance information map can be obtained more easily and quickly than before.
- Also, conventionally, a luminance image is generated by obtaining the average value of the R, G, and B components of the input image, whereas the image processing apparatus 11 can obtain a luminance image more easily and quickly by using the Y (luminance) component of the input image as it is. This also eliminates the need for a circuit for generating a luminance image, so the image processing apparatus 11 can be made smaller.
- In step S42, the difference calculation unit 52 generates difference images using the pyramid images supplied from the pyramid image generation unit 51 and supplies them to the weighted addition unit 53.
- For example, the difference calculation unit 52 obtains the differences between the luminance pyramid images of the layer combinations level L2 and level L5, level L2 and level L6, level L3 and level L6, level L3 and level L7, and level L4 and level L7. As a result, a total of five luminance difference images are obtained.
- the pyramid image at level L5 is up-converted according to the size of the pyramid image at level L2.
- the pixel value of one pixel of the level L5 pyramid image before up-conversion is set as the pixel value of several pixels adjacent to each other in the pyramid image of level L5 after up-conversion corresponding to the pixel. Then, the difference between the pixel value of the pixel of the level L5 pyramid image and the pixel value of the pixel of the level L2 pyramid image at the same position as the pixel is obtained, and the difference is set as the pixel value of the pixel of the difference image.
- the process of generating these difference images is equivalent to performing a filter process using a bandpass filter on the luminance image and extracting a predetermined frequency component from the luminance image.
- The pixel value of each pixel of a difference image obtained in this way indicates the difference between the pixel values of the pyramid images of the two levels, that is, the difference between the luminance of a given pixel of the input image and the average luminance around that pixel.
- an area having a large luminance difference from the surroundings in an image is an area that catches the eye of a person who sees the image, so that the area is likely to be a subject area. Therefore, it can be said that in each difference image, a pixel having a larger pixel value is a region that is more likely to be a subject region.
- In step S43, the weighted addition unit 53 generates a luminance information map based on the difference images supplied from the difference calculation unit 52 and supplies it to the normalization unit 54.
- the weighted addition unit 53 weights and adds the five supplied difference images with the difference weight Wa that is a weight for each difference image obtained in advance, and generates a luminance information map. That is, the pixel values of the pixels at the same position in each difference image are multiplied by the difference weight Wa to obtain the sum of the pixel values multiplied by the difference weight Wa.
- the difference images are up-converted so that the difference images have the same size.
- the difference weight Wa is determined in advance by learning.
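- Steps S42 and S43 together can be sketched as follows (an illustration under stated assumptions: the pyramid comes from the earlier sketch, up-conversion is done by pixel replication as described above, and the learned difference weights Wa are taken as given):

```python
import numpy as np

def upconvert_to(small: np.ndarray, shape: tuple) -> np.ndarray:
    """Up-convert by pixel replication: one source pixel fills several adjacent pixels."""
    ry, rx = shape[0] // small.shape[0], shape[1] // small.shape[1]
    return np.repeat(np.repeat(small, ry, axis=0), rx, axis=1)

# Layer combinations for the five luminance difference images (levels L1..L7, 1-based).
PAIRS = [(2, 5), (2, 6), (3, 6), (3, 7), (4, 7)]

def luminance_information_map(pyramid: list, Wa: list) -> np.ndarray:
    """pyramid: the 7 luminance pyramid images, pyramid[0] being level L1;
    Wa: one learned difference weight per difference image."""
    diffs = []
    for fine, coarse in PAIRS:                           # step S42: difference images
        big, small = pyramid[fine - 1], pyramid[coarse - 1]
        diffs.append(big - upconvert_to(small, big.shape))
    target = diffs[0].shape                              # match the largest difference image
    return sum(w * upconvert_to(d, target)               # step S43: weighted addition with Wa
               for w, d in zip(Wa, diffs))
```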
- In addition, the weighted addition unit 53 converts the pixel values of the pixels of the obtained luminance information map using the same conversion table as the one held by the subject map generation unit 26, and supplies the resulting luminance information map to the normalization unit 54.
- the conversion can be performed more easily and quickly by converting the luminance information map using the conversion table.
- In step S44, the normalization unit 54 normalizes the luminance information map from the weighted addition unit 53 and supplies the resulting luminance information map to the subject map generation unit 26 as the final luminance information map.
- When the luminance information map has been output, the luminance information extraction process ends, and the process thereafter proceeds to step S12.
- the normalization unit 54 linearly normalizes the luminance information map. For example, when the pixel value range of the luminance information map is a range from 0 to 200, the pixel value range is set to a range from 0 to 255 by linear normalization.
- the normalization unit 54 obtains the average value of the pixel values of each pixel of the linear normalized luminance information map. That is, the total value of the pixel values of all the pixels in the luminance information map is divided by the number of pixels in the luminance information map to obtain an average value.
- the normalization unit 54 sets a value obtained by subtracting the obtained average value from the pixel value of each pixel of the linearly normalized luminance information map as the pixel value of the pixel of the final luminance information map.
- At this time, if the value obtained by the subtraction is negative, the pixel value of that pixel is set to 0. That is, among the pixels of the linearly normalized luminance information map, the final pixel value of any pixel whose pixel value is equal to or less than the average value is set to 0.
- In this way, by subtracting the average value from the pixel value of each pixel, noise is reliably removed from the luminance information map, because the pixel values of the noise portions are set to 0. Consequently, normalizing by subtracting the average value from the pixel values of the luminance information map does not reduce the detection accuracy of the subject.
- the luminance information map can be normalized more easily and quickly by linearly normalizing the luminance information map and subtracting the average value from the pixel value of the luminance information map after linear normalization.
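- The normalization of step S44 can be sketched as follows (a hedged illustration; the 0-to-255 output range follows the example above):

```python
import numpy as np

def normalize_map(info_map: np.ndarray) -> np.ndarray:
    """Linear normalization to 0..255, then mean subtraction with clamping at 0.
    Pixels at or below the average (typically noise) end up with value 0."""
    lo, hi = info_map.min(), info_map.max()
    if hi == lo:
        return np.zeros_like(info_map, dtype=float)
    linear = (info_map - lo) * 255.0 / (hi - lo)   # e.g. a 0..200 range becomes 0..255
    centered = linear - linear.mean()              # subtract the average pixel value
    return np.maximum(centered, 0.0)               # negative values are set to 0
```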
- That is, in the luminance information extraction unit 21, a noise removal effect equivalent to that obtained when a DOG filter is used can be achieved with simple processing such as linear normalization, average value calculation, and subtraction.
- the normalization unit 54 can normalize the luminance information map more quickly with simpler processing than when the DOG filter is used. In addition, the normalization can more reliably remove noise from the luminance information map, and the detection accuracy of the subject area will not be reduced.
- the luminance information extraction unit 21 generates a luminance information map from the input image. According to the luminance information map thus obtained, it is possible to easily detect a region having a large difference in luminance, that is, a region that is easily noticeable by an observer who looks at the input image.
- In step S71, the pyramid image generation unit 81 and the pyramid image generation unit 82 generate pyramid images of the respective levels L1 through L7 based on the Cr image and the Cb image formed from the color difference components of the supplied input image. That is, processing similar to that described with reference to FIGS. 3 and 4 is performed, and a Cr pyramid image and a Cb pyramid image are generated.
- the pyramid image generation unit 81 and the pyramid image generation unit 82 supply the generated pyramid image to the difference calculation unit 83 and the difference calculation unit 84.
- Since the color information extraction unit 22 also generates pyramid images of seven layers, as in the luminance information extraction unit 21, the color information map can be obtained more easily and quickly than before.
- Conventionally, the difference between the R and G components of the pixels of the input image and the difference between the B and Y (yellow) components are extracted as the color information, so processing for obtaining these differences is required.
- the image processing apparatus 11 can obtain an extracted information image relating to color more easily and quickly by directly using the color difference component of the input image as the Cr image and the Cb image. Further, it is not necessary to provide a circuit for obtaining the difference, and the image processing apparatus 11 can be reduced in size.
- In step S72, the difference calculation unit 83 and the difference calculation unit 84 generate difference images based on the pyramid images supplied from the pyramid image generation unit 81 and the pyramid image generation unit 82, and supply them to the weighted addition unit 85 and the weighted addition unit 86.
- For example, the difference calculation unit 83 obtains the differences between the Cr pyramid images of the layer combinations level L2 and level L5, level L2 and level L6, level L3 and level L6, level L3 and level L7, and level L4 and level L7. As a result, a total of five Cr difference images are obtained.
- the smaller pyramid image is up-converted in accordance with the pyramid image having the larger number of pixels.
- the difference calculation unit 84 also performs the same processing as the difference calculation unit 83 to generate a total of five Cb difference images.
- the process of generating these difference images is equivalent to performing a filtering process using a bandpass filter on the Cr image or Cb image and extracting a predetermined frequency component from the Cr image or Cb image.
- The pixel value of each pixel of a difference image obtained in this way indicates the difference between the pyramid images of the two levels, that is, the difference between the specific color component of a given pixel of the input image and the average of that color component around the pixel.
- Generally, a region of an image whose color stands out from its surroundings, that is, a region where a specific color component differs greatly from its surroundings, catches the eye of a person viewing the image, so such a region is likely to be the subject region. Therefore, it can be said that in each difference image, a pixel with a larger pixel value indicates a region that is more likely to be the subject region.
- In step S73, the weighted addition unit 85 and the weighted addition unit 86 generate a Cr color information map and a Cb color information map based on the difference images supplied from the difference calculation unit 83 and the difference calculation unit 84, and supply them to the normalization unit 87 and the normalization unit 88.
- For example, the weighted addition unit 85 weight-adds the Cr difference images supplied from the difference calculation unit 83 using the difference weight Wa obtained in advance for each difference image to form one Cr color information map. In addition, the weighted addition unit 85 converts the pixel values of the obtained Cr color information map using the same conversion table as the one held by the subject map generation unit 26, and supplies the resulting color information map to the normalization unit 87.
- Similarly, the weighted addition unit 86 weight-adds the Cb difference images supplied from the difference calculation unit 84 using the difference weight Wa obtained in advance to form one Cb color information map, converts it using the conversion table, and supplies it to the normalization unit 88.
- In the weighted addition unit 85 and the weighted addition unit 86 as well, the conversion can be performed more easily and quickly by using the conversion table. Note that when the color information maps are generated, the difference images are up-converted so that they have the same size.
- In step S74, the normalization unit 87 and the normalization unit 88 normalize the color information maps from the weighted addition unit 85 and the weighted addition unit 86, and supply the resulting color information maps to the subject map generation unit 26 as the final color information maps.
- the normalization unit 87 and the normalization unit 88 perform processing similar to the processing in step S44 in FIG. 10 to normalize the Cr color information map and the Cb color information map.
- the color information map can be normalized more easily and quickly by linearly normalizing the color information map and subtracting the average value from the pixel values of the color information map after linear normalization.
- In this way, the color information extraction unit 22 extracts images of specific color components from the input image and generates color information maps from those images. According to the color information maps obtained in this way, a region of the input image whose specific color component is large compared to its surroundings, that is, a region that easily catches the eye of an observer glancing at the input image, can be easily detected.
- In the above, the color information extraction unit 22 has been described as extracting the Cr and Cb components from the input image as the color information, but the difference between the R (red) component and the G (green) component and the difference between the B (blue) component and the Y (yellow) component may be extracted instead.
- In step S111, the edge image generation unit 111 through the edge image generation unit 114 generate, based on the supplied input image, edge images whose pixel values are the edge strengths in the directions of 0 degrees, 45 degrees, 90 degrees, and 135 degrees.
- the edge image generation unit 111 to the edge image generation unit 114 hold the filters illustrated in FIG. 13 in advance, and generate an edge image as an extraction information image using these filters.
- each of filter1, filter2, filter45, and filter135 is one filter.
- For example, the numerical values “−1, −2, −1, 2, 4, 2, −1, −2, −1” in filter1 indicate the coefficients by which the pixels of the input image are multiplied.
- Here, a predetermined direction in the input image, for example the horizontal direction in FIG. 4, is called the x direction, and the direction perpendicular to the x direction, that is, the vertical direction in FIG. 4, is called the y direction.
- The coefficients of filter1 are arranged in the same arrangement as the pixels to be multiplied by them. Therefore, for example, the pixels located at both ends of the pixels arranged in the x direction are multiplied by the coefficient “−1”, and among the pixels arranged in the x direction, the pixel located at the center is multiplied by the coefficient “4”.
- the coefficients of other filters are also arranged in the same arrangement as the pixels multiplied by those coefficients.
- In filter2, for example, the pixel values of eight pixels arranged consecutively in the x direction are multiplied by the coefficients “1”, “3”, “3”, “1”, “1”, “3”, “3”, “1”, respectively, and the sum of the pixel values multiplied by the coefficients is divided by “16”. The value obtained as a result is the pixel value obtained by performing the filter processing using filter2 on the pixel at the center of the eight consecutively arranged pixels (more specifically, the pixel multiplied by the fourth or fifth coefficient “1” from the left in the figure).
- In filter45, pixels in a region consisting of a total of 9 pixels, 3 in the x direction and 3 in the y direction, are used, and the pixel values of those pixels are multiplied by the coefficients “0”, “1”, “2”, “−1”, “0”, “1”, “−2”, “−1”, “0”. The sum of the pixel values multiplied by the coefficients is divided by “8”, and the resulting value is the pixel value obtained by performing the filter processing using filter45 on the pixel located at the center of the region being processed. Therefore, for example, the pixel located at the center of the region being processed is multiplied by the coefficient “0”, and the pixel adjacent to its left is multiplied by the coefficient “−1”.
- Similarly, in filter135, pixels in a region consisting of a total of 9 pixels, 3 in the x direction and 3 in the y direction, are used, and the pixel values of those pixels are multiplied by coefficients such as “2” and “1”. The sum of the pixel values multiplied by the coefficients is divided by “8”, and the resulting value is the pixel value obtained by performing the filter processing using filter135 on the pixel located at the center of the region being processed.
- For example, the edge image generation unit 111 performs filter processing using filter1 on the input image, further performs filter processing using filter2 on the resulting image, and uses the image obtained thereby as the edge image in the 0-degree direction. The edge image generation unit 112 uses the image obtained by performing filter processing using filter45 on the input image as the edge image in the 45-degree direction.
- Similarly, the edge image generation unit 113 performs filter processing using filter2 on the input image, further performs filter processing using filter1 on the resulting image, and uses the obtained image as the edge image in the 90-degree direction. Furthermore, the edge image generation unit 114 uses the image obtained by performing filter processing using filter135 on the input image as the edge image in the 135-degree direction.
- the edge image generation unit 111 to the edge image generation unit 114 generate edge images in respective directions using at least one of filter1, filter2, filter45, and filter135 that are held in advance.
- These filters are filters obtained by approximating a Gabor filter and have characteristics close to those of a Gabor filter.
- the filter processing using these filters is a calculation of weighted addition using a predetermined coefficient, and a complicated operation such as an exponent operation is not necessary for the filter processing.
- Conventionally, a Gabor filter is used to obtain an edge image, whereas the image processing apparatus 11 can obtain edge images more easily and quickly by performing filter processing that combines filter1, filter2, filter45, and filter135.
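- For illustration, a hedged sketch of this coefficient-only filtering follows (the 2-D arrangement of filter1 and the tap values of filter2 are assumptions reconstructed from the figure description; only filter45's 3x3 layout is stated explicitly above):

```python
import numpy as np
from scipy.ndimage import correlate

# Coefficient layouts reconstructed from the description of FIG. 13; the 2-D
# arrangement of filter1 and the 8 taps of filter2 are assumptions, while the
# 3x3 layout of filter45 is stated explicitly in the text above.
FILTER1 = np.array([[-1, -2, -1],
                    [ 2,  4,  2],
                    [-1, -2, -1]], dtype=float)                     # assumed arrangement
FILTER2 = np.array([[1, 3, 3, 1, 1, 3, 3, 1]], dtype=float) / 16.0  # 8 taps along x, sum 16
FILTER45 = np.array([[ 0,  1,  2],
                     [-1,  0,  1],
                     [-2, -1,  0]], dtype=float) / 8.0

def edge_image_0deg(image: np.ndarray) -> np.ndarray:
    """0-degree edge image: filter1 followed by filter2 - weighted additions only,
    with none of the exponential operations a Gabor filter would need."""
    return correlate(correlate(image, FILTER1), FILTER2)

def edge_image_45deg(image: np.ndarray) -> np.ndarray:
    return correlate(image, FILTER45)
```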
- Note that the filters used for generating the edge images are not limited to the example shown in FIG. 13; for example, a combination of a Sobel filter and a Roberts filter may be used. In such a case, for example, the filters shown in FIG. 14 are used.
- each of filter0, filter90, filter45, and filter135 is a single filter.
- the coefficients of each filter are arranged in the same arrangement as the pixels of the input image to be multiplied by these coefficients.
- For example, the numerical values “1, 2, 1, 0, 0, 0, −1, −2, −1” in filter0 indicate the coefficients by which the pixels of the input image are multiplied.
- In filter0, pixels in a region consisting of a total of 9 pixels, 3 in the x direction and 3 in the y direction, are used, and the pixel values of those pixels are multiplied by the coefficients “1”, “2”, “1”, “0”, “0”, “0”, “−1”, “−2”, “−1”. The sum of the pixel values multiplied by the coefficients is divided by “8”, and the resulting value is the pixel value obtained by performing the filter processing using filter0 on the pixel located at the center of the region being processed. Therefore, for example, the pixel located at the center of the region being processed is multiplied by the coefficient “0”, and the pixel adjacent above it is multiplied by the coefficient “2”.
- Similarly, in filter90, pixels in a region consisting of a total of 9 pixels, 3 in the x direction and 3 in the y direction, are used, and the pixel values of those pixels are multiplied by the coefficients “1”, “0”, “−1”, “2”, “0”, “−2”, “1”, “0”, “−1”. The sum of the pixel values multiplied by the coefficients is divided by “8”, and the resulting value is the pixel value obtained by performing the filter processing using filter90 on the pixel located at the center of the region being processed.
- In filter45, pixels in a region consisting of a total of 4 pixels, 2 in the x direction and 2 in the y direction, are used, and the pixel values of those pixels are multiplied by the coefficients “0”, “1”, “−1”, “0”. The sum of the pixel values multiplied by the coefficients is divided by “2”, and the resulting value is the pixel value obtained by performing the filter processing using filter45 on the pixel located at the center of the region being processed (more specifically, the pixel at the upper left in the figure).
- Likewise, in filter135, pixels in a region consisting of a total of 4 pixels, 2 in the x direction and 2 in the y direction, are used, and the pixel values of those pixels are multiplied by the coefficients “1”, “0”, “0”, “−1”. The sum of the pixel values multiplied by the coefficients is divided by “2”, and the resulting value is the pixel value obtained by performing the filter processing using filter135 on the pixel located at the center of the region being processed (more specifically, the pixel at the upper left in the figure).
- In this case, the edge image generation unit 111 through the edge image generation unit 114 perform filter processing using filter0, filter45, filter90, and filter135, respectively, on the input image, and the resulting images are used as the edge images in the directions of 0 degrees, 45 degrees, 90 degrees, and 135 degrees.
- When the edge image generation unit 111 through the edge image generation unit 114 have generated the edge images in the respective directions, they supply the generated edge images to the pyramid image generation unit 115 through the pyramid image generation unit 118.
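- The FIG. 14 variant is fully specified by the coefficients above, so it can be sketched directly (helper names are illustrative):

```python
import numpy as np
from scipy.ndimage import correlate

# Sobel-type 3x3 filters and Roberts-type 2x2 filters, as given for FIG. 14.
FILTER0   = np.array([[ 1,  2,  1], [ 0,  0,  0], [-1, -2, -1]], dtype=float) / 8.0
FILTER90  = np.array([[ 1,  0, -1], [ 2,  0, -2], [ 1,  0, -1]], dtype=float) / 8.0
FILTER45  = np.array([[ 0,  1], [-1,  0]], dtype=float) / 2.0
FILTER135 = np.array([[ 1,  0], [ 0, -1]], dtype=float) / 2.0

def edge_images(image: np.ndarray) -> dict:
    """One edge image per direction; each is a single coefficient-weighted addition pass."""
    return {deg: correlate(image, kernel)
            for deg, kernel in [(0, FILTER0), (45, FILTER45), (90, FILTER90), (135, FILTER135)]}
```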
- In step S112, the pyramid image generation unit 115 through the pyramid image generation unit 118 generate pyramid images based on the edge images from the edge image generation unit 111 through the edge image generation unit 114, and supply them to the difference calculation unit 119 through the difference calculation unit 122.
- In step S113, the difference calculation unit 119 through the difference calculation unit 122 generate difference images based on the pyramid images supplied from the pyramid image generation unit 115 through the pyramid image generation unit 118, and supply them to the weighted addition unit 123 through the weighted addition unit 126.
- For example, the difference calculation unit 119 obtains the differences between the 0-degree-direction pyramid images of the layer combinations level L2 and level L5, level L2 and level L6, level L3 and level L6, level L3 and level L7, and level L4 and level L7. As a result, a total of five difference images are obtained.
- the smaller pyramid image is up-converted in accordance with the pyramid image having the larger number of pixels.
- Similarly, the difference calculation unit 120 through the difference calculation unit 122 each perform the same processing as the difference calculation unit 119 to generate a total of five difference images.
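- A minimal sketch of this difference-image generation, assuming the level pairs listed above and nearest-neighbour up-conversion (the interpolation method is not fixed in this excerpt), might look as follows.

```python
import numpy as np

# Level pairs listed above for the five difference images.
LEVEL_PAIRS = [(2, 5), (2, 6), (3, 6), (3, 7), (4, 7)]

def upconvert(small, shape):
    """Nearest-neighbour up-conversion of a pyramid image to `shape`;
    the patent does not fix the interpolation method."""
    ys = np.arange(shape[0]) * small.shape[0] // shape[0]
    xs = np.arange(shape[1]) * small.shape[1] // shape[1]
    return small[np.ix_(ys, xs)]

def difference_images(pyramid):
    """`pyramid` maps a level number to a 2-D array; returns the five
    difference images (absolute difference is assumed here)."""
    diffs = []
    for hi, lo in LEVEL_PAIRS:
        big = pyramid[hi].astype(float)            # image with more pixels
        up = upconvert(pyramid[lo].astype(float), big.shape)
        diffs.append(np.abs(big - up))
    return diffs
```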
- the process of generating these difference images is equivalent to performing a filter process using a bandpass filter on the edge image and extracting a predetermined frequency component from the edge image.
- the pixel values of the pixels of a difference image obtained in this way indicate the difference in edge strength between the pyramid images of the two levels, that is, the difference between the edge strength at a given position in the input image and the average edge strength around that position.
- Generally, an area having a higher edge strength than its surroundings catches the eye of a person viewing the image, and is therefore highly likely to be a subject area. It can thus be said that in each difference image, a pixel with a larger pixel value indicates a region that is more likely to be a subject region.
- In step S114, the weighted addition unit 123 through the weighted addition unit 126 generate edge information maps in the 0-degree, 45-degree, 90-degree, and 135-degree directions based on the difference images supplied from the difference calculation unit 119 through the difference calculation unit 122.
- For example, the weighted addition unit 123 performs weighted addition of the 0-degree-direction difference images supplied from the difference calculation unit 119 using the difference weights Wa obtained in advance for the individual difference images, yielding a single edge information map in the 0-degree direction. Further, the weighted addition unit 123 converts the pixel values of the obtained edge information map in the 0-degree direction using the same conversion table as the conversion table held by the subject map generation unit 26, and supplies the resulting edge information map to the normalization unit 127.
- Similarly, the weighted addition unit 124 through the weighted addition unit 126 perform weighted addition of the difference images in the respective directions supplied from the difference calculation unit 120 through the difference calculation unit 122 using the difference weights Wa obtained in advance, producing one edge information map each. The weighted addition unit 124 through the weighted addition unit 126 then convert the obtained edge information maps using the conversion table and supply them to the normalization unit 128 through the normalization unit 130.
- Converting the edge information maps using the conversion table in this way allows the conversion to be performed more easily and quickly. Note that when an edge information map is generated, the difference images are up-converted so that they all have the same size.
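- The weighted addition and table conversion could be sketched as follows; the weight values and table contents are learned or prepared elsewhere, so placeholder arguments are assumed.

```python
import numpy as np

def edge_information_map(diff_images, diff_weights, table=None):
    """Weighted addition of difference images (already up-converted to a
    common size) into one edge information map. `diff_weights` stands in
    for the learned difference weights Wa, and `table` for the conversion
    table; both argument shapes are assumptions of this sketch."""
    acc = np.zeros_like(diff_images[0], dtype=float)
    for img, weight in zip(diff_images, diff_weights):
        acc += weight * img                       # weighted addition with Wa
    if table is not None:
        table = np.asarray(table)
        idx = np.clip(np.round(acc), 0, len(table) - 1).astype(int)
        acc = table[idx]                          # table lookup replaces computing f(x)
    return acc
```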
- In step S115, the normalization unit 127 through the normalization unit 130 normalize the edge information maps from the weighted addition unit 123 through the weighted addition unit 126, and supply the resulting edge information maps to the subject map generation unit 26 as the final edge information maps.
- For example, the normalization unit 127 through the normalization unit 130 normalize the edge information maps in the respective directions by performing the same processing as in step S44 of FIG. 10.
- That is, the edge information map is linearly normalized, and the average of the pixel values is subtracted from the linearly normalized edge information map, which allows the edge information map to be normalized more easily and quickly.
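- The normalization described above, linear normalization followed by subtraction of the mean, can be sketched like this (the target range [0, 1] is an assumption of the sketch):

```python
import numpy as np

def normalize_map(info_map):
    """Linear normalization to [0, 1] (the target range is an assumption),
    followed by subtraction of the mean pixel value."""
    info_map = info_map.astype(float)
    span = info_map.max() - info_map.min()
    if span == 0:
        return np.zeros_like(info_map)
    linear = (info_map - info_map.min()) / span
    return linear - linear.mean()   # subtracting the mean suppresses uniform noise
```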
- In the above manner, the edge information extraction unit 23 obtains difference images of edges in specific directions from the input image and generates edge information maps from those difference images. According to the edge information maps obtained for the individual directions, regions of the input image whose edge strength in a specific direction is large compared with their surroundings, that is, regions that easily catch the eye of an observer glancing at the input image, can be detected easily.
- the face detection unit 161 detects a human face region from the supplied input image, and supplies the detection result to the face information map generation unit 162.
- For example, the face detection unit 161 performs filter processing on the input image using a Gabor filter, and detects the face region in the input image by extracting characteristic regions such as the eyes, mouth, and nose of a person from the input image.
- In step S142, the face information map generation unit 162 generates a face information map using the detection result from the face detection unit 161, and supplies it to the normalization unit 163.
- As the face detection result, for example, a plurality of rectangular regions on the input image estimated to include a face (hereinafter referred to as candidate regions) is detected from the input image. Here, a plurality of candidate regions may be detected near one position on the input image, and some of those candidate regions may overlap one another. That is, for example, when a plurality of regions including a face is obtained as candidate regions for a single face region on the input image, parts of those candidate regions overlap one another.
- The face information map generation unit 162 generates, for each candidate region obtained by the face detection, a detection image having the same size as the input image. In each detection image, the pixel values of pixels in the same region as the candidate region being processed are set to values larger than the pixel values of pixels in regions other than the candidate region. Furthermore, a pixel of the detection image has a larger pixel value the more likely the candidate region at the corresponding position is estimated to include a human face.
- The face information map generation unit 162 adds the detection images obtained in this way into a single image, which is used as the face information map. Therefore, on the face information map, the pixel values in regions where parts of a plurality of candidate regions on the input image overlap are large, and a face is more likely to be included there.
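- A sketch of this accumulation of detection images, assuming each candidate region is given as a rectangle with a detector confidence (the exact per-pixel values are not specified in this excerpt):

```python
import numpy as np

def face_information_map(image_shape, candidates):
    """Accumulate one detection image per candidate region.
    `candidates` holds (x, y, width, height, score) tuples; using the
    detector's confidence `score` as the in-rectangle pixel value is an
    assumption of this sketch."""
    fmap = np.zeros(image_shape, dtype=float)
    for x, y, width, height, score in candidates:
        detection = np.zeros(image_shape, dtype=float)
        detection[y:y + height, x:x + width] = score  # larger inside the candidate
        fmap += detection                             # overlaps accumulate
    return fmap
```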
- In step S143, the normalization unit 163 normalizes the face information map supplied from the face information map generation unit 162, and supplies the resulting face information map to the subject map generation unit 26 as the final face information map.
- The normalization unit 163 normalizes the face information map by performing the same processing as the processing in step S44 of FIG. 10.
- the face information extraction unit 24 detects a face from the input image, and generates a face information map from the detection result. According to the face information map obtained in this way, it is possible to easily detect a human face area as a subject in an input image.
- In step S171, the local motion vector extraction unit 191 uses the supplied input image to detect the local motion vector of each pixel of the input image by a gradient method or the like, and supplies the local motion vectors to the difference calculation unit 193.
- In step S172, the global motion vector extraction unit 192 detects the global motion vector using the supplied input image, and supplies it to the difference calculation unit 193.
- In step S173, the difference calculation unit 193 obtains the absolute value of the difference between the local motion vectors from the local motion vector extraction unit 191 and the global motion vector from the global motion vector extraction unit 192 to generate a motion difference image. The difference calculation unit 193 then supplies the generated motion difference image to the normalization unit 194.
- In step S174, the normalization unit 194 generates a motion information map by normalizing the motion difference image supplied from the difference calculation unit 193, and supplies the resulting motion information map to the subject map generation unit 26 as the final motion information map.
- The normalization unit 194 normalizes the motion information map by performing the same processing as the processing in step S44 of FIG. 10.
- the motion information extraction unit 25 detects a motion from the input image and generates a motion information map from the detection result. According to the motion information map obtained in this way, it is possible to easily detect a region of a moving object in the input image.
- a region of a moving object is a region that is easily noticed by an observer who looks at the input image, and is likely to be a subject.
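- The motion difference image of steps S173 and S174 could be computed as in the following sketch; the Euclidean norm of the vector difference is an assumption.

```python
import numpy as np

def motion_information_map(local_vectors, global_vector):
    """`local_vectors` has shape (H, W, 2), one motion vector per pixel;
    `global_vector` is the single motion vector of the whole image.
    Using the Euclidean norm of the difference is an assumption."""
    diff = local_vectors - np.asarray(global_vector)[None, None, :]
    return np.linalg.norm(diff, axis=-1)   # motion difference image
```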
- Each information map is obtained by the luminance information extraction processing through the motion information extraction processing described above, and the subject map is generated from these information maps.
- In these processings, the information map is linearly normalized and the average value is subtracted from the pixel values of the linearly normalized information map, which allows the information map to be normalized more easily and quickly.
- As a result, an information map for specifying the region of the subject on the image can be obtained more easily and quickly. Moreover, since the average value is subtracted from the pixel values of the information map, noise can be removed more reliably by simpler processing.
- The image processing apparatus 11 extracts, from the input image, a plurality of pieces of information estimated to be possessed more by subject regions, and generates the subject map using those pieces of information, thereby detecting the region of the subject from the input image more reliably. The pixel value of the subject map is larger in regions of the input image that an observer glancing at the input image is estimated to look at more closely, so not only human subjects but also general subjects such as animals, plants, and buildings can be detected.
- Such a subject map is generated by extracting information such as luminance, color, edges, faces, and motion from an input image. That is, the difference images obtained from the pyramid images of those extracted pieces of information are weighted and added using the difference weights Wa to form information maps, and those information maps are weighted and added using the information weights Wb. The resulting image (map) is then multiplied by the subject weight Wc to obtain the subject map.
- the difference weight Wa, the information weight Wb, and the subject weight Wc used when the subject map is generated are obtained by learning using a neural network, for example.
- The learning images used for learning these weights are not limited to images of people; if images including general subjects are used, the subject map generated with the weights obtained by the learning can detect such general subjects from images more reliably.
- a subject map is generated using the difference weight Wa, the information weight Wb, and the subject weight Wc to which initial values are given, and a learning image including the subject.
- That is, a difference image An(m) (where 1 ≤ n ≤ N and 1 ≤ m ≤ 6) is generated from a learning image prepared in advance, for each piece of information extracted at subject map generation.
- the difference images An (1) to An (6) are difference images for one piece of information extracted from the learning image.
- the difference image A1 (1) to the difference image A1 (6) are luminance difference images generated by using the luminance pyramid image obtained from the learning image.
- Similarly, for example, the difference images AN(1) to AN(6) are difference images in the 0-degree direction generated using the pyramid images of edges in the 0-degree direction obtained from the learning image.
- FIG. 17 shows an example in which six difference images are obtained for each piece of information extracted from the learning image, but the number of difference images may be any number; in the image processing apparatus 11, for example, the number of difference images is five.
- Hereinafter, the difference weight Wa by which the difference image An(m) is multiplied is also referred to as the difference weight Wan(m).
- Each of the difference images A1(1) to A1(6) is weighted and added using the difference weights Wa1(1) to Wa1(6) for the individual difference images, yielding the information map B1_in.
- Further, the information map B1_in is transformed by the above-described equation (1), that is, by the sigmoid function f(x), and the information map B1_out is obtained as a result. That is, the value f(x) obtained by substituting the pixel value x of a pixel of the information map B1_in into equation (1) becomes the pixel value of the pixel of the information map B1_out at the same position as that pixel.
- The information map B1_out obtained in this way corresponds to an information map generated in the image processing apparatus 11, for example the luminance information map.
- Note that f(x) is not limited to a hyperbolic function and may be any function; for example, in the most ideal case, f(x) is a function that outputs the value "1" when x ≥ 0 and outputs the value "-1" when x < 0.
- Next, the information maps Bn_out (where 1 ≤ n ≤ N) obtained in this way are weighted and added using the information weights Wb for the individual information maps, and the subject map C_in is obtained.
- The subject map C_in is then transformed by the sigmoid function f(x), and the subject map C_out is obtained as a result.
- Furthermore, this subject map C_out is normalized by multiplication by the subject weight Wc, and becomes the final subject map.
- Note that when the subject map C_in is generated, the weighted addition also uses information maps obtained without generating difference images, such as the face information map.
- Hereinafter, the information weight Wb by which the information map Bn_out is multiplied is also referred to as the information weight Wbn.
- The processing of generating the subject map during the learning in this way is called Forward Propagation.
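- Forward Propagation as described above can be summarized in the following sketch; the concrete form f(x) = a * tanh(b * x) stands in for equation (1), which is not reproduced here.

```python
import numpy as np

def f(x, a=1.0, b=1.0):
    """Stand-in for the sigmoid of equation (1); the hyperbolic form
    a * tanh(b * x) is an assumption, as the equation itself is not
    reproduced in this excerpt."""
    return a * np.tanh(b * x)

def forward(diff_images, Wa, Wb, Wc):
    """Forward Propagation: difference images An(m) -> information maps
    Bn_out -> subject map. `diff_images[n][m]` is An(m)."""
    B_out = []
    for An, Wan in zip(diff_images, Wa):
        B_in = sum(w * img for img, w in zip(An, Wan))  # information map Bn_in
        B_out.append(f(B_in))                           # information map Bn_out
    C_in = sum(w * Bn for Bn, w in zip(B_out, Wb))      # subject map C_in
    return Wc * f(C_in)                                 # final subject map
```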
- a process called Back Propagation is performed, and the difference weight Wa, the information weight Wb, and the subject weight Wc are updated.
- In Back Propagation, the generated subject map and an image label, which is prepared in advance for the learning image and is information indicating the region of the subject on the learning image, are used to obtain the weight difference, that is, the amount by which each weight should be increased or decreased.
- Here, the image label is an image of the same size as the learning image in which the pixel values of pixels at the same positions as pixels of the subject region on the learning image are 1, and the pixel values of pixels at the same positions as pixels of the region without the subject on the learning image are 0.
- In equation (2), η indicates a learning rate, which is a predetermined constant, and C_in indicates the subject map C_in. More specifically, C_in in equation (2) is the pixel value of one pixel of the subject map C_in, and the subject weight difference ΔWc is obtained for each pixel. ΔC is the difference of the subject map and is obtained by the following equation (3). From these definitions, equation (2) reads ΔWc = η × C_in × ΔC.
- In equation (3), EV indicates the evaluation map, and f′(C_in) is the value obtained by substituting the subject map C_in into the function obtained by differentiating the sigmoid function f(x); equation (3) therefore reads ΔC = EV × f′(C_in). The function f′(x) obtained by differentiating the function f(x) is, specifically, the function shown in the following equation (4).
- When the subject weight difference ΔWc is obtained in this way, it is added to the previous subject weight Wc, which is updated to give a new subject weight Wc.
- Next, the information weight difference ΔWbn, which is the amount by which the information weight Wbn should be changed, is obtained by the following equation (5).
- In equation (5), η indicates a learning rate, which is a predetermined constant, and Bn_in indicates the information map Bn_in. More specifically, Bn_in in equation (5) is the pixel value of one pixel of the information map Bn_in, and the information weight difference ΔWbn is obtained for each pixel. ΔBn is the difference of the information map and is obtained by the following equation (6); equation (5) therefore reads ΔWbn = η × Bn_in × ΔBn.
- In equation (6), ΔC represents the value obtained by evaluating the above-described equation (3), f′(Bn_in) is the value obtained by substituting the information map Bn_in into the function obtained by differentiating the sigmoid function f(x), and Wc is the updated subject weight Wc; equation (6) therefore reads ΔBn = ΔC × f′(Bn_in) × Wc.
- When the information weight difference ΔWbn for the information map Bn_in is obtained in this way, it is added to the information weight Wbn of the information map Bn_in, which is updated to give a new information weight Wbn.
- Furthermore, using the updated information weight Wbn and the difference image An(m) generated at subject map generation, the difference weight difference ΔWan(m), which is the amount by which the difference weight Wa should be changed, is obtained by the following equation (7).
- ΔWan(m) = η × An(m) × ΔAn(m) … (7)
- In equation (7), η indicates a learning rate, which is a predetermined constant, and An(m) indicates the difference image An(m). More specifically, An(m) in equation (7) is the pixel value of one pixel of the difference image An(m), and the difference weight difference ΔWan(m) is obtained for each pixel. ΔAn(m) is the difference of the difference image and is obtained by the following equation (8).
- In equation (8), ΔBn represents the value obtained by evaluating the above-described equation (6), f′(An(m)) is the value obtained by substituting the difference image An(m) into the function obtained by differentiating the sigmoid function f(x), and Wbn is the updated information weight Wbn; equation (8) therefore reads ΔAn(m) = ΔBn × f′(An(m)) × Wbn.
- When the difference weight difference ΔWan(m) for the difference image An(m) is obtained in this way, it is added to the difference weight Wan(m) of the difference image An(m), which is updated to give a new difference weight Wan(m).
- The updating of the weights described above is repeated until the absolute value of the largest pixel value among the pixels of the evaluation map becomes equal to or less than a predetermined threshold and the weights have been updated a predetermined number of times or more. That is, the weight-updating processing is performed until a subject map that can extract the subject from an image with sufficient accuracy is obtained.
- In the learning, an evaluation map is generated in this way from the subject map generated using the pre-assigned weights and from the image label, and the weight differences, which are the amounts by which the weights should be changed, are obtained by back calculation from the evaluation map.
- Here, since the image label is information indicating the region of the subject on the learning image, it can be said to be information indicating the correct answer for the subject map. Therefore, the evaluation map, which is the difference between the subject map and the image label, indicates the error between the ideal subject map and the subject map generated using the given weights, and back calculation using the evaluation map yields the error between the given weights and the ideal weights.
- The obtained error is the change amount by which the given weight should be changed; adding this change amount to the weight gives the weight that is ideal at the present time. If a subject map is generated using the weights newly obtained in this way, the subject can be detected more reliably from an image by that subject map.
- In the image processing apparatus 11, the difference weight difference ΔWan(m), the information weight difference ΔWbn, and the subject weight difference ΔWc are obtained as the weight change amounts, and the respective weights are updated.
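- The Back Propagation updates of equations (2) through (8), as reconstructed from the descriptions above, might be implemented as in this sketch.

```python
import numpy as np

def f_prime(x, a=1.0, b=1.0):
    """Derivative of the assumed f(x) = a * tanh(b * x)."""
    return a * b * (1.0 - np.tanh(b * x) ** 2)

def backprop_step(EV, C_in, B_in, A, Wc, Wb, Wa, eta=0.01):
    """One Back Propagation pass following equations (2) through (8).
    EV is the evaluation map (subject map minus image label). The text
    obtains the differences per pixel; averaging them into one scalar
    update per weight is an assumption of this sketch."""
    dC = EV * f_prime(C_in)                            # eq. (3)
    Wc = Wc + (eta * C_in * dC).mean()                 # eq. (2)
    for n in range(len(Wb)):
        dB = dC * f_prime(B_in[n]) * Wc                # eq. (6), updated Wc
        Wb[n] = Wb[n] + (eta * B_in[n] * dB).mean()    # eq. (5)
        for m in range(len(Wa[n])):
            dA = dB * f_prime(A[n][m]) * Wb[n]         # eq. (8), updated Wbn
            Wa[n][m] += (eta * A[n][m] * dA).mean()    # eq. (7)
    return Wc, Wb, Wa
```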
- the series of processes described above can be executed by hardware or software.
- When the series of processes is executed by software, the program constituting the software is installed from a program recording medium into a computer incorporated in dedicated hardware or, for example, into a general-purpose personal computer capable of executing various functions when various programs are installed.
- FIG. 19 is a block diagram showing an example of a hardware configuration of a computer that executes the above-described series of processes by a program.
- In the computer, a CPU (Central Processing Unit) 601, a ROM (Read Only Memory) 602, and a RAM (Random Access Memory) 603 are connected to one another by a bus 604.
- An input / output interface 605 is further connected to the bus 604.
- The input/output interface 605 is connected to an input unit 606 including a keyboard, a mouse, and a microphone, an output unit 607 including a display and a speaker, a recording unit 608 including a hard disk and a non-volatile memory, a communication unit 609 including a network interface, and a drive 610 that drives a removable medium 611 such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory.
- In the computer configured as described above, the CPU 601 performs the above-described series of processes by, for example, loading the program recorded in the recording unit 608 into the RAM 603 via the input/output interface 605 and the bus 604 and executing it.
- The program executed by the computer (CPU 601) is recorded on the removable medium 611, which is a package medium including, for example, a magnetic disk (including a flexible disk), an optical disc (such as a CD-ROM (Compact Disc-Read Only Memory) or a DVD (Digital Versatile Disc)), a magneto-optical disc, or a semiconductor memory, or is provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
- the program can be installed in the recording unit 608 via the input / output interface 605 by attaching the removable medium 611 to the drive 610. Further, the program can be received by the communication unit 609 via a wired or wireless transmission medium and installed in the recording unit 608. In addition, the program can be installed in the ROM 602 or the recording unit 608 in advance.
- The program executed by the computer may be a program whose processing is performed in time series in the order described in this specification, or a program whose processing is performed in parallel or at necessary timing, such as when a call is made.
- 11 image processing apparatus, 21 luminance information extraction unit, 22 color information extraction unit, 23 edge information extraction unit, 24 face information extraction unit, 25 motion information extraction unit, 26 subject map generation unit, 53 weighted addition unit, 54 normalization unit, 85 weighted addition unit, 86 weighted addition unit, 87 normalization unit, 88 normalization unit, 123 weighted addition unit, 124 weighted addition unit, 125 weighted addition unit, 126 weighted addition unit, 127 normalization unit, 128 normalization unit, 129 normalization unit, 130 normalization unit
Description
FIG. 1 is a diagram showing a configuration example of an embodiment of an image processing apparatus to which the present invention is applied.
Next, more detailed configurations of the luminance information extraction unit 21 through the motion information extraction unit 25 of FIG. 1 will be described.
FIG. 5 is a block diagram showing a configuration example of the color information extraction unit 22.
FIG. 6 is a block diagram showing a configuration example of the edge information extraction unit 23.
FIG. 7 is a block diagram showing a configuration example of the face information extraction unit 24.
FIG. 8 is a block diagram showing a configuration example of the motion information extraction unit 25.
When an input image is supplied to the image processing apparatus 11, the image processing apparatus 11 starts subject region specifying processing, specifies the region of the subject in the input image, and outputs the specification result. The subject region specifying processing will be described below with reference to the flowchart of FIG. 9.
Next, processing corresponding to each of steps S11 through S15 of FIG. 9 will be described.
Next, the color information extraction processing corresponding to step S12 of FIG. 9 will be described with reference to the flowchart of FIG. 11.
Next, the edge information extraction processing corresponding to step S13 of FIG. 9 will be described with reference to the flowchart of FIG. 12.
Next, the face information extraction processing corresponding to step S14 of FIG. 9 will be described with reference to the flowchart of FIG. 15.
Further, the motion information extraction processing corresponding to step S15 of FIG. 9 will be described with reference to the flowchart of FIG. 16.
The image processing apparatus 11 extracts, from the input image, a plurality of pieces of information estimated to be possessed more by the region of a subject, and generates a subject map using those pieces of information, thereby detecting the region of the subject from the input image more reliably. The pixel values of the subject map are larger in regions of the input image that an observer glancing at the input image is estimated to look at more closely, so subjects can be detected not only when the subject is a person but also when it is a general object such as an animal, a plant, or a building.
Claims (6)
- An image processing apparatus comprising:
extracted-information image generation means for generating, on the basis of an extracted-information image composed of predetermined information extracted from each region of an input image, a plurality of the extracted-information images having mutually different resolutions;
difference image generation means for generating a difference image by obtaining the difference between two predetermined extracted-information images among the plurality of extracted-information images;
information map generation means for generating, by weighted addition of a plurality of the difference images, an information map indicating the feature quantity of a feature possessed by the region of a subject on the input image;
normalization means for normalizing the information map by subtracting, from the value of each region of the information map, the average of the values of the regions of the information map; and
subject map generation means for generating, by weighted addition of the plurality of normalized information maps, a subject map indicating the degree to which each region of the input image is likely to be the region of the subject.
- The image processing apparatus according to claim 1, further comprising edge image generation means for generating, as the extracted-information image, an image indicating the edge strength of each region of the input image by weighted addition of the pixel values of several pixels of the input image using predetermined coefficients.
- The image processing apparatus according to claim 1, wherein the extracted-information image generation means generates the plurality of extracted-information images having mutually different resolutions by using the average of the pixel values of mutually adjacent pixels of the extracted-information image as the pixel value of a pixel of another extracted-information image different from that extracted-information image.
- The image processing apparatus according to claim 1, wherein the input image is an image composed of a luminance component and color-difference components, and the extracted-information image is an image composed of the luminance component or a color-difference component of the input image as the predetermined information.
- An image processing method of an image processing apparatus comprising: extracted-information image generation means for generating, on the basis of an extracted-information image composed of predetermined information extracted from each region of an input image, a plurality of the extracted-information images having mutually different resolutions; difference image generation means for generating a difference image by obtaining the difference between two predetermined extracted-information images among the plurality of extracted-information images; information map generation means for generating, by weighted addition of a plurality of the difference images, an information map indicating the feature quantity of a feature possessed by the region of a subject on the input image; normalization means for normalizing the information map by subtracting, from the value of each region of the information map, the average of the values of the regions of the information map; and subject map generation means for generating, by weighted addition of the plurality of normalized information maps, a subject map indicating the degree to which each region of the input image is likely to be the region of the subject,
the method comprising the steps of:
the extracted-information image generation means generating the extracted-information images from the input image;
the difference image generation means generating the difference image from the plurality of extracted-information images;
the information map generation means generating the information map by weighted addition of the plurality of difference images;
the normalization means normalizing the information map; and
the subject map generation means generating the subject map by weighted addition of the information maps.
- A program for causing a computer to execute processing comprising the steps of:
generating, on the basis of an extracted-information image composed of predetermined information extracted from each region of an input image, a plurality of the extracted-information images having mutually different resolutions;
generating a difference image by obtaining the difference between two predetermined extracted-information images among the plurality of extracted-information images;
generating, by weighted addition of a plurality of the difference images, an information map indicating the feature quantity of a feature possessed by the region of a subject on the input image;
normalizing the information map by subtracting, from the value of each region of the information map, the average of the values of the regions of the information map; and
generating, by weighted addition of the plurality of normalized information maps, a subject map indicating the degree to which each region of the input image is likely to be the region of the subject.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/999,771 US8577137B2 (en) | 2009-05-08 | 2010-04-30 | Image processing apparatus and method, and program |
EP10772165A EP2299403A1 (en) | 2009-05-08 | 2010-04-30 | Image processing device, method, and program |
CN201080002008.1A CN102084396B (zh) | 2009-05-08 | 2010-04-30 | 图像处理设备和方法 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009-113413 | 2009-05-08 | ||
JP2009113413A JP5229575B2 (ja) | 2009-05-08 | 2009-05-08 | 画像処理装置および方法、並びにプログラム |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010128646A1 true WO2010128646A1 (ja) | 2010-11-11 |
Family
ID=43050147
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/057648 WO2010128646A1 (ja) | 2009-05-08 | 2010-04-30 | 画像処理装置および方法、並びにプログラム |
Country Status (8)
Country | Link |
---|---|
US (1) | US8577137B2 (ja) |
EP (1) | EP2299403A1 (ja) |
JP (1) | JP5229575B2 (ja) |
KR (1) | KR20120018267A (ja) |
CN (1) | CN102084396B (ja) |
MY (1) | MY154278A (ja) |
TW (1) | TWI423168B (ja) |
WO (1) | WO2010128646A1 (ja) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011247957A (ja) * | 2010-05-24 | 2011-12-08 | Toshiba Corp | パターン検査方法および半導体装置の製造方法 |
WO2013054160A1 (en) * | 2011-10-11 | 2013-04-18 | Sony Ericsson Mobile Communications Ab | Light sensitive, low height, and high dynamic range camera |
JP5826001B2 (ja) * | 2011-11-30 | 2015-12-02 | キヤノン株式会社 | 画像処理装置、及びその制御方法 |
CN103313049A (zh) * | 2012-03-14 | 2013-09-18 | 富士通株式会社 | 图像压缩方法和装置 |
JP5895720B2 (ja) * | 2012-06-06 | 2016-03-30 | 富士通株式会社 | 被写体追跡装置、被写体追跡方法及び被写体追跡用コンピュータプログラム |
US9518935B2 (en) * | 2013-07-29 | 2016-12-13 | Kla-Tencor Corporation | Monitoring changes in photomask defectivity |
US10805649B2 (en) | 2017-01-04 | 2020-10-13 | Samsung Electronics Co., Ltd. | System and method for blending multiple frames into a single frame |
US10451563B2 (en) | 2017-02-21 | 2019-10-22 | Kla-Tencor Corporation | Inspection of photomasks by comparing two photomasks |
JP7091031B2 (ja) * | 2017-07-27 | 2022-06-27 | サムスン エレクトロニクス カンパニー リミテッド | 撮像装置 |
JP6919539B2 (ja) | 2017-12-06 | 2021-08-18 | 富士通株式会社 | 演算処理装置および演算処理装置の制御方法 |
JP2021005301A (ja) * | 2019-06-27 | 2021-01-14 | 株式会社パスコ | 建物抽出処理装置及びプログラム |
CN110728662B (zh) * | 2019-09-26 | 2022-06-28 | 中国国家铁路集团有限公司 | 轨道类型识别方法及装置 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000149031A (ja) * | 1998-11-09 | 2000-05-30 | Canon Inc | 画像処理装置及び方法並びに記憶媒体 |
JP2008210009A (ja) * | 2007-02-23 | 2008-09-11 | Fujifilm Corp | 画像識別装置,画像識別方法,撮像装置及び撮像方法 |
JP2010055194A (ja) * | 2008-08-26 | 2010-03-11 | Sony Corp | 画像処理装置および方法、学習装置および方法、並びにプログラム |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3362364B2 (ja) * | 1992-07-17 | 2003-01-07 | オムロン株式会社 | ファジイ推論システムおよび方法ならびに前件部処理装置 |
US6005978A (en) * | 1996-02-07 | 1999-12-21 | Cognex Corporation | Robust search for image features across image sequences exhibiting non-uniform changes in brightness |
US6674915B1 (en) * | 1999-10-07 | 2004-01-06 | Sony Corporation | Descriptors adjustment when using steerable pyramid to extract features for content based search |
US6785427B1 (en) * | 2000-09-20 | 2004-08-31 | Arcsoft, Inc. | Image matching using resolution pyramids with geometric constraints |
JP3658761B2 (ja) * | 2000-12-12 | 2005-06-08 | 日本電気株式会社 | 画像検索システムとその画像検索方法、及び画像検索プログラムを記録した記憶媒体 |
US6670963B2 (en) * | 2001-01-17 | 2003-12-30 | Tektronix, Inc. | Visual attention model |
US20020154833A1 (en) * | 2001-03-08 | 2002-10-24 | Christof Koch | Computation of intrinsic perceptual saliency in visual environments, and applications |
DE60218928D1 (de) * | 2001-04-30 | 2007-05-03 | St Microelectronics Pvt Ltd | Effiziente Niedrigleistungsbewegungsschätzung für eine Video-Vollbildsequenz |
US7343028B2 (en) * | 2003-05-19 | 2008-03-11 | Fujifilm Corporation | Method and apparatus for red-eye detection |
WO2004111931A2 (en) * | 2003-06-10 | 2004-12-23 | California Institute Of Technology | A system and method for attentional selection |
JP4277739B2 (ja) * | 2004-06-08 | 2009-06-10 | ソニー株式会社 | 映像デコーダ |
EP1766552A2 (en) * | 2004-06-23 | 2007-03-28 | Strider Labs, Inc. | System and method for 3d object recognition using range and intensity |
CN1296861C (zh) * | 2004-09-10 | 2007-01-24 | 倪蔚民 | 基于图像纹理特征随机度信息的模式识别方法 |
US20090015683A1 (en) * | 2005-03-15 | 2009-01-15 | Omron Corporation | Image processing apparatus, method and program, and recording medium |
US7334901B2 (en) * | 2005-04-22 | 2008-02-26 | Ostendo Technologies, Inc. | Low profile, large screen display using a rear projection array system |
US7426312B2 (en) * | 2005-07-05 | 2008-09-16 | Xerox Corporation | Contrast enhancement of images |
US7623683B2 (en) * | 2006-04-13 | 2009-11-24 | Hewlett-Packard Development Company, L.P. | Combining multiple exposure images to increase dynamic range |
CN101408942B (zh) * | 2008-04-17 | 2011-01-12 | 浙江师范大学 | 一种复杂背景下的车牌定位方法 |
2009
- 2009-05-08 JP JP2009113413A patent/JP5229575B2/ja not_active Expired - Fee Related
2010
- 2010-04-19 TW TW099112223A patent/TWI423168B/zh not_active IP Right Cessation
- 2010-04-30 US US12/999,771 patent/US8577137B2/en not_active Expired - Fee Related
- 2010-04-30 KR KR1020107029726A patent/KR20120018267A/ko not_active Application Discontinuation
- 2010-04-30 WO PCT/JP2010/057648 patent/WO2010128646A1/ja active Application Filing
- 2010-04-30 EP EP10772165A patent/EP2299403A1/en not_active Withdrawn
- 2010-04-30 CN CN201080002008.1A patent/CN102084396B/zh not_active Expired - Fee Related
- 2010-04-30 MY MYPI2010006206A patent/MY154278A/en unknown
Non-Patent Citations (1)
Title |
---|
RICHARD O. DUDA; PETER E. HART; DAVID G. STORK: "Pattern Classification", WILEY-INTERSCIENCE
Also Published As
Publication number | Publication date |
---|---|
US20120121173A1 (en) | 2012-05-17 |
CN102084396B (zh) | 2014-02-05 |
MY154278A (en) | 2015-05-29 |
US8577137B2 (en) | 2013-11-05 |
JP2010262506A (ja) | 2010-11-18 |
TWI423168B (zh) | 2014-01-11 |
TW201044324A (en) | 2010-12-16 |
EP2299403A1 (en) | 2011-03-23 |
JP5229575B2 (ja) | 2013-07-03 |
KR20120018267A (ko) | 2012-03-02 |
CN102084396A (zh) | 2011-06-01 |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | WWE | Wipo information: entry into national phase | Ref document number: 201080002008.1, Country of ref document: CN
 | WWE | Wipo information: entry into national phase | Ref document number: 2010772165, Country of ref document: EP
 | WWE | Wipo information: entry into national phase | Ref document number: 12999771, Country of ref document: US
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10772165, Country of ref document: EP, Kind code of ref document: A1
 | ENP | Entry into the national phase | Ref document number: 20107029726, Country of ref document: KR, Kind code of ref document: A
 | NENP | Non-entry into the national phase | Ref country code: DE