CN108230409B

CN108230409B - Image similarity quantitative analysis method based on multi-factor synthesis of color and content

Info

Publication number: CN108230409B
Application number: CN201810263619.XA
Authority: CN
Inventors: 黄凯; 郭延文; 杜振龙
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2018-03-28
Filing date: 2018-03-28
Publication date: 2020-04-17
Anticipated expiration: 2038-03-28
Also published as: CN108230409A

Abstract

The invention discloses a quantitative analysis method for image similarity based on multi-factor synthesis of color and content, which comprises the following steps: step 1, reading images img1 and img2, and preprocessing; step 2, analyzing color similarity; step 3, analyzing the content similarity; step 4, analyzing the comprehensive similarity of the color and the content; step 5, comprehensive operation: performing comprehensive operation on the results obtained in the steps 2, 3 and 4 to obtain a comprehensive similarity value similarity _ pri of the image; and 6, correcting the comprehensive similarity value similarity _ pri of the image: and (3) finely correcting the similarity _ pri according to the result obtained by the preprocessing in the step 1 to obtain a quantitative similarity value similarity _ val of the last two images.

Description

Image similarity quantitative analysis method based on multi-factor synthesis of color and content

Technical Field

The invention belongs to the fields of computer vision, digital image processing and the like, and particularly relates to a quantitative analysis method for image similarity based on multi-factor synthesis of color and content.

Background

The image similarity calculation mainly comprises the steps of carrying out similarity scoring on any two images and judging the similarity degree between the two images according to the degree of the score. The similarity calculation based on vision is mainly determined according to comprehensive factors such as color, content and scene correlation of images.

Image similarity calculation has become one of indispensable techniques in the field of image processing and computer vision. Industrial vision systems use similarity calculations for images in a variety of applications, such as defect detection, radar image recognition, traffic management, industrial monitoring, face recognition, medical diagnostics, and the like. In addition, many basic researches in the field of computer vision also require the calculation of image similarity, such as image matching, image retrieval, image classification, and the like. Even the similarity value between images can be used as one of the features of data in deep learning, so how to reasonably and accurately calculate the similarity value between images becomes important.

At present, the image similarity calculation based on vision mainly carries out simple histogram statistics on color information of an image, and judges the similarity of the image according to the similarity of the histogram. The SIFT feature points can keep good invariance to rotation, scale scaling and brightness change, and meanwhile, the image brightness change, affine change and noise also have good stability, so that the other similarity calculation for the image mainly utilizes the SIFT features to carry out SIFT matching on the image, and then judges the similarity of the image according to the matching result. Ssim (structural similarity index), the structural similarity index measures the similarity of an image through three comparison modules, namely brightness, contrast and structure, and is mainly used for judging the distortion degree of the image.

In a comprehensive view, the image similarity calculation method based on vision is simple at present, a systematic and reasonable method for quantitatively analyzing the image similarity based on vision does not exist, and a huge space for continuous research and perfection exists.

Disclosure of Invention

The purpose of the invention is as follows: the invention aims to solve the technical problem of reasonably scoring the similarity of any two images.

The technical scheme is as follows: the method is characterized in that similarity scoring is carried out on any two images according to information such as color, content, time, GPS and the like of the images, the final scoring result is a numerical value between 0 and 1, the larger the numerical value is, the more similar the two images are, and the method specifically comprises the following steps:

step 1, reading two images to be compared, img1 and img2, and preprocessing the images by taking img1 as reference images;

step 2, color similarity analysis: carrying out significance region detection on the images img1 and img2, distinguishing the foreground and the background of the images, carrying out interval division on RGB (Red, Green and Blue) color spaces, carrying out pixel-level interval clustering on the images to generate image foreground and background fingerprints, respectively carrying out correlation analysis on the image foreground and background fingerprints, and comprehensively generating a quantitative similarity value similarity _ color based on image color information;

and step 3, content similarity analysis: performing quantitative similarity analysis on the image according to the result obtained by preprocessing in the step 1, performing threshold segmentation on the Y component based on an OTSU method in a YUV color space to obtain a content map of the image, calculating a combined gradient domain combined Grad, and calculating a quantitative similarity value similarity _ contents of the image based on content information according to the content map of the image and the combined gradient domain combined Grad;

and 4, analyzing comprehensive similarity of colors and contents: respectively carrying out interval division on RGB color spaces of the images img1 and img2, carrying out pixel-level interval clustering, respectively generating an image pixel type label map labelMap1 of the image img1 and an image pixel type label map labelMap2 of the image img2, respectively carrying out granularity roughening operation on labelMap1 and labelMap2, and respectively recording the results as labelMap New1 and labelMap New 2; calculating quantitative similarity values of images based on color and content comprehensive information according to labelmapNew1 and labelmapNew 2;

step 5, comprehensive operation: carrying out comprehensive operation on the quantitative similarity value similarity _ color based on the image color information, the quantitative similarity value similarity _ contents based on the content information and the quantitative similarity value similarity _ color contents based on the color and content comprehensive information obtained in the steps 2, 3 and 4 to obtain a comprehensive similarity value similarity _ pri of the image;

and 6, correcting the comprehensive similarity value similarity _ pri of the image: and (3) finely correcting the comprehensive similarity value similarity _ pri of the images according to the result obtained by the preprocessing in the step (1) to obtain the quantitative similarity value similarity _ val of the last two images.

The method comprises the following steps of 1:

reading two images img1 and img2 to be compared in an RGB form, reading an image EXIF (Exchangeable image file, in which various information of the image such as an aperture, a shutter, white balance, ISO, a focal length, date and time, GPS and the like when the image is shot) information, obtaining image time and GPS information, and respectively performing brightness equalization processing on img1 and img2, specifically comprising the following steps:

step 1-1, converting an image RGB space into a YUV (one color space, the same as the RGB space, and belongs to one color space) space, and respectively obtaining the brightness components of two images, wherein the brightness component of img1 is Y1, and the brightness component of img2 is Y2;

step 1-2, respectively calculating the mean values of Y1 and Y2, and respectively recording the mean values as

Calculating the mean difference

Step 1-3, calculate new Y1, Y2 values:

if it is not

Then

If it is not

Then

Where (i, j) represents the position of the pixel unit in the original image, i represents the number of rows, j represents the number of columns, Y1(i, j) represents the pixel value at (i, j) corresponding to the luminance component Y1, and Y2(i, j) represents the pixel value at (i, j) corresponding to the luminance component Y2;

and 1-4, converting the YUV space into an RGB space, and respectively obtaining an image balanceImg1 corresponding to img1 and an image balanceImg2 corresponding to img 2.

The step 2 of the invention comprises the following steps:

step 2-1, downsampling img1 and img2 by using a salient object detection algorithm based on region comparison and carrying out salient region detection to obtain a foreground foregroudlmg 1 and a background backgroudlmg 1 of img1, a foreground foregroudlmg 2 and a background backgroudlmg 2 of img2 (specifically: Cheng, Ming-Ming, et al. "Salientobject detection and segmentation." IEEE Transactions on Pattern Analysis & Machine Analysis 1(2014): 1-1);

step 2-2, calculating fingerprints of foregroundImg1, backploulim 1, foregroundImg2 and backploulim 2, wherein the specific calculation steps are as follows: for the foregroudlimg 1, performing interval division on an RGB color space, averagely dividing each channel of three channels of RGB into 8 intervals, namely, the length of each interval is 32, forming 512 intervals by the three channels, initializing a 512-dimensional vector a, wherein each dimension of the vector a represents one interval; carrying out interval statistical clustering on each pixel of the foregroundImg1, counting the number of each interval pixel as a vector a, wherein the dimension value is obtained, and carrying out normalization processing to obtain the vector a as a foregroundImg1 fingerprint finrprintFore 1; the same calculation steps are carried out on backsgroundimg 1, forkroundimg 2 and backsgroundimg 2 to obtain fingerprint fingerprintBack1 of backsgroundimg 1, fingerprint fingerprintform 2 of forkroundimg 2 and fingerprint fingerprintBack2 of backsgroundimg 2 respectively;

step 2-3, calculating the correlation coefficient between finger printform 1 and finger printform 2 as coeff _ form; calculating the correlation coefficient between finger print Back1 and finger print Back2 as coeff _ back, wherein the correlation coefficient is rho_XYThe calculation formula is as follows:

wherein Cov (X, Y) is the covariance of X and Y, and D (X), D (Y) are the variance of X and the variance of Y, respectively. And finally, based on the image color information similarity value similarity _ color, the similarity _ color is as follows:

similarity_color＝cFore*coeff_fore+cBack*coeff_back，

wherein the content of the first and second substances,

cBack＝1-cFore，S_{foregroundImg1}is the foreground area of Img1, S_img1Is img1 area.

Step 3 of the invention comprises the following steps:

step 3-1, down-sampling the balanceImg1 and the balanceImg2 obtained in the step 1, converting the down-sampled data into YUV color space, respectively obtaining luminance components YContents1 and 2 of the balanceImg2 of the balanceImg1, and carrying out Bilateral filtering processing on YContents1 and YContents2 (reference: https:// en. wikipedia. org/wiki/Bilateral _ filter);

step 3-2, threshold segmentation based on OTSU method (https:// en. wikipedia.org/wiki/Otsu% 27s _ method) is carried out on YConsets 1 and YConsets 2 to obtain content map of YConsets 1 as fullContents1 and content map of YConsets 2 as fullContents2, and content difference map contentsDiff is calculated by the following formula:

contentsDiff＝fullContents1⊙fullContents2；

step 3-3, calculating gradient domains of YContents1 and YContents2 to obtain gradient maps grad1 and grad2 of YContents1 and YContents2 respectively, and calculating a combined gradient domain combinegrade by using the following formula:

combineGrad＝grad1.*grad2，

denotes dot product operations between matrices;

normalized treatment of combineGrad (reference: https:// baike. ***. com/item/% E5% BD% 92% E4% B8% 80% E5% 8C% 96% E6% 96% B9% E6% B3% 95/10089118);

step 3-4, calculating similarity value similarity _ contents based on the image content information, wherein the specific formula is as follows:

step 4 of the invention comprises the following steps:

step 4-1, downsampling balance img1 and balance img2 to the same size and performing pixel level clustering: the method comprises the steps of carrying out interval division on an RGB color space, averagely dividing each channel of three channels of RGB into 2 intervals, namely, the length of each interval is 128, forming 8 intervals by the three channels, and representing one label and 8 labels by each interval; label labeling is carried out on each pixel in the balance img1 and the balance img2 to form a label image, an image pixel class label image of the generated image img1 is labeled label map1, and an image pixel class label image of the image img2 is labeled label map 2;

step 4-2, fine to coarse grain size roughening operations were performed on labelMap1, labelMap2, respectively: calculating all communicated components of the label graph, finding out the label with the largest adjacent area of all the communicated components with the area smaller than one thousandth of the area of the label graph, marking the label as label Max, and marking all the labels of the communicated components as label Max; finally, the result of fine to coarse grain size roughening operation by labelMap1 is recorded as labelMap new1, and the result of fine to coarse grain size roughening operation by labelMap2 is recorded as labelMap new 2;

step 4-3, defining that if the label values of the corresponding pixel positions in labelMapNew1 and labelMapNew2 are equal, the pixels are overlapped, and the similarity value similarity _ colorContents based on the color and content comprehensive information is as follows:

wherein S is_coverThe overlapping area of labelMapNew1 and labelMapNew2, S_labelMapNew1Is the area of labelMapNew 1.

The step 5 of the invention comprises the following steps:

calculating an image comprehensive similarity value similarity _ pri by the following formula:

wherein, a is similarity _ color, b is similarity _ contents, c is similarity _ color contents, max () is maximum value function, min () is minimum value function, mid is median function, mean () is mean function.

Step 6 of the invention comprises the following steps:

step 6-1, calculating the time correlation coeff according to the image time information obtained in step 1_timeThe calculation formula is as follows:

coeff_time＝max(time_effect*time_coeff+(1-time_effect)*coeff_pri,coeff_pri)，

wherein, time_c、time_coeffRepresenting an intermediate variable, detalT is the shooting time interval of two images, taking seconds as a unit, and e is the base number of a natural logarithm; time_const＝86400，time_effect＝0.2，coeff_pri＝ similarity_pri；

Step 6-2, calculating the GPS correlation coeff according to the image GPS information obtained in the step 1_gpsThe calculation formula is as follows:

coeff_gps＝max(gps_effect*gps_coeff+(1-gps_effect)*coeff_pri,coeff_pri)，

wherein gps_c、gps_coeffRepresenting an intermediate variable, and the detalGPS is the shooting place distance of the two images and takes meters as a unit; gps_const＝5000，gps_effect＝0.2,coeff_pri＝similarity_pri；

And 6-3, calculating the final similarity value similarity _ val of the two images, wherein the calculation formula is as follows:

the invention has the beneficial effects that:

(1) the invention relates to an image similarity calculation method, which can effectively and reasonably calculate the similarity value of any two images.

(2) The method greatly improves the operation speed through the downsampling strategy without influencing the final result, and is an efficient image similarity calculation method.

Drawings

The foregoing and other advantages of the invention will become more apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.

FIG. 1 is a basic flow diagram of the process of the present invention.

Fig. 2a is a reference image of the first group of images fig. 2b and fig. 2 c.

Fig. 2b is a similarity comparison image having a similarity value of 0.83 with respect to fig. 2 a.

Fig. 2c is a similarity comparison image having a similarity value of 0.58 with respect to fig. 2 a.

Fig. 3a is a reference image of the second group of images, like fig. 3b, fig. 3 c.

Fig. 3b is a similarity comparison image having a similarity value of 0.97 with respect to fig. 3 a.

Fig. 3c is a similarity comparison image having a similarity value of 0.53 with respect to fig. 3 a.

Fig. 4a is a reference image of the third group of images fig. 4b, 4 c.

Fig. 4b is a similarity comparison image having a similarity value of 0.51 with respect to fig. 4 a.

Fig. 4c is a similarity comparison image having a similarity value of 0.76 with respect to fig. 4 a.

Fig. 5a shows a fourth set of reference images like fig. 5b and 5 c.

Fig. 5b is a similarity comparison image having a similarity value of 0.84 with respect to fig. 5 a.

Fig. 5c is a similarity comparison image having a similarity value of 0.74 with respect to fig. 5 a.

Detailed Description

The invention is further explained below with reference to the drawings and the embodiments.

The flow chart of the method is shown in figure 1, the method of the invention scores similarity of any two images mainly according to information such as color, content, time, GPS and the like of the images, the final scoring result is a numerical value between 0 and 1, the larger the numerical value is, the more similar the two images are, and the method specifically comprises the following steps:

step 2, analyzing color similarity: carrying out significance region detection on the images img1 and img2, distinguishing the foreground and the background of the images, carrying out interval division on RGB (Red, Green and Blue) color spaces, carrying out pixel-level interval clustering on the images to generate image foreground and background fingerprints, respectively carrying out correlation analysis on the image foreground and background fingerprints, and comprehensively generating a quantitative similarity value similarity _ color based on image color information;

The method comprises the following steps of 1:

Calculating the mean difference

Step 1-3, calculate new Y1, Y2 values:

if it is not

Then

If it is not

Then

The step 2 of the invention comprises the following steps:

similarity_color＝cFore*coeff_fore+cBack*coeff_back，

wherein the content of the first and second substances,

Step 3 of the invention comprises the following steps:

contentsDiff＝fullContents1⊙fullContents2；

combineGrad＝grad1.*grad2，

step 4 of the invention comprises the following steps:

The step 5 of the invention comprises the following steps:

Step 6 of the invention comprises the following steps:

coeff_time＝max(time_effect*time_coeff+(1-time_effect)*coeff_pri,coeff_pri)，

coeff_gps＝max(gps_effect*gps_coeff+(1-gps_effect)*coeff_pri,coeff_pri)，

the experimental development and operating environment of this example is: intel core i 5-45903.30 GHz quad CPU (processor), NVIDIA GeForce GTX 760GPU (display card), 8G memory, software environment Microsoft Visual Studio2013 (software development tool) and OpenCV 2.4.11 (computer vision open source library).

In this embodiment, four sets of images are outputted, where fig. 2a, 2b, and 2c are the first set, fig. 3a, 3b, and 3c are the second set, fig. 4a, 4b, and 4c are the third set, and fig. 5a, 5b, and 5c are the fourth set. Fig. 2a is a reference image of the first group of images fig. 2b and fig. 2 c. Fig. 2b is a similarity comparison image having a similarity value of 0.83 with respect to fig. 2 a. Fig. 2c is a similarity comparison image having a similarity value of 0.58 with respect to fig. 2 a.

Fig. 3a is a reference image of the second group of images, like fig. 3b, fig. 3 c. Fig. 3b is a similarity comparison image having a similarity value of 0.97 with respect to fig. 3 a. Fig. 3c is a similarity comparison image having a similarity value of 0.53 with respect to fig. 3 a.

Fig. 4a is a reference image of the third group of images fig. 4b, 4 c. Fig. 4b is a similarity comparison image having a similarity value of 0.51 with respect to fig. 4 a. Fig. 4c is a similarity comparison image having a similarity value of 0.76 with respect to fig. 4 a.

Fig. 5a shows a fourth set of reference images like fig. 5b and 5 c. Fig. 5b is a similarity comparison image having a similarity value of 0.84 with respect to fig. 5 a. Fig. 5c is a similarity comparison image having a similarity value of 0.74 with respect to fig. 5 a.

As can be seen from the figure, the invention can obtain better analysis results for any two images.

The present invention provides a quantitative analysis method for image similarity based on multi-factor synthesis of color and content, and a plurality of methods and ways for implementing the technical solution, and the above description is only a preferred embodiment of the present invention, it should be noted that, for those skilled in the art, a plurality of improvements and modifications can be made without departing from the principle of the present invention, and these improvements and modifications should also be regarded as the protection scope of the present invention. All the components not specified in the present embodiment can be realized by the prior art.

Claims

1. The image similarity quantitative analysis method based on multi-factor synthesis of color and content is characterized by comprising the following steps of:

step 2, color similarity analysis: respectively carrying out significance region detection on the images img1 and img2, distinguishing image foreground and background, carrying out interval division on RGB color space, carrying out pixel-level interval clustering on the images to generate image foreground and background fingerprints, respectively carrying out correlation analysis on the image foreground and background fingerprints, and comprehensively generating a quantitative similarity value similarity _ color based on image color information;

and step 3, content similarity analysis: performing quantitative similarity analysis on the image according to the result obtained by preprocessing in the step 1, performing threshold segmentation on the Y component based on an OTSU method in a YUV color space to obtain a content map of the image, calculating a joint gradient domain combined Grad, and calculating a quantitative similarity value similarity _ contents of the image based on content information according to the content map of the image and the joint gradient domain combined Grad;

and 6, correcting the comprehensive similarity value similarity _ pri of the image: according to the result obtained by the preprocessing in the step 1, finely correcting the comprehensive similarity value similarity _ pri of the images to obtain the quantitative similarity value similarity _ val of the last two images;

the step 1 comprises the following steps:

reading two images to be compared img1 and img2 in an RGB form, reading EXIF information of the images, obtaining image time and GPS information, and respectively carrying out brightness equalization processing on img1 and img2, wherein the method comprises the following specific steps:

step 1-1, converting an image RGB space into a YUV space, and respectively obtaining the brightness components of two images, wherein the brightness component of img1 is Y1, and the brightness component of img2 is Y2;

Calculating the mean difference

Step 1-3, calculate new Y1, Y2 values:

if it is not

Then

Y2＝Y2，

If it is not

Then

Y1＝Y1，

step 1-4, converting the YUV space into an RGB space, and respectively obtaining an image balanceImg1 corresponding to img1 and an image balanceImg2 corresponding to img 2;

the step 2 comprises the following steps:

step 2-1, downsampling img1 and img2 by using a salient object detection algorithm based on region comparison and carrying out salient region detection to obtain a foreground forego dImg1 and a background Img1 of img1, a foreground forego dImg2 and a background Img2 of img 2;

step 2-2, calculating fingerprints of foregroundImg1, backploulim 1, foregroundImg2 and backploulim 2, wherein the specific calculation steps are as follows: for the forego undimg1, performing interval division on an RGB color space, averagely dividing each channel of three channels of RGB into 8 intervals, namely, each interval is 32 in length, forming 512 combinations by eight intervals of each channel of the three channels, covering a part of the color space by each combination, mutually excluding the color spaces, covering the whole RGB color space by a union set of the eight intervals, initializing a 512-dimensional vector a, wherein each dimension corresponds to or marks one of the combinations, and each dimension is one interval; carrying out interval statistical clustering on each pixel of the foregroundImg1, and counting the number of the pixels of the foregroundImg1 in each interval, wherein the numerical value of the pixels is used as the value of the corresponding dimension of the vector a; normalizing the vector a to obtain a vector a1 as a forego and dImg1 fingerprint finger printFore 1; the same calculation steps are carried out on backsgroundimg 1, forkroundimg 2 and backsgroundimg 2 to obtain fingerprint fingerprintBack1 of backsgroundimg 1, fingerprint fingerprintform 2 of forkroundimg 2 and fingerprint fingerprintBack2 of backsgroundimg 2 respectively;

step 2-3, calculating the correlation coefficient between finger printform 1 and finger printform 2 as coeff _ form; calculating a correlation coefficient between finger print back1 and finger print back2 as coeff _ back, and finally calculating similarity _ color based on image color information as follows:

similarity_color＝cFore*coeff_fore+cBack*coeff_back，

wherein the content of the first and second substances,

cBack＝1-cFore，S_{foregroundImg1}is the foreground area of Img1, S_img1Img1 area;

the step 3 comprises the following steps:

step 3-1, down-sampling and converting the balance img1 and the balance img2 obtained in the step 1 into YUV color space, respectively obtaining a luminance component YContents1 of the balance img1 and a luminance component YContents2 of the balance img2, and carrying out bilateral filtering processing on the YContents1 and the YContents 2;

step 3-2, threshold segmentation based on OTSU method is carried out on YContents1 and YContents2 to obtain content graphs of YContents1 and YContents2 respectively as fullContents1 and fullContents2, and content difference graphs contentsDiff are calculated by using the following formula:

contentsDiff＝fullContents1⊙fullContents2；

combineGrad＝grad1.*grad2，

carrying out normalization processing on the combineGrad;

step 4 comprises the following steps:

step 4-1, downsampling balance img1 and balance img2 to the same size and performing pixel level clustering: the method comprises the following steps of (1) carrying out interval division on an RGB color space, averagely dividing each channel of three channels of RGB into 2 intervals, wherein the three channels have 8 combinations, and each combination is marked by one label and has 8 labels; label labeling is carried out on each pixel in the balance img1 and the balance img2 to form a label image, namely label map1 and label map2, an image pixel class label image of the generated image img1 is labeled as label map1, and an image pixel class label image of the image img2 is labeled as label map 2;

step 4-2, performing grain size roughening operation on labelMap1 and labelMap2 respectively: calculating all communicated components of the label graph, finding out the label with the largest adjacent area of all the communicated components with the area smaller than one thousandth of the area of the label graph, marking the label as label Max, and marking all the labels of the communicated components as label Max; finally, the results of the particle size roughening operations performed by labelMap1 were recorded as labelMap new1, and the results of the particle size roughening operations performed by labelMap2 were recorded as labelMap new 2;

wherein S is_coverOverlapped by labelMapNew1 and labelMapNew2Area, S_labelMapNew1Is the area of labelMapNew 1.

2. The method of claim 1, wherein step 5 comprises the steps of:

3. The method of claim 2, wherein step 6 comprises the steps of:

coeff_time＝max(time_effect*time_coeff+(1-time_effect)*coeff_pri，coeff_pri)，

the digital camera comprises a camera, a camera module and a camera module, wherein the camera module is used for shooting two images, and the camera module is used for shooting two images; time_const＝86400，time_effect＝0.2，coeff_pri＝similarity_pri；

Step 6-2, calculating 6PS correlation coeff according to the image GPS information obtained in the step 1_gpsThe calculation formula is as follows:

coeff_gps＝max(gps_effect*gps_coeff+(1-gps_effect)*coeff_pri，coeff_pri)，

the digital GPS is used for acquiring the shooting distance of two images, wherein the digital GPS is used for acquiring the shooting distance of the two images in meters; gps_const＝5000，gps_effect＝0.2，coeff_pri＝similarity_pri；