CN111507970B - Image fusion quality detection method and device - Google Patents

Image fusion quality detection method and device

Info

Publication number
CN111507970B
Authority
CN
China
Prior art keywords: image, fusion, fused, wavelet, coefficient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010311554.9A
Other languages
Chinese (zh)
Other versions
CN111507970A (en)
Inventor
苑贵全
骞一凡
朱冬
杨易
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seven Teng Robot Co ltd
Original Assignee
Chongqing Qiteng Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Qiteng Technology Co Ltd filed Critical Chongqing Qiteng Technology Co Ltd
Priority to CN202010311554.9A priority Critical patent/CN111507970B/en
Publication of CN111507970A publication Critical patent/CN111507970A/en
Application granted granted Critical
Publication of CN111507970B publication Critical patent/CN111507970B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T 7/0002 Image analysis: inspection of images, e.g. flaw detection
    • G06N 3/045 Neural networks: combinations of networks
    • G06N 3/047 Neural networks: probabilistic or stochastic networks
    • G06T 5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 7/11 Region-based segmentation
    • G06T 7/136 Segmentation; edge detection involving thresholding
    • G06T 2207/10016 Video; image sequence
    • G06T 2207/20064 Wavelet transform [DWT]
    • G06T 2207/20076 Probabilistic image processing
    • G06T 2207/20221 Image fusion; image merging
    • G06T 2207/30168 Image quality inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image fusion quality detection method and device. The method comprises the steps of searching each frame of a video image for the image frames containing the tracked person; performing image fusion on a plurality of tracked person images according to a wavelet transform image fusion method, a contour wavelet fusion method and a scale invariant feature transform image fusion method respectively; and calculating the average gradient of each fusion result and judging the image fusion quality according to the average gradients. With the image fusion quality detection method provided by the application, the quality of image fusion is detected by comparing the average gradients of three different image fusion modes, so that the tracked person can be positioned more accurately.

Description

Image fusion quality detection method and device
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for detecting image fusion quality.
Background
In video security monitoring, tracking and positioning people is one of the most important problems. However, when identifying a person, the person is often blocked by objects, which makes it difficult to recognize the person's position.
In the prior art, person tracking is generally realized by searching the video for one or more pictures that best show the person's face and manually inferring the facial features of the photographed person. However, manual inference often fails to accurately capture the situation of the photographed person, causing person tracking to fail. The manual identification mode is therefore increasingly abandoned, and various automatic image identification methods have been proposed; however, the quality of the detected images remains a key condition for person tracking, so a method capable of detecting image quality is urgently needed to inform the tracker of the accuracy of the person tracking.
Disclosure of Invention
The application provides an image fusion quality detection method, which comprises the following steps:
searching the image frame of the tracked person from each frame of the video image;
respectively carrying out image fusion on a plurality of tracked person images according to a wavelet transform image fusion method, a contour wavelet fusion method and a scale invariant feature transform image fusion method;
and respectively calculating average gradients according to the fusion results, and judging the image fusion quality according to the average gradients.
The image fusion quality detection method as described above, wherein the image frame of the tracked object is searched from each frame of the video image, specifically:
constructing a deep convolutional neural network model;
from the input layer, sequentially passing through a first convolution layer, a first depth convolution layer, a second convolution layer, a second depth convolution layer, a third convolution layer and a third depth convolution layer;
and inputting the output image into the global average pooling layer and the full connection layer to reach a softmax layer, outputting the occurrence probability of the tracked person by the softmax layer, and if the output probability is 1, determining the image frame as the image frame of the tracked person.
The image fusion quality detection method as described above, further includes preprocessing the tracked object image after the tracked object image is found, so as to eliminate or reduce noise in the image.
The image fusion quality detection method described above, wherein the image fusion is performed on the preprocessed multiple tracked person images according to a wavelet transform image fusion method, specifically includes the following sub-steps:
decomposing each tracked person image by using a discrete wavelet transform function to obtain a source image;
fusing wavelet coefficients corresponding to the source images based on a modulus maximum fusion algorithm to obtain fused images;
and performing wavelet inverse transformation on the fused image to obtain an image fusion result based on wavelet transformation.
The image fusion quality detection method described above, wherein image fusion is performed on a plurality of preprocessed tracked person images according to a contour wavelet fusion method, specifically includes the following sub-steps:
decomposing each tracked person image by using an edge contour transformation function to obtain a source image, and decomposing the source image to obtain a contour wavelet coefficient;
comparing the high-frequency coefficient in the contour wavelet coefficient obtained by decomposition, and taking the maximum value of the high-frequency coefficient as the high-frequency coefficient of the fused image;
calculating the mean value of low-frequency coefficients in the contour wavelet coefficients obtained by decomposition, and taking the mean value of the low-frequency coefficients as the low-frequency coefficients of the fused image;
and forming the low-frequency coefficient and the high-frequency coefficient of the fused image into a coefficient of the fused image, and performing contour wavelet fusion inverse transformation on the coefficient of the fused image to obtain an image fusion result based on a contour wavelet fusion method.
The method for detecting image fusion quality as described above, wherein the image fusion of the preprocessed images of the plurality of tracked objects is performed according to a scale-invariant feature transform image fusion method, and the method specifically includes the following sub-steps:
carrying out linear filtering on the two tracked person images to obtain contrast, direction and brightness characteristic saliency maps of the two tracked person images, and solving intersection of the contrast, direction and brightness characteristic saliency maps to obtain a visual saliency area, a unique saliency area and a public saliency area;
determining a fusion coefficient of the fusion image according to the low-frequency components of the visual salient region, the unique salient region and the common salient region;
and performing multi-scale inverse transformation on the fusion coefficient by using a multi-scale fusion algorithm to reconstruct a fusion image.
The application also provides an image fusion quality detection device, including:
the tracked person image searching module is used for searching the tracked person image frame from each frame of the video image;
the image fusion module is used for respectively carrying out image fusion on a plurality of tracked person images according to a wavelet transform image fusion method, a contour wavelet fusion method and a scale invariant feature transform image fusion method;
and the fusion image quality detection module based on the average gradient is used for respectively calculating the average gradient according to the fusion result and judging the image fusion quality according to the average gradient.
The image fusion quality detection device as described above, wherein the tracked person image searching module is specifically configured to construct a deep convolutional neural network model; from the input layer, sequentially passing through a first convolution layer, a first depth convolution layer, a second convolution layer, a second depth convolution layer, a third convolution layer and a third depth convolution layer; and inputting the output image into the global average pooling layer and the full connection layer to reach a softmax layer, outputting the occurrence probability of the tracked person by the softmax layer, and if the output probability is 1, determining the image frame as the image frame of the tracked person.
The image fusion quality detection device comprises an image fusion module, a tracking module and a processing module, wherein the image fusion module comprises a wavelet transform image fusion submodule, and the wavelet transform image fusion submodule is specifically used for decomposing each tracked image by using a discrete wavelet transform function to obtain a source image; fusing wavelet coefficients corresponding to the source images based on a modulus maximum fusion algorithm to obtain fused images; and performing wavelet inverse transformation on the fused image to obtain an image fusion result based on wavelet transformation.
The image fusion quality detection device as described above, wherein the image fusion module performs image fusion on a plurality of tracked person images according to a contour wavelet fusion method, and is specifically configured to decompose each tracked person image with an edge contour transformation function to obtain a source image, and decompose the source image to obtain a contour wavelet coefficient; comparing the high-frequency coefficient in the contour wavelet coefficient obtained by decomposition, and taking the maximum value of the high-frequency coefficient as the high-frequency coefficient of the fused image; calculating the mean value of low-frequency coefficients in the contour wavelet coefficients obtained by decomposition, and taking the mean value of the low-frequency coefficients as the low-frequency coefficients of the fused image; and forming the low-frequency coefficient and the high-frequency coefficient of the fused image into a coefficient of the fused image, and performing contour wavelet fusion inverse transformation on the coefficient of the fused image to obtain an image fusion result based on a contour wavelet fusion method.
The image fusion quality detection device described above, wherein in the image fusion module, image fusion is performed on a plurality of tracked person images according to a scale-invariant feature transform image fusion method, and is specifically configured to perform linear filtering on two tracked person images to obtain contrast, direction, and brightness feature saliency maps thereof, and solve an intersection of the contrast, direction, and brightness feature saliency maps to obtain a visual saliency region, a unique saliency region, and a common saliency region; determining a fusion coefficient of the fusion image according to the low-frequency components of the visual salient region, the unique salient region and the common salient region; and performing multi-scale inverse transformation on the fusion coefficient by using a multi-scale fusion algorithm to reconstruct a fusion image.
The beneficial effect that this application realized is as follows: by adopting the image fusion quality detection method provided by the application, the quality of image fusion is detected by calculating the average gradient of three different image fusion modes, and the tracked person is better positioned.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below; it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
Fig. 1 is a flowchart of an image fusion quality detection method according to an embodiment of the present application;
FIG. 2 is a flowchart of an image fusion method for pre-processing a plurality of images of a tracked person according to a wavelet transform image fusion method;
FIG. 3 is a flow chart of an image fusion method for pre-processing a plurality of tracked images according to a contour wavelet fusion method;
FIG. 4 is a flowchart of an image fusion method for pre-processing a plurality of images of a tracked person according to a scale-invariant feature transform image fusion method;
fig. 5 is a schematic diagram of an image fusion quality detection apparatus according to the second embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
An embodiment of the present application provides an image fusion quality detection method, as shown in fig. 1, including:
step 110, searching the image frame of the tracked person from each frame of the video image;
because the real-time requirement on tracking the person is high, a deep convolutional neural network model with a high calculation speed is adopted for the image judgment;

the method for searching the image frame of the tracked person from each frame of the video image specifically comprises: constructing a deep convolutional neural network model; starting from the input layer, the image sequentially passes through convolution layer C1 (output image size [256, 256, 8]), depth convolution layer D1 (output image size [128, 128, 16]), convolution layer C2 (output image size [64, 64, 32]), depth convolution layer D2 (output image size [32, 32, 64]), convolution layer C3 (output image size [16, 16, 128]) and depth convolution layer D3 (output image size [8, 8, 256]); the output then passes through a global average pooling layer and a fully connected layer and finally reaches a softmax layer, and whether the tracked person appears in the image is judged according to the output result of the softmax layer;

the softmax layer outputs the probability that the tracked person appears: if the softmax output value is 1, the tracked person appears in the frame image, and if the softmax output value is 0, the tracked person does not appear in the frame image.
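For illustration only, the following is a minimal sketch of such a network in Python with Keras; the framework, kernel sizes, strides and input resolution are assumptions of this sketch, and only the C1/D1/C2/D2/C3/D3 ordering, the approximate output sizes and the global-average-pooling, fully-connected and softmax head follow the description above:

from tensorflow.keras import layers, models

def build_tracked_person_detector(input_shape=(512, 512, 3)):
    # Hypothetical sketch: layer order and output sizes follow the text,
    # everything else (kernels, strides, activations, input size) is assumed.
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(8, 3, strides=2, padding="same", activation="relu")(inputs)   # C1 -> 256x256x8
    x = layers.DepthwiseConv2D(3, strides=2, padding="same",
                               depth_multiplier=2, activation="relu")(x)            # D1 -> 128x128x16
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)        # C2 -> 64x64x32
    x = layers.DepthwiseConv2D(3, strides=2, padding="same",
                               depth_multiplier=2, activation="relu")(x)            # D2 -> 32x32x64
    x = layers.Conv2D(128, 3, strides=2, padding="same", activation="relu")(x)       # C3 -> 16x16x128
    x = layers.DepthwiseConv2D(3, strides=2, padding="same",
                               depth_multiplier=2, activation="relu")(x)            # D3 -> 8x8x256
    x = layers.GlobalAveragePooling2D()(x)                                           # global average pooling
    x = layers.Dense(64, activation="relu")(x)                                       # fully connected layer
    outputs = layers.Dense(2, activation="softmax")(x)                               # softmax: [absent, present]
    return models.Model(inputs, outputs)

# A frame is treated as a tracked-person frame when the softmax probability
# of the "present" class is (close to) 1.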
Step 120, preprocessing the image frame of the tracked person;
specifically, due to the influence of the acquisition conditions, illumination and other factors, the images carry noise, and therefore, the images of the tracked persons need to be preprocessed to eliminate or reduce the noise in the images.
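As an illustration, one possible preprocessing step is a simple smoothing filter; the use of OpenCV and of a Gaussian filter here is an assumption of this sketch, since the application does not fix a particular denoising method:

import cv2

def preprocess(frame_bgr):
    # Convert to grayscale and suppress acquisition/illumination noise.
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.GaussianBlur(gray, (3, 3), 0)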
Step 130, respectively carrying out image fusion on the preprocessed images of the plurality of tracked persons according to a wavelet transform image fusion method, a contour wavelet fusion method and a scale invariant feature transform image fusion method;
in the embodiment of the application, image fusion is respectively carried out based on a wavelet transform image fusion method, a scale invariant feature transform image fusion method and a contour wavelet fusion method to obtain image fusion results under different methods, and the quality of image fusion is judged according to the comparison of the image fusion results under different methods;
the image fusion of the preprocessed multiple tracked person images according to the wavelet transform image fusion method, as shown in fig. 2, specifically includes the following sub-steps:
step 210, decomposing each tracked person image by using a db1 discrete wavelet transform function to obtain a source image;
step 220, fusing wavelet coefficients corresponding to the source image based on a modulus maximum fusion algorithm to obtain a fused image;
specifically, the wavelet coefficients are fused by the following formula:

i_wave(x, y) = MAX{w1(x, y), w2(x, y), w3(x, y), …, wi(x, y)}

wherein x represents the row index and y the column index of the wavelet coefficients w1, w2, …, wi; wi(x, y) denotes the value of the wavelet coefficient wi at row x and column y; i_wave(x, y) represents the value of the fused wavelet coefficient i_wave at row x and column y; and i is the total number of source images.
And step 230, performing wavelet inverse transformation on the fused image to obtain an image fusion result based on wavelet transformation.
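For illustration, a minimal sketch of this wavelet-transform fusion in Python with PyWavelets follows; the library, the decomposition level and the interpretation of the modulus-maximum rule as "take the coefficient with the largest magnitude at each position" are assumptions of this sketch, while the db1 wavelet comes from step 210:

import numpy as np
import pywt

def _modulus_max(arrays):
    # At every position, keep the coefficient with the largest magnitude.
    stack = np.stack(arrays)                          # shape: (num_images, H, W)
    idx = np.argmax(np.abs(stack), axis=0)            # winning image per position
    return np.take_along_axis(stack, idx[None], axis=0)[0]

def fuse_wavelet(images, wavelet="db1", level=2):
    # images: equally sized grayscale source images of the tracked person
    decomps = [pywt.wavedec2(img.astype(np.float64), wavelet, level=level)
               for img in images]
    fused = []
    for band in zip(*decomps):                        # same sub-band of every decomposition
        if isinstance(band[0], tuple):                # detail bands: (cH, cV, cD)
            fused.append(tuple(_modulus_max(c) for c in zip(*band)))
        else:                                         # approximation band
            fused.append(_modulus_max(band))
    return pywt.waverec2(fused, wavelet)              # inverse wavelet transform -> fused image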
The image fusion of the preprocessed images of the plurality of tracked persons is performed according to a contour wavelet fusion method, as shown in fig. 3, and specifically includes the following sub-steps:
step 310, decomposing each tracked person image by using an edge contour transformation function to obtain a source image, and decomposing the source image to obtain a contour wavelet coefficient;
optionally, each source image is decomposed into three layers, with the highest layer decomposed in eight directions and the next-highest layer in four directions, so as to obtain the decomposed contour wavelet coefficients Li = {low-frequency coefficient cli, high-frequency coefficient chi}, wherein i is the total number of source images.
Step 320, comparing high-frequency coefficients in the contour wavelet coefficients obtained by decomposition, and taking the maximum value of the high-frequency coefficients as the high-frequency coefficients of the fused image;
specifically, the high-frequency coefficient maximum value is obtained by the following formula:
freH(x, y) = MAX{ch1(x, y), ch2(x, y), …, chi(x, y)};

wherein x represents the row index and y the column index of the high-frequency coefficients ch1, ch2, …, chi; chi(x, y) represents the value of the high-frequency coefficient chi at row x and column y; freH(x, y) represents the value of the fused high-frequency coefficient freH at row x and column y; and i is the total number of source images.
Step 330, calculating the mean value of the low-frequency coefficients in the contour wavelet coefficients obtained by decomposition, and taking the mean value of the low-frequency coefficients as the low-frequency coefficients of the fused image;
specifically, the mean value of the low-frequency coefficients is calculated by:

freL(x, y) = ( cl1(x, y) + cl2(x, y) + … + cli(x, y) ) / i

wherein x represents the row index and y the column index of the low-frequency coefficients cl1, cl2, …, cli; cli(x, y) represents the value of the low-frequency coefficient cli at row x and column y; freL(x, y) represents the value of the fused low-frequency coefficient freL at row x and column y; and i is the total number of source images.
And 340, forming the low-frequency coefficient and the high-frequency coefficient of the fused image into a coefficient of the fused image, and performing contour wavelet fusion inverse transformation on the coefficient of the fused image to obtain an image fusion result based on a contour wavelet fusion method.
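The contour wavelet (contourlet) transform itself is not available in common Python libraries, so the following sketch only illustrates the coefficient fusion rules of steps 320 to 340 (maximum for the high-frequency coefficients, mean for the low-frequency coefficients); the transform and its inverse are left out, and all names are hypothetical:

import numpy as np

def fuse_contour_coefficients(low_coeffs, high_coeffs):
    # low_coeffs:  list of low-frequency coefficient arrays cl1 ... cli
    # high_coeffs: list of high-frequency coefficient arrays ch1 ... chi
    #              (one array per source image for a given sub-band)
    low = [np.asarray(c, dtype=np.float64) for c in low_coeffs]
    high = [np.asarray(c, dtype=np.float64) for c in high_coeffs]
    freH = np.maximum.reduce(high)   # step 320: freH(x, y) = MAX{ch1(x, y), ..., chi(x, y)}
    freL = np.mean(low, axis=0)      # step 330: freL(x, y) = mean of cl1(x, y) ... cli(x, y)
    # step 340: (freL, freH) would then be passed to the inverse contour wavelet transform.
    return freL, freH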
The image fusion of the preprocessed multiple tracked person images is performed according to a scale-invariant feature transformation image fusion method, as shown in fig. 4, the method specifically includes the following sub-steps:
step 410, performing linear filtering on the two tracked person images to obtain contrast, direction and brightness characteristic saliency maps of the two tracked person images, and solving intersection of the contrast, direction and brightness characteristic saliency maps to obtain a visual saliency area, a unique saliency area and a public saliency area;
the contrast characteristic saliency map is obtained by filtering a source image by using a Gaussian pyramid, then performing a layer-by-layer difference solving method on a filtering result to obtain contrast characteristic saliency point distribution, and applying an entropy threshold segmentation method to the characteristic saliency point distribution; the direction characteristic saliency map is specifically that a filter is utilized to filter a source image in multiple directions, filtering results are added to obtain direction characteristic point distribution of the source image, and then an entropy threshold segmentation method is applied to the direction characteristic point distribution to generate the direction characteristic saliency map; the brightness characteristic saliency map is specifically a brightness characteristic saliency map of a source image generated by smoothing the source image by using an average filter to eliminate noise and gray level abrupt change influence and then applying an entropy threshold segmentation method to the smoothed image.
Step 420, determining a fusion coefficient of the fusion image according to the low-frequency components of the visual salient region, the unique salient region and the public salient region;
specifically, for each point in the low-frequency components obtained from the two source images: if the unique salient region of a certain source image is 1 at that point, the low-frequency coefficient of that source image is taken as the low-frequency coefficient of the fused image; if the point corresponds to the public salient region, the mean value of the low-frequency coefficients of the two source images is taken as the low-frequency coefficient of the fused image; and if the point does not belong to any salient region, the neighborhood variance of the two images is calculated, and since a larger variance indicates that the region of the source image containing the point has richer detail, the low-frequency coefficient of the source image with the larger variance at that point is taken as the low-frequency coefficient of the fused image.
And 430, performing multi-scale inverse transformation on the fusion coefficient by using a multi-scale fusion algorithm to reconstruct a fusion image.
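A minimal sketch of the low-frequency fusion rule of step 420 follows; it assumes the unique and public salient regions are already available as binary masks and that the neighborhood variance is computed over a small window, and the window size, the SciPy dependency and the function names are assumptions of this sketch:

import numpy as np
from scipy.ndimage import uniform_filter

def _local_variance(img, size=3):
    # Variance of each pixel's size x size neighborhood: E[x^2] - (E[x])^2.
    img = img.astype(np.float64)
    mean = uniform_filter(img, size)
    mean_sq = uniform_filter(img * img, size)
    return mean_sq - mean * mean

def fuse_low_frequency(low_a, low_b, unique_a, unique_b, common):
    # low_a, low_b:        low-frequency components of the two source images
    # unique_a, unique_b:  binary masks of each image's unique salient region
    # common:              binary mask of the public (common) salient region
    var_a, var_b = _local_variance(low_a), _local_variance(low_b)
    fused = np.where(var_a >= var_b, low_a, low_b)               # no salient region: richer neighborhood wins
    fused = np.where(common == 1, (low_a + low_b) / 2.0, fused)  # public salient region: mean
    fused = np.where(unique_a == 1, low_a, fused)                # unique to image A: take A
    fused = np.where(unique_b == 1, low_b, fused)                # unique to image B: take B
    return fused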
Referring back to fig. 1, step 140, respectively calculating an average gradient according to the fusion result, and determining the image fusion quality according to the average gradient;
specifically, the average gradient of the fused image results after fusion by the three methods is calculated by the following formula:

average gradient = (1 / (M × N)) × Σ_{i=1}^{M} Σ_{j=1}^{N} sqrt( [ (∂f(m_i, n_j)/∂m_i)^2 + (∂f(m_i, n_j)/∂n_j)^2 ] / 2 )

wherein M and N are the total numbers of rows and columns of the image, f is the fused image function, f(m_i, n_j) is the image point of the ith row and the jth column, ∂f(m_i, n_j)/∂m_i is the derivative of that image point in the row direction, and ∂f(m_i, n_j)/∂n_j is the derivative of that image point in the column direction.
the larger the calculated average gradient value is, the more image layers are, the clearer the image is, and the fused image with the largest average gradient value is correspondingly used as the fused image with the optimal quality, namely, the fusion method corresponding to the fused image is determined to have the optimal person tracking effect in the video image.
Example two
An embodiment of the present application provides an image fusion quality detection apparatus, as shown in fig. 5, including:
a tracked person image searching module 510, configured to search the tracked person image frame from each frame of the video image;
the image fusion module 520 is used for respectively carrying out image fusion on a plurality of tracked person images according to a wavelet transform image fusion method, a contour wavelet fusion method and a scale invariant feature transform image fusion method;
and an average gradient-based fusion image quality detection module 530, configured to calculate average gradients according to the fusion results, and determine image fusion quality according to the average gradients.
The tracked person image searching module 510 is specifically configured to construct a deep convolutional neural network model; from the input layer, sequentially passing through a first convolution layer, a first depth convolution layer, a second convolution layer, a second depth convolution layer, a third convolution layer and a third depth convolution layer; and inputting the output image into the global average pooling layer and the full connection layer to reach a softmax layer, outputting the occurrence probability of the tracked person by the softmax layer, and if the output probability is 1, determining the image frame as the image frame of the tracked person.
Further, the image fusion module 520 includes a wavelet transform image fusion sub-module 521, a contour wavelet fusion sub-module 522 and a scale-invariant feature transform image fusion sub-module 523;
the wavelet transform image fusion submodule 521 is specifically configured to decompose each tracked person image by using a discrete wavelet transform function to obtain a source image; fusing wavelet coefficients corresponding to the source images based on a modulus maximum fusion algorithm to obtain fused images; and performing wavelet inverse transformation on the fused image to obtain an image fusion result based on wavelet transformation.
The contour wavelet fusion submodule 522 is specifically configured to decompose each tracked person image by using an edge contour transformation function to obtain a source image, and decompose the source image to obtain a contour wavelet coefficient; comparing the high-frequency coefficient in the contour wavelet coefficient obtained by decomposition, and taking the maximum value of the high-frequency coefficient as the high-frequency coefficient of the fused image; calculating the mean value of low-frequency coefficients in the contour wavelet coefficients obtained by decomposition, and taking the mean value of the low-frequency coefficients as the low-frequency coefficients of the fused image; and forming the low-frequency coefficient and the high-frequency coefficient of the fused image into a coefficient of the fused image, and performing contour wavelet fusion inverse transformation on the coefficient of the fused image to obtain an image fusion result based on a contour wavelet fusion method.
The scale-invariant feature transformation image fusion submodule 523 is specifically configured to perform linear filtering on two images of a tracked person to obtain a contrast feature saliency map, a direction feature saliency map and a brightness feature saliency map of the tracked person, and solve an intersection of the contrast feature saliency map, the direction feature saliency map and the brightness feature saliency map to obtain a visual saliency region, a unique saliency region and a common saliency region; determining a fusion coefficient of the fusion image according to the low-frequency components of the visual salient region, the unique salient region and the common salient region; and performing multi-scale inverse transformation on the fusion coefficient by using a multi-scale fusion algorithm to reconstruct a fusion image.
The above-mentioned embodiments are only specific embodiments of the present application, used to illustrate the technical solutions of the present application rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can, within the technical scope disclosed in the present application, still modify the technical solutions described in the foregoing embodiments or easily conceive of changes, or make equivalent substitutions of some technical features therein; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are all intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. An image fusion quality detection method is characterized by comprising the following steps:
searching a tracked image frame from each frame of the video image;
respectively carrying out image fusion on a plurality of tracked person images according to a wavelet transform image fusion method, a contour wavelet fusion method and a scale invariant feature transform image fusion method;
respectively calculating average gradients according to the fusion results, and judging the image fusion quality according to the average gradients;
the average gradient of the fused image results after the three methods are fused is calculated by the following formula:
Figure FDA0003093665080000011
wherein M and N are the total number of rows and columns of the image, f (M and N) is a fused image function, and f (M and N) isi,nj) Is the image point of the ith row and the jth column,
Figure FDA0003093665080000012
is the derivative of the image point in the ith row and jth column in the ith row direction,
Figure FDA0003093665080000013
the derivative of the image point of the ith row and the jth column in the jth row direction;
the larger the calculated average gradient value, the more levels of detail the image contains and the clearer the image is; the fused image with the largest average gradient value is correspondingly used as the fused image with the optimal quality, namely the fusion method corresponding to that fused image is determined to have the optimal person tracking effect in the video image;
the method for fusing the images of the plurality of the tracked persons after the preprocessing is performed according to the scale-invariant feature transformation image fusion method specifically comprises the following substeps:
carrying out linear filtering on the two tracked person images to obtain contrast, direction and brightness characteristic saliency maps of the two tracked person images, and solving intersection of the contrast, direction and brightness characteristic saliency maps to obtain a visual saliency area, a unique saliency area and a public saliency area; the contrast characteristic saliency map is obtained by filtering a source image by using a Gaussian pyramid, then performing a layer-by-layer difference solving method on a filtering result to obtain contrast characteristic saliency point distribution, and applying an entropy threshold segmentation method to the characteristic saliency point distribution; the direction characteristic saliency map is specifically that a filter is utilized to filter a source image in multiple directions, filtering results are added to obtain direction characteristic point distribution of the source image, and then an entropy threshold segmentation method is applied to the direction characteristic point distribution to generate the direction characteristic saliency map; the brightness characteristic saliency map is specifically a brightness characteristic saliency map of a source image generated by smoothing the source image by using an average filter to eliminate noise and gray level abrupt change influence and then applying an entropy threshold segmentation method to the smoothed image;
determining a fusion coefficient of the fusion image according to the low-frequency components of the visual salient region, the unique salient region and the common salient region; specifically, for each point in a low-frequency component obtained by dividing two source images, if a unique salient region of a certain source image corresponding to the point is 1, determining that the fused image is a low-frequency coefficient corresponding to the source image, if the point corresponds to a public salient region, taking the mean value of the low-frequency coefficients of the two source images as the low-frequency coefficient of the fused image, if the point does not belong to any salient region, calculating the neighborhood variance of the two images, wherein the larger the variance is, the richer the source image belongs to the region of the point is, and taking the low-frequency coefficient of the source image corresponding to the point as the low-frequency coefficient of the fused image;
and performing multi-scale inverse transformation on the fusion coefficient by using a multi-scale fusion algorithm to reconstruct a fusion image.
2. The method according to claim 1, wherein the image frame of the tracked object is searched from each frame of the video image, and the method comprises the following steps:
constructing a deep convolutional neural network model;
from the input layer, sequentially passing through a first convolution layer, a first depth convolution layer, a second convolution layer, a second depth convolution layer, a third convolution layer and a third depth convolution layer;
and inputting the output image into the global average pooling layer and the full connection layer to reach a softmax layer, outputting the occurrence probability of the tracked person by the softmax layer, and if the output probability is 1, determining the image frame as the image frame of the tracked person.
3. The image fusion quality detection method according to claim 1, wherein the image fusion is performed on the preprocessed images of the plurality of tracked persons according to a wavelet transform image fusion method, and the image fusion quality detection method specifically comprises the following sub-steps:
decomposing each tracked person image by using a discrete wavelet transform function to obtain a source image;
fusing wavelet coefficients corresponding to the source images based on a modulus maximum fusion algorithm to obtain fused images;
and performing wavelet inverse transformation on the fused image to obtain an image fusion result based on wavelet transformation.
4. The image fusion quality detection method according to claim 1, wherein the image fusion is performed on the preprocessed images of the plurality of tracked persons according to a contour wavelet fusion method, and the image fusion quality detection method specifically comprises the following sub-steps:
decomposing each tracked person image by using an edge contour transformation function to obtain a source image, and decomposing the source image to obtain a contour wavelet coefficient;
comparing the high-frequency coefficient in the contour wavelet coefficient obtained by decomposition, and taking the maximum value of the high-frequency coefficient as the high-frequency coefficient of the fused image;
calculating the mean value of low-frequency coefficients in the contour wavelet coefficients obtained by decomposition, and taking the mean value of the low-frequency coefficients as the low-frequency coefficients of the fused image;
and forming the low-frequency coefficient and the high-frequency coefficient of the fused image into a coefficient of the fused image, and performing contour wavelet fusion inverse transformation on the coefficient of the fused image to obtain an image fusion result based on a contour wavelet fusion method.
5. An image fusion quality detection apparatus, comprising:
the tracked person image searching module is used for searching the tracked person image frame from each frame of the video image;
the image fusion module is used for respectively carrying out image fusion on a plurality of tracked person images according to a wavelet transform image fusion method, a contour wavelet fusion method and a scale invariant feature transform image fusion method;
the fusion image quality detection module based on the average gradient is used for respectively calculating the average gradient according to the fusion result and judging the image fusion quality according to the average gradient;
the average gradient of the fused image results after fusion by the three methods is calculated by the following formula:

average gradient = (1 / (M × N)) × Σ_{i=1}^{M} Σ_{j=1}^{N} sqrt( [ (∂f(m_i, n_j)/∂m_i)^2 + (∂f(m_i, n_j)/∂n_j)^2 ] / 2 )

wherein M and N are the total numbers of rows and columns of the image, f is the fused image function, f(m_i, n_j) is the image point of the ith row and the jth column, ∂f(m_i, n_j)/∂m_i is the derivative of that image point in the row direction, and ∂f(m_i, n_j)/∂n_j is the derivative of that image point in the column direction;
the larger the calculated average gradient value, the more levels of detail the image contains and the clearer the image is; the fused image with the largest average gradient value is correspondingly used as the fused image with the optimal quality, namely the fusion method corresponding to that fused image is determined to have the optimal person tracking effect in the video image;
the method for fusing the images of the plurality of the tracked persons after the preprocessing is performed according to the scale-invariant feature transformation image fusion method specifically comprises the following substeps:
carrying out linear filtering on the two tracked person images to obtain contrast, direction and brightness characteristic saliency maps of the two tracked person images, and solving intersection of the contrast, direction and brightness characteristic saliency maps to obtain a visual saliency area, a unique saliency area and a public saliency area; the contrast characteristic saliency map is obtained by filtering a source image by using a Gaussian pyramid, then performing a layer-by-layer difference solving method on a filtering result to obtain contrast characteristic saliency point distribution, and applying an entropy threshold segmentation method to the characteristic saliency point distribution; the direction characteristic saliency map is specifically that a filter is utilized to filter a source image in multiple directions, filtering results are added to obtain direction characteristic point distribution of the source image, and then an entropy threshold segmentation method is applied to the direction characteristic point distribution to generate the direction characteristic saliency map; the brightness characteristic saliency map is specifically a brightness characteristic saliency map of a source image generated by smoothing the source image by using an average filter to eliminate noise and gray level abrupt change influence and then applying an entropy threshold segmentation method to the smoothed image;
determining a fusion coefficient of the fusion image according to the low-frequency components of the visual salient region, the unique salient region and the common salient region; specifically, for each point in a low-frequency component obtained by dividing two source images, if a unique salient region of a certain source image corresponding to the point is 1, determining that the fused image is a low-frequency coefficient corresponding to the source image, if the point corresponds to a public salient region, taking the mean value of the low-frequency coefficients of the two source images as the low-frequency coefficient of the fused image, if the point does not belong to any salient region, calculating the neighborhood variance of the two images, wherein the larger the variance is, the richer the source image belongs to the region of the point is, and taking the low-frequency coefficient of the source image corresponding to the point as the low-frequency coefficient of the fused image;
and performing multi-scale inverse transformation on the fusion coefficient by using a multi-scale fusion algorithm to reconstruct a fusion image.
6. The image fusion quality detection apparatus of claim 5, wherein the tracked person image searching module is specifically configured to construct a deep convolutional neural network model; from the input layer, sequentially passing through a first convolution layer, a first depth convolution layer, a second convolution layer, a second depth convolution layer, a third convolution layer and a third depth convolution layer; and inputting the output image into the global average pooling layer and the full connection layer to reach a softmax layer, outputting the occurrence probability of the tracked person by the softmax layer, and if the output probability is 1, determining the image frame as the image frame of the tracked person.
7. The image fusion quality detection apparatus of claim 5, wherein the image fusion module comprises a wavelet transform image fusion sub-module, and the wavelet transform image fusion sub-module is specifically configured to decompose each tracked image by a discrete wavelet transform function to obtain a source image; fusing wavelet coefficients corresponding to the source images based on a modulus maximum fusion algorithm to obtain fused images; and performing wavelet inverse transformation on the fused image to obtain an image fusion result based on wavelet transformation.
8. The image fusion quality detection device of claim 5, wherein the image fusion module comprises a contour wavelet fusion sub-module, and the contour wavelet fusion sub-module is specifically configured to decompose each tracked person image with an edge contour transformation function to obtain a source image, and decompose the source image to obtain a contour wavelet coefficient; comparing the high-frequency coefficient in the contour wavelet coefficient obtained by decomposition, and taking the maximum value of the high-frequency coefficient as the high-frequency coefficient of the fused image; calculating the mean value of low-frequency coefficients in the contour wavelet coefficients obtained by decomposition, and taking the mean value of the low-frequency coefficients as the low-frequency coefficients of the fused image; and forming the low-frequency coefficient and the high-frequency coefficient of the fused image into a coefficient of the fused image, and performing contour wavelet fusion inverse transformation on the coefficient of the fused image to obtain an image fusion result based on a contour wavelet fusion method.
CN202010311554.9A 2020-04-20 2020-04-20 Image fusion quality detection method and device Active CN111507970B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010311554.9A CN111507970B (en) 2020-04-20 2020-04-20 Image fusion quality detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010311554.9A CN111507970B (en) 2020-04-20 2020-04-20 Image fusion quality detection method and device

Publications (2)

Publication Number Publication Date
CN111507970A CN111507970A (en) 2020-08-07
CN111507970B true CN111507970B (en) 2022-01-11

Family

ID=71869729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010311554.9A Active CN111507970B (en) 2020-04-20 2020-04-20 Image fusion quality detection method and device

Country Status (1)

Country Link
CN (1) CN111507970B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102208103A (en) * 2011-04-08 2011-10-05 东南大学 Method of image rapid fusion and evaluation
CN103745203A (en) * 2014-01-15 2014-04-23 南京理工大学 Visual attention and mean shift-based target detection and tracking method
CN105761214A (en) * 2016-01-14 2016-07-13 西安电子科技大学 Remote sensing image fusion method based on contourlet transform and guided filter
CN106649487A (en) * 2016-10-09 2017-05-10 苏州大学 Image retrieval method based on interest target
CN106897999A (en) * 2017-02-27 2017-06-27 江南大学 Apple image fusion method based on Scale invariant features transform
CN107330854A (en) * 2017-06-15 2017-11-07 武汉大学 A kind of image super-resolution Enhancement Method based on new type formwork
CN107680054A (en) * 2017-09-26 2018-02-09 长春理工大学 Multisource image anastomosing method under haze environment
CN108564088A (en) * 2018-04-17 2018-09-21 广东工业大学 Licence plate recognition method, device, equipment and readable storage medium storing program for executing

Also Published As

Publication number Publication date
CN111507970A (en) 2020-08-07

Similar Documents

Publication Publication Date Title
CN104680508B (en) Convolutional neural networks and the target object detection method based on convolutional neural networks
CN103886589B (en) Object-oriented automated high-precision edge extracting method
CN112819772B (en) High-precision rapid pattern detection and recognition method
Kamencay et al. Improved Depth Map Estimation from Stereo Images Based on Hybrid Method.
CN101794439B (en) Image splicing method based on edge classification information
CN110929593A (en) Real-time significance pedestrian detection method based on detail distinguishing and distinguishing
CN111160291B (en) Human eye detection method based on depth information and CNN
CN104504395A (en) Method and system for achieving classification of pedestrians and vehicles based on neural network
CN112396036A (en) Method for re-identifying blocked pedestrians by combining space transformation network and multi-scale feature extraction
CN111507968B (en) Image fusion quality detection method and device
Choudhary et al. Enhancement in morphological mean filter for image denoising using glcm algorithm
CN116311212B (en) Ship number identification method and device based on high-speed camera and in motion state
Favorskaya et al. Intelligent inpainting system for texture reconstruction in videos with text removal
CN111507970B (en) Image fusion quality detection method and device
Widynski et al. A contrario edge detection with edgelets
CN111507969B (en) Image fusion quality detection method and device
Scharfenberger et al. Image saliency detection via multi-scale statistical non-redundancy modeling
Varkonyi-Koczy Fuzzy logic supported corner detection
Zeng et al. A new texture feature based on PCA pattern maps and its application to image retrieval
Swarnalatha et al. A centroid model for the depth assessment of images using rough fuzzy set techniques
Lyasheva et al. Application of image weight models to increase canny contour detector resilience to interference
Liu Restoration method of motion blurred image based on feature fusion and particle swarm optimization algorithm
Punia et al. Automatic detection of liver in CT images using optimal feature based neural network
Lyasheva et al. Image Borders Detection Based on the Weight Model Analysis and the Morphological Gradient Operation
CN110992285B (en) Image defogging method based on hierarchical neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Yuan Guiquan

Inventor after: Qian Yifan

Inventor after: Zhu Dong

Inventor after: Yang Yi

Inventor before: Yuan Guiquan

Inventor before: Qian Yifan

TA01 Transfer of patent application right

Effective date of registration: 20211223

Address after: 401122 No. 21-1, building 7, No. 2, Huizhu Road, Yubei District, Chongqing

Applicant after: Chongqing QiTeng Technology Co.,Ltd.

Address before: 102400 no.18-d11961, Jianshe Road, Kaixuan street, Liangxiang, Fangshan District, Beijing

Applicant before: Beijing yingmaiqi Technology Co.,Ltd.

GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 401122 No. 21-1, building 7, No. 2, Huizhu Road, Yubei District, Chongqing

Patentee after: Seven Teng Robot Co.,Ltd.

Address before: 401122 No. 21-1, building 7, No. 2, Huizhu Road, Yubei District, Chongqing

Patentee before: Chongqing QiTeng Technology Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A Method and Device for Detecting Image Fusion Quality

Effective date of registration: 20230810

Granted publication date: 20220111

Pledgee: Chongqing Yuzhong Sub branch of China Construction Bank Corp.

Pledgor: Seven Teng Robot Co.,Ltd.

Registration number: Y2023980051686
