CN111047604B - Transparency mask extraction method and device for high-definition image and storage medium - Google Patents

Transparency mask extraction method and device for high-definition image and storage medium

Info

Publication number
CN111047604B
CN111047604B (application CN201911203685.9A)
Authority
CN
China
Prior art keywords: region, pixel, node, value, foreground
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911203685.9A
Other languages
Chinese (zh)
Other versions
CN111047604A (en)
Inventor
冯夫健
王林
黄翰
谭棉
刘爽
魏嘉银
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Minzu University
Original Assignee
Guizhou Minzu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Minzu University filed Critical Guizhou Minzu University
Priority to CN201911203685.9A priority Critical patent/CN111047604B/en
Publication of CN111047604A publication Critical patent/CN111047604A/en
Application granted granted Critical
Publication of CN111047604B publication Critical patent/CN111047604B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T7/11 — Physics; Computing; Image data processing; Image analysis; Segmentation; Region-based segmentation
    • G06T7/136 — Physics; Computing; Image data processing; Image analysis; Segmentation; Edge detection involving thresholding
    • G06T7/90 — Physics; Computing; Image data processing; Image analysis; Determination of colour characteristics
    • Y02T10/40 — Climate change mitigation technologies related to transportation; Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a transparency mask extraction method, device and storage medium for high-definition images. The method comprises: marking an unknown region in the high-definition image; dividing the unknown region into a plurality of sub-regions according to the pixel information within it; converting each sub-region into a node of a graph structure, calculating edge weights between adjacent nodes, and generating the graph structure from these edge weights; and generating a node optimization queue from the edge weights, determining the foreground and background regions from the queue, selecting pixel values, and solving for their optimal values. The method partitions the high-definition image into regions at the pixel level, expresses the partitioned regions as graph nodes, computes edge weights, derives a node optimization queue from those weights, rapidly determines the foreground and background regions within the queue, and finally obtains the optimal foreground mask value, achieving both high accuracy and high speed.

Description

Transparency mask extraction method and device for high-definition image and storage medium
Technical Field
The invention mainly relates to the technical field of image processing, in particular to a transparency mask extraction method and device for a high-definition image and a storage medium.
Background
The resolution of images captured by mobile devices such as phones and cameras keeps increasing. Transparency mask extraction for high-resolution (high-definition) images is mainly applied in film and television special effects, where different foreground targets are composited into a specified scene; the higher the extraction precision, the better the visual quality of the composite. Existing methods for extracting transparency masks from high-definition images suffer from excessive computation time and low accuracy.
Disclosure of Invention
The invention aims to solve the technical problem of providing a transparency mask extraction method, a device and a storage medium for a high-definition image aiming at the defects of the prior art.
The technical scheme for solving the technical problems is as follows: a transparency mask extraction method of a high-definition image comprises the following steps:
inputting a high-definition image, and marking an unknown region, a foreground region and a background region in the high-definition image;
dividing the unknown region into a plurality of sub-regions according to pixel information in the unknown region;
converting each sub-region into nodes of a graph structure, calculating edge weights between adjacent nodes, and generating the graph structure according to each edge weight;
generating a node optimization queue according to edge weights among nodes, selecting pixel values among the plurality of subareas, the foreground area and the background area, carrying out optimal value solving on the selected pixel values according to the node optimization queue, and taking the optimal value obtained by solving as an optimal foreground mask value.
The other technical scheme for solving the technical problems is as follows: a transparency mask extraction apparatus of a high definition image, comprising:
the calibration module is used for inputting a high-definition image and calibrating an unknown region, a foreground region and a background region in the high-definition image;
the region segmentation module is used for segmenting the unknown region into a plurality of sub-regions according to the pixel information in the unknown region;
the graph structure generation module is used for converting each sub-region into nodes of the graph structure, calculating edge weights between adjacent nodes and generating the graph structure according to each edge weight;
and the optimization module is used for generating a node optimization queue according to the edge weights among the nodes, selecting pixel values from the plurality of subareas, the foreground area and the background area, carrying out optimal value solving on the selected pixel values according to the node optimization queue, and taking the optimal value obtained by solving as an optimal foreground mask value.
The other technical scheme for solving the technical problems is as follows: a transparency mask extraction apparatus for a high definition image, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, which when executed by the processor implements a transparency mask extraction method for a high definition image as described above.
The other technical scheme for solving the technical problems is as follows: a computer readable storage medium storing a computer program which, when executed by a processor, implements a transparency mask extraction method of a high definition image as described above.
The beneficial effects of the invention are as follows: the high-definition image is partitioned into regions at the pixel level; the partitioned regions are expressed as nodes of a graph structure; edge weights are calculated and a node optimization queue is obtained from them; the foreground and background regions are rapidly determined within the queue, and the pixel values in these regions are optimally solved to obtain the optimal foreground mask value. The computation is therefore both accurate and fast.
Drawings
Fig. 1 is a flowchart illustrating a transparency mask extraction method for a high-definition image according to an embodiment of the present invention;
fig. 2 is a schematic functional block diagram of a transparency mask extracting device for a high-definition image according to an embodiment of the present invention;
fig. 3 is a schematic node diagram of a graph structure according to an embodiment of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings; the examples are provided to illustrate the invention and are not to be construed as limiting its scope.
Fig. 1 is a flowchart illustrating a transparency mask extraction method for a high-definition image according to an embodiment of the present invention.
As shown in fig. 1, a transparency mask extraction method for a high-definition image includes the following steps:
inputting a high-definition image, and marking an unknown region in the high-definition image;
dividing the unknown region into a plurality of sub-regions according to pixel information in the unknown region;
converting each sub-region into nodes of a graph structure, calculating edge weights between adjacent nodes, and generating the graph structure according to each edge weight;
generating a node optimization queue according to edge weights among nodes, selecting pixel values among the plurality of subareas, the foreground area and the background area, carrying out optimal value solving on the selected pixel values according to the node optimization queue, and taking the optimal value obtained by solving as an optimal foreground mask value.
Specifically, the node optimization queue is generated from the edge weights by a minimum-spanning-tree method.
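The patent names a minimum-spanning-tree method but gives no pseudocode for it; a minimal sketch of deriving a node visit order from the edge weights with Prim's algorithm (all function and variable names here are hypothetical illustrations, not the patent's) might look like:

```python
import heapq

def mst_visit_order(num_nodes, edges, start=0):
    """Order nodes by Prim's MST growth over weighted region-adjacency edges.

    edges: dict mapping node -> list of (neighbor, weight) pairs.
    Returns the order in which nodes are absorbed into the tree; low-weight
    edges are crossed first, so this can serve as a node optimization queue.
    """
    visited = [False] * num_nodes
    order = []
    heap = [(0.0, start)]          # (weight of the edge into the tree, node)
    while heap:
        w, u = heapq.heappop(heap)
        if visited[u]:
            continue
        visited[u] = True
        order.append(u)
        for v, wt in edges.get(u, []):
            if not visited[v]:
                heapq.heappush(heap, (wt, v))
    return order

# Tiny region-adjacency graph: 0-1 cheap, 1-2 cheap, 0-2 expensive.
adj = {0: [(1, 0.1), (2, 0.9)], 1: [(0, 0.1), (2, 0.2)], 2: [(0, 0.9), (1, 0.2)]}
print(mst_visit_order(3, adj))  # → [0, 1, 2]
```

Ordering nodes this way means each region is optimized right after its most similar already-solved neighbour, which is consistent with the patent's use of one node's result as the next node's initial solution.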
It should be understood that a high-definition image refers to a high-resolution image, i.e., an image whose vertical resolution is 720 pixels (720p) or more.
It should be understood that marking the unknown region, the foreground region and the background region in the high-definition image specifically means dilating the target's texture edge with a preset template: the dilated band is taken as the unknown region, the target region excluding this band is taken as the foreground region, and the remaining area outside the target is taken as the background region.
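A minimal sketch of this trimap construction, assuming the target is given as a binary mask and the "preset template" is a 4-connected structuring element applied `radius` times (both assumptions — the patent does not specify the template):

```python
import numpy as np

def make_trimap(mask, radius=1):
    """Build a trimap from a binary foreground mask (1 = target).

    The band between the dilated and eroded target edge becomes the
    unknown region (128); the remaining target interior is foreground
    (255) and everything else background (0).
    """
    m = mask.astype(bool)
    dil = m.copy()
    ero = m.copy()
    for _ in range(radius):
        p = np.pad(dil, 1)                          # dilate: OR of 4-neighbours
        dil = p[:-2, 1:-1] | p[2:, 1:-1] | p[1:-1, :-2] | p[1:-1, 2:] | p[1:-1, 1:-1]
        q = np.pad(ero, 1, constant_values=True)    # erode: AND of 4-neighbours
        ero = q[:-2, 1:-1] & q[2:, 1:-1] & q[1:-1, :-2] & q[1:-1, 2:] & q[1:-1, 1:-1]
    trimap = np.zeros(mask.shape, np.uint8)
    trimap[dil & ~ero] = 128   # unknown band around the edge
    trimap[ero] = 255          # confident foreground
    return trimap

mask = np.zeros((7, 7), np.uint8)
mask[2:5, 2:5] = 1
tm = make_trimap(mask)
```

With `radius=1` the 3×3 target keeps only its centre pixel as confident foreground, and a one-pixel band on each side of the edge is marked unknown.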
In the above embodiment, the high-definition image is partitioned into regions at the pixel level, the partitioned regions are expressed as nodes of a graph structure, edge weights are calculated, a node optimization queue is obtained from them, the foreground and background regions are rapidly determined within the queue, and the pixel values in these regions are optimally solved to yield the optimal foreground mask value; the computation is thus both accurate and fast.
Optionally, as an embodiment of the present invention, the process of dividing the unknown region into a plurality of sub-regions according to pixel information in the unknown region includes:
let the ith pixel point in the unknown region be p_i, i = 1, 2, …, n, where n is a positive integer;

calculating the mean-shift vector m(p_i) corresponding to each pixel according to the pixel mean-shift calculation formula and the pixel information in the unknown region [the original shows the formula as an image; in its standard form it reads]:

m(p_i) = [ Σ_{j=1}^{n} p_j g(‖(p_i − p_j)/h‖²) ] / [ Σ_{j=1}^{n} g(‖(p_i − p_j)/h‖²) ] − p_i

wherein any pixel p_i = (R_i, G_i, B_i, x_i, y_i) consists of five dimensions {R, G, B, x, y}: R, G and B denote the coordinates of the pixel's colour in RGB space, and x and y denote its plane coordinates on the high-definition image; g(·) is the kernel profile, h denotes the bandwidth with h > 0, and ‖·‖² denotes the squared Euclidean distance;

iteratively shifting each p_i by its mean-shift vector m(p_i) until the five-dimensional data points converge, so that each point reaches its local density maximum;

dividing the converged points corresponding to the n pixel points into w classes, such that the Euclidean distance in the five-dimensional space between any two pixel points in a class is smaller than the bandwidth h;

merging every class whose pixel count is smaller than a preset threshold M into its adjacent class, generating w′ classes, each class representing one sub-region.
It should be understood that a five-dimensional space represents five dimensions, namely { R, G, B, x, y }.
In the above embodiment, the local density is obtained from the colour of each pixel and the distances between pixels; the converged points are grouped so that the five-dimensional Euclidean distance between any two pixels in a class stays below the bandwidth, and classes are merged according to their pixel counts, yielding a set of classes each representing one sub-region.
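The mean-shift segmentation described above can be sketched as follows, assuming a flat kernel and a naive O(n²) neighbourhood search (the patent does not fix the kernel, so this is illustrative only):

```python
import numpy as np

def mean_shift(points, h=1.0, iters=30, tol=1e-3):
    """Shift each 5-D point (R, G, B, x, y) toward its local density peak.

    Flat kernel: each point moves to the mean of all original points
    within bandwidth h, one concrete form of the mean-shift vector m(p_i).
    """
    pts = np.asarray(points, float)
    modes = pts.copy()
    for _ in range(iters):
        moved = 0.0
        for i, p in enumerate(modes):
            near = pts[np.linalg.norm(pts - p, axis=1) <= h]
            new = near.mean(axis=0)
            moved = max(moved, np.linalg.norm(new - p))
            modes[i] = new
        if moved < tol:         # all points converged
            break
    return modes

def cluster(modes, h=1.0):
    """Group converged modes: a point within h of an existing class joins it."""
    labels, centers = [], []
    for m in modes:
        for k, c in enumerate(centers):
            if np.linalg.norm(m - c) < h:
                labels.append(k)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels

pts = [[0, 0, 0, 0, 0], [0.1, 0, 0, 0, 0], [5, 5, 5, 5, 5], [5, 5.1, 5, 5, 5]]
labels = cluster(mean_shift(pts, h=1.0), h=1.0)
```

The two nearby points collapse onto one mode and the two distant points onto another, producing two classes; in the patent's setting each such class becomes one sub-region of the unknown region.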
Optionally, as an embodiment of the present invention, the calculating edge weights between adjacent nodes, and generating the graph structure according to each edge weight includes:
defining the edge weight b_{i,j} between nodes i and j [the original shows the weight formula as an image; it is built from the colour term C_i and the coordinate term S_i defined below],

wherein C_i is the colour information of node i, obtained by normalizing X′_{wi}, the RGB three-dimensional colour of the midpoint pixel of the i-th class region, by the mean and the variance of the midpoint-pixel colours over all class regions; and S_i is the plane-coordinate information of node i, obtained by normalizing X′_{Si}, the plane coordinates of the midpoint pixel of the i-th class region, by the mean and the variance of the midpoint-pixel coordinates over all class regions;

generating the graph structure according to the edge weights b_{i,j} between every pair of adjacent nodes.
The midpoint pixel of the i-th class region is the pixel located at the region's centroid, whose plane coordinates are

x̄_i = (1/N_i) Σ_{j∈Ω_i} x_j,  ȳ_i = (1/N_i) Σ_{j∈Ω_i} y_j,

wherein N_i denotes the number of pixels in the i-th class region, Ω_i denotes the set of pixels of the i-th class region, x̄_i and ȳ_i denote the plane x and y coordinate values of the midpoint pixel of the i-th class region, and x_j and y_j denote the plane x and y coordinate values of the j-th pixel of the i-th class region.
The process of converting each sub-region into the node of the graph structure is as follows: the sub-regions are numbered, each of which is denoted as a node of the graph structure.
Specifically, before the edge weights are defined, each node needs to be labelled. As shown in fig. 3, the different areas of the original unknown region U corresponding to the finally generated w′ classes are labelled; each label represents one segmented area (w′ areas in total) and is a node of the graph structure.
In the above embodiment, each divided region is converted into the graph node representation, and the edge weight relationship is defined between the regions through the color and the distance, so that the node optimization queue can be conveniently generated.
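The conversion of regions into graph nodes with colour/coordinate edge weights can be sketched as below; the exact formula for b_{i,j} appears only as an image in the source, so the additive combination of colour and coordinate distances used here is an assumption:

```python
import math

def region_descriptor(pixels):
    """Mean RGB colour and centroid (x, y) of a region's pixels.

    pixels: list of (r, g, b, x, y) tuples.
    """
    n = len(pixels)
    sums = [sum(p[d] for p in pixels) / n for d in range(5)]
    return sums[:3], sums[3:]   # (mean colour, centroid)

def edge_weight(desc_i, desc_j, color_scale=1.0, space_scale=1.0):
    """Hypothetical edge weight: scaled Euclidean distances in colour + space.

    The patent's b_{i,j} formula is shown only as an image; this additive
    combination of normalized colour and coordinate distances is a guess.
    """
    ci, si = desc_i
    cj, sj = desc_j
    dc = math.dist(ci, cj) / color_scale   # colour term, cf. C_i
    ds = math.dist(si, sj) / space_scale   # coordinate term, cf. S_i
    return dc + ds

a = region_descriptor([(10, 10, 10, 0, 0), (12, 10, 10, 2, 0)])
b = region_descriptor([(200, 200, 200, 10, 10)])
w = edge_weight(a, b)
```

Similar, nearby regions thus receive small weights and dissimilar, distant ones large weights, which is the property the minimum-spanning-tree queue relies on.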
Optionally, as an embodiment of the present invention, the process of determining the foreground area and the background area according to the node optimization queue includes:
optimizing a sub-region, a foreground region and a background region corresponding to an ith node of a node optimization queue to obtain an optimal value of the ith node, and taking the optimal value as an optimal foreground mask value of the ith node;
taking the optimal foreground mask value of the ith node as initial solution information of the (i+1) th node area, and optimizing the foreground area and the background area of the (i+1) th node according to the initial solution information to obtain the optimal foreground mask value of the (i+1) th node;
and optimizing all node areas in the node optimization queue until the optimization is completed, and obtaining the optimal foreground mask value of the whole unknown area.
In the above embodiment, by optimizing each node in the node optimization queue, the optimal foreground mask value of the whole unknown region is obtained, so that the extracted transparency mask result is more accurate.
Optionally, as an embodiment of the present invention, the optimizing the sub-area, the foreground area and the background area corresponding to the ith node in the node optimizing queue to obtain the optimal value of the ith node includes:
calculating, for the foreground and background values selected for the ith node, the transparency of each unknown pixel according to the pixel calculation formula [the original shows it as an image; the standard estimate derived from the compositing equation C = αF + (1 − α)B is]

α̂_k = ( (C_k − B_k) · (F_k − B_k) ) / ‖F_k − B_k‖²

wherein C_k denotes the colour value of the kth unknown pixel in the unknown region, B_k denotes the kth background value selected in the background region, and F_k denotes the kth foreground value selected in the foreground region;

taking all pixels in the foreground region and the background region corresponding to the ith node as an optimization variable X, randomly selecting pixel values from the foreground region and the background region, and assigning the selected pixel values to X, obtaining a solution set P = (X_1, X_2, …, X_N), where N denotes the number of solutions;

evaluating each solution in the solution set P to obtain the optimal value of the ith node, the evaluation process being as follows:

if f(X_i) > f(X_j), then X_j learns towards X_i according to the learning formula X_j = X_j + λ(X_i − X_j); X_i then continues to be compared with the next solution in the solution set P to obtain comparison error values; the comparison stops when the comparison error values of all N solutions are smaller than a preset error value, yielding the optimal solution, which is taken as the optimal value of the ith node.
In the above embodiment, solution values are calculated for the selected pixel values and compared to obtain error values; comparing these against a preset error value yields the optimal value, so a more accurate foreground mask value can be obtained.
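The per-pixel evaluation described above can be sketched with the standard sampling-based alpha estimate derived from the compositing equation C = αF + (1 − α)B; the clamping to [0, 1] and the residual-based fitness are assumptions not spelled out in the source:

```python
import numpy as np

def alpha_estimate(c, f, b):
    """Least-squares alpha for the compositing equation C = aF + (1-a)B.

    a = (C - B)·(F - B) / |F - B|^2, clamped to [0, 1]. The patent shows
    its pixel formula only as an image, so this exact form is an assumption.
    """
    c, f, b = (np.asarray(v, float) for v in (c, f, b))
    d = f - b
    denom = d @ d
    if denom < 1e-12:           # degenerate: foreground equals background
        return 0.0
    return float(np.clip((c - b) @ d / denom, 0.0, 1.0))

def fitness(c, f, b):
    """Residual of the compositing equation for a candidate (F, B) sample pair."""
    a = alpha_estimate(c, f, b)
    return float(np.linalg.norm(np.asarray(c, float)
                                - (a * np.asarray(f, float)
                                   + (1 - a) * np.asarray(b, float))))

# A pixel that is a 70/30 blend of pure white and pure black:
c = [0.7, 0.7, 0.7]
a = alpha_estimate(c, [1, 1, 1], [0, 0, 0])   # → 0.7 (approximately)
```

Candidate (F, B) pairs with a lower residual explain the observed colour better; the patent's learning step X_j = X_j + λ(X_i − X_j) would then pull worse solutions towards better-scoring ones.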
Fig. 2 is a functional block diagram of a transparency mask extraction device for high-definition images according to an embodiment of the present invention.
Alternatively, as another embodiment of the present invention, as shown in fig. 2, a transparency mask extraction apparatus for a high definition image includes:
the calibration module is used for inputting a high-definition image and calibrating an unknown region, a foreground region and a background region in the high-definition image;
the region segmentation module is used for segmenting the unknown region into a plurality of sub-regions according to the pixel information in the unknown region;
the graph structure generation module is used for converting each sub-region into nodes of the graph structure, calculating edge weights between adjacent nodes and generating the graph structure according to each edge weight;
and the optimization module is used for generating a node optimization queue according to the edge weights among the nodes, selecting pixel values from the plurality of subareas, the foreground area and the background area, solving the optimal values of the selected pixel values according to the node optimization queue, and taking the optimal values obtained by solving as optimal foreground mask values.
Optionally, as an embodiment of the present invention, the area dividing module is specifically configured to:
let the ith pixel point in the unknown region be p_i, i = 1, 2, …, n, where n is a positive integer;

calculating the mean-shift vector m(p_i) corresponding to each pixel according to the pixel mean-shift calculation formula and the pixel information in the unknown region [the original shows the formula as an image; in its standard form it reads]:

m(p_i) = [ Σ_{j=1}^{n} p_j g(‖(p_i − p_j)/h‖²) ] / [ Σ_{j=1}^{n} g(‖(p_i − p_j)/h‖²) ] − p_i

wherein any pixel p_i = (R_i, G_i, B_i, x_i, y_i) consists of five dimensions {R, G, B, x, y}: R, G and B denote the coordinates of the pixel's colour in RGB space, and x and y denote its plane coordinates on the high-definition image; g(·) is the kernel profile, h denotes the bandwidth with h > 0, and ‖·‖² denotes the squared Euclidean distance;

iteratively shifting each p_i by its mean-shift vector m(p_i) until the five-dimensional data points converge, so that each point reaches its local density maximum;

dividing the converged points corresponding to the n pixel points into w classes, such that the Euclidean distance in the five-dimensional space between any two pixel points in a class is smaller than the bandwidth h;

merging every class whose pixel count is smaller than a preset threshold M into its adjacent class, generating w′ classes, each class representing one sub-region.
Optionally, as an embodiment of the present invention, the graph structure generating module is specifically configured to:
defining the edge weight b_{i,j} between nodes i and j [the original shows the weight formula as an image; it is built from the colour term C_i and the coordinate term S_i defined below],

wherein C_i is the colour information of node i, obtained by normalizing X′_{wi}, the RGB three-dimensional colour of the midpoint pixel of the i-th class region, by the mean and the variance of the midpoint-pixel colours over all class regions; and S_i is the plane-coordinate information of node i, obtained by normalizing X′_{Si}, the plane coordinates of the midpoint pixel of the i-th class region, by the mean and the variance of the midpoint-pixel coordinates over all class regions;

generating the graph structure according to the edge weights b_{i,j} between every pair of adjacent nodes.
Alternatively, as another embodiment of the present invention, a transparency mask extraction apparatus for a high definition image includes a memory, a processor, and a computer program stored in the memory and executable on the processor, which when executed by the processor, implements the transparency mask extraction method for a high definition image as described above.
Alternatively, as an embodiment of the present invention, a computer-readable storage medium storing a computer program which, when executed by a processor, implements the transparency mask extraction method of a high-definition image as described above.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed; any modifications, equivalents and alternatives falling within the spirit and scope of the invention are intended to be included within its scope.

Claims (8)

1. A method for extracting a transparency mask of a high definition image, comprising the steps of:
inputting a high-definition image, and marking an unknown region, a foreground region and a background region in the high-definition image;
dividing the unknown region into a plurality of sub-regions according to pixel information in the unknown region;
converting each sub-region into nodes of a graph structure, calculating edge weights between adjacent nodes, and generating the graph structure according to each edge weight;
generating a node optimization queue according to edge weights among nodes, selecting pixel values from the plurality of subareas, the foreground area and the background area, carrying out optimal value solving on the selected pixel values according to the node optimization queue, and taking the optimal value obtained by solving as an optimal foreground mask value;
the process for carrying out optimal value solving on the selected pixel values according to the node optimization queue comprises the following steps:
optimizing a sub-region, a foreground region and a background region corresponding to an ith node of the node optimization queue to obtain an optimal value of the ith node, and taking the optimal value as an optimal foreground mask value of the ith node;
taking the optimal foreground mask value of the ith node as initial solution information of the (i+1) th node area, and optimizing the foreground area and the background area of the (i+1) th node according to the initial solution information to obtain the optimal foreground mask value of the (i+1) th node;
optimizing all node areas in the node optimization queue until the node optimization queue is completed, and obtaining an optimal foreground mask value of the whole unknown area;
the process of optimizing the sub-region, the foreground region and the background region corresponding to the ith node of the node optimization queue to obtain the optimal value of the ith node comprises the following steps:
calculating, for the foreground and background values selected for the ith node, the transparency of each unknown pixel according to the pixel calculation formula [the original shows it as an image; the standard estimate derived from the compositing equation C = αF + (1 − α)B is]

α̂_k = ( (C_k − B_k) · (F_k − B_k) ) / ‖F_k − B_k‖²

wherein C_k denotes the colour value of the kth unknown pixel in the unknown region, B_k denotes the kth background value selected in the background region, and F_k denotes the kth foreground value selected in the foreground region;

taking all pixels in the foreground region and the background region corresponding to the ith node as an optimization variable X, randomly selecting pixel values from the foreground region and the background region, and assigning the selected pixel values to X, obtaining a solution set P = (X_1, X_2, …, X_N), where N denotes the number of solutions;

evaluating each solution in the solution set P to obtain the optimal value of the ith node, the evaluation process being as follows:

if f(X_i) > f(X_j), then X_j learns towards X_i according to the learning formula X_j = X_j + λ(X_i − X_j); X_i then continues to be compared with the next solution in the solution set P to obtain comparison error values; the comparison stops when the comparison error values of all N solutions are smaller than a preset error value, yielding the optimal solution, which is taken as the optimal value of the ith node.
2. The transparency mask extraction method of high definition image according to claim 1, wherein the process of dividing the unknown region into a plurality of sub-regions according to pixel information in the unknown region comprises:
let the ith pixel point in the unknown region be p_i, i = 1, 2, …, n, where n is a positive integer;

calculating the mean-shift vector m(p_i) corresponding to each pixel according to the pixel mean-shift calculation formula and the pixel information in the unknown region [the original shows the formula as an image; in its standard form it reads]:

m(p_i) = [ Σ_{j=1}^{n} p_j g(‖(p_i − p_j)/h‖²) ] / [ Σ_{j=1}^{n} g(‖(p_i − p_j)/h‖²) ] − p_i

wherein any pixel p_i = (R_i, G_i, B_i, x_i, y_i) consists of five dimensions {R, G, B, x, y}: R, G and B denote the coordinates of the pixel's colour in RGB space, and x and y denote its plane coordinates on the high-definition image; g(·) is the kernel profile, h denotes the bandwidth with h > 0, and ‖·‖² denotes the squared Euclidean distance;

iteratively shifting each p_i by its mean-shift vector m(p_i) until the five-dimensional data points converge, so that each point reaches its local density maximum;

dividing the converged points corresponding to the n pixel points into w classes, such that the Euclidean distance in the five-dimensional space between any two pixel points in a class is smaller than the bandwidth h;

merging every class whose pixel count is smaller than a preset threshold M into its adjacent class, generating w′ classes, each class representing one sub-region.
3. The method for extracting the transparency mask of the high-definition image according to claim 1, wherein the process of calculating the edge weights between the adjacent nodes and generating the graph structure according to the respective edge weights comprises:
defining the edge weight b_{i,j} between nodes as follows:
b_{i,j} = ‖C_i − C_j‖₂ + ‖S_i − S_j‖₂
wherein C_i is the normalized color information of the pixel point, C_i = (c_i − c̄) / σ_c², and S_i is the normalized plane coordinate information of the pixel point, S_i = (s_i − s̄) / σ_s²; c_i denotes the RGB three-dimensional color information of the center pixel of the i-th class region, c̄ denotes the mean value of the RGB three-dimensional color information of the center pixels of all class regions, and σ_c² denotes the variance value of the RGB three-dimensional color information of the center pixels of all class regions; s_i denotes the plane coordinate information of the center pixel of the i-th class region, s̄ denotes the mean value of the plane coordinate information of the center pixels of all class regions, and σ_s² denotes the variance value of the plane coordinate information of the center pixels of all class regions;
generating the graph structure according to the edge weights b_{i,j} between the nodes.
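The edge-weight construction can be sketched as follows (illustration only). Because the claim renders its formulas as images, the z-score-style normalization by mean and variance and the summed-distance form of `b[i, j]` are assumptions read off the variable descriptions, as are the function and parameter names.

```python
import numpy as np

def edge_weights(centers_rgb, centers_xy):
    """Compute edge weights b[i, j] between region nodes from the color and
    plane-coordinate information of each region's center pixel.

    centers_rgb: (n, 3) RGB color of each region's center pixel (c_i).
    centers_xy:  (n, 2) plane coordinates of each region's center pixel (s_i).
    """
    # Normalize by the mean and variance over all region centers (assumed form).
    C = (centers_rgb - centers_rgb.mean(axis=0)) / centers_rgb.var(axis=0)
    S = (centers_xy - centers_xy.mean(axis=0)) / centers_xy.var(axis=0)
    n = len(C)
    b = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # Combined color + spatial distance between nodes i and j.
            b[i, j] = np.linalg.norm(C[i] - C[j]) + np.linalg.norm(S[i] - S[j])
    return b
```

The resulting symmetric matrix `b` can serve directly as the weighted adjacency matrix of the graph structure over the sub-region nodes.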
4. A transparency mask extraction apparatus for a high definition image, comprising:
the calibration module is used for inputting a high-definition image and calibrating an unknown region, a foreground region and a background region in the high-definition image;
the region segmentation module is used for segmenting the unknown region into a plurality of sub-regions according to the pixel information in the unknown region;
the graph structure generation module is used for converting each sub-region into nodes of the graph structure, calculating edge weights between adjacent nodes and generating the graph structure according to each edge weight;
the optimization module is used for generating a node optimization queue according to the edge weights among the nodes, selecting pixel values from the plurality of sub-regions, the foreground region and the background region, solving for the optimal value of the selected pixel values according to the node optimization queue, and taking the obtained optimal value as the optimal foreground mask value;
in the optimization module, the process of carrying out optimal value solving on the selected pixel value according to the node optimization queue comprises the following steps:
optimizing a sub-region, a foreground region and a background region corresponding to an ith node of the node optimization queue to obtain an optimal value of the ith node, and taking the optimal value as an optimal foreground mask value of the ith node;
taking the optimal foreground mask value of the ith node as initial solution information of the (i+1) th node area, and optimizing the foreground area and the background area of the (i+1) th node according to the initial solution information to obtain the optimal foreground mask value of the (i+1) th node;
optimizing all node regions in the node optimization queue in sequence until the queue is exhausted, thereby obtaining the optimal foreground mask value of the whole unknown region;
the process of optimizing the sub-region, the foreground region and the background region corresponding to the ith node of the node optimization queue to obtain the optimal value of the ith node comprises the following steps:
calculating each pixel in the foreground region and the background region corresponding to the ith node according to a pixel calculation formula, wherein the pixel calculation formula is:
α_k = ((I_k − B_k) · (F_k − B_k)) / ‖F_k − B_k‖₂²
wherein I_k represents the color value of the kth unknown pixel in the unknown region, B_k represents the kth background value selected in the background region, and F_k represents the kth foreground value selected in the foreground region;
taking all pixels in the foreground region and the background region corresponding to the ith node as an optimization variable X, randomly selecting pixel values from the foreground region and the background region, and assigning the selected pixel values to the optimization variable X to obtain a solution set P = (X_1, X_2, …, X_N), where N represents the number of solutions;
evaluating each solution in the solution set P to obtain an optimal value of the ith node, wherein the evaluation process is as follows:
if f(X_i) > f(X_j), then X_j learns toward X_i; the learning process comprises: updating according to the learning formula X_j = X_j + λ(X_i − X_j); X_i then continues to be compared with the next solution in the solution set P to obtain comparison error values, and the comparison stops when the comparison error values of all N solutions are smaller than a preset error value, yielding the optimal solution, which is taken as the optimal value of the ith node.
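A minimal sketch (not the claimed implementation) of the per-pixel mask estimate and the learning step X_j = X_j + λ(X_i − X_j) described above. The projection form of the estimate, which is standard in sampling-based matting, the clamping to [0, 1], and the value of λ are assumptions.

```python
import numpy as np

def alpha_estimate(I, F, B):
    """Estimate the mask value for unknown pixel color I from a sampled
    foreground/background pair (F, B): project (I - B) onto (F - B)."""
    d = F - B
    denom = float(d @ d)          # squared Euclidean norm of F - B
    if denom == 0.0:
        return 0.0                # degenerate pair: F and B coincide
    a = float((I - B) @ d) / denom
    return min(1.0, max(0.0, a))  # clamp to the valid mask range [0, 1]

def learn_toward(X_j, X_i, lam=0.5):
    """Move the worse solution X_j toward the better solution X_i."""
    return X_j + lam * (X_i - X_j)
```

In the queue-based scheme above, each node's converged solution would seed the initial solutions of the next node, so neighboring sub-regions start their search near a consistent mask estimate.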
5. The transparency mask extraction device of high definition image according to claim 4, wherein the region segmentation module is specifically configured to:
let the ith pixel point on the unknown region be p_i, i = 1, 2, …, n, where n is a positive integer;
calculating the mean shift vector m(p_i) corresponding to each pixel according to a pixel mean shift calculation formula and the pixel information on the unknown region, wherein the pixel mean shift calculation formula is:
m(p_i) = (1/k) · Σ_{p_j ∈ S_h(p_i)} (p_j − p_i)
wherein any pixel point p_i consists of five dimensions {R, G, B, x, y}; R, G and B denote the coordinates of the color of pixel point p_i in RGB space, and x and y denote the plane coordinates of p_i on the high-definition image;
S_h(p_i) = { q : ‖q − p_i‖₂ ≤ h }
wherein h denotes the bandwidth, h > 0, k denotes the number of pixel points falling in S_h(p_i), and ‖·‖₂ denotes the Euclidean distance;
iteratively calculating the mean shift vector m(p_i) of p_i until the five-dimensional data points converge, so that each point reaches its local density maximum;
dividing the mean shift vectors corresponding to the n calculated pixel points into w classes, wherein the Euclidean distance in the five-dimensional space between any two pixel points in each class is smaller than the bandwidth h;
merging classes with the pixel number smaller than a preset pixel number threshold M into adjacent classes to generate w' classes, wherein each class represents a sub-region.
6. The transparency mask extraction device of high definition image according to claim 4, wherein the graph structure generation module is specifically configured to:
defining the edge weight b_{i,j} between nodes as follows:
b_{i,j} = ‖C_i − C_j‖₂ + ‖S_i − S_j‖₂
wherein C_i is the normalized color information of the pixel point, C_i = (c_i − c̄) / σ_c², and S_i is the normalized plane coordinate information of the pixel point, S_i = (s_i − s̄) / σ_s²; c_i denotes the RGB three-dimensional color information of the center pixel of the i-th class region, c̄ denotes the mean value of the RGB three-dimensional color information of the center pixels of all class regions, and σ_c² denotes the variance value of the RGB three-dimensional color information of the center pixels of all class regions; s_i denotes the plane coordinate information of the center pixel of the i-th class region, s̄ denotes the mean value of the plane coordinate information of the center pixels of all class regions, and σ_s² denotes the variance value of the plane coordinate information of the center pixels of all class regions;
generating the graph structure according to the edge weights b_{i,j} between the nodes.
7. A transparency mask extraction device for high definition images comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the transparency mask extraction method for high definition images according to any one of claims 1 to 3 is implemented when the computer program is executed by the processor.
8. A computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the transparency mask extraction method of a high definition image according to any one of claims 1 to 3.
CN201911203685.9A 2019-11-29 2019-11-29 Transparency mask extraction method and device for high-definition image and storage medium Active CN111047604B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911203685.9A CN111047604B (en) 2019-11-29 2019-11-29 Transparency mask extraction method and device for high-definition image and storage medium


Publications (2)

Publication Number Publication Date
CN111047604A CN111047604A (en) 2020-04-21
CN111047604B true CN111047604B (en) 2023-04-28

Family

ID=70233222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911203685.9A Active CN111047604B (en) 2019-11-29 2019-11-29 Transparency mask extraction method and device for high-definition image and storage medium

Country Status (1)

Country Link
CN (1) CN111047604B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101989353A (en) * 2010-12-10 2011-03-23 中国科学院深圳先进技术研究院 Image matting method
CN102291520A (en) * 2006-05-26 2011-12-21 佳能株式会社 Image processing method and image processing apparatus
CN102651135A (en) * 2012-04-10 2012-08-29 电子科技大学 Optimized direction sampling-based natural image matting method
CN103942794A (en) * 2014-04-16 2014-07-23 南京大学 Image collaborative cutout method based on confidence level
CN104134192A (en) * 2014-07-23 2014-11-05 中国科学院深圳先进技术研究院 Image defogging method and system
CN105931244A (en) * 2016-04-29 2016-09-07 中科院成都信息技术股份有限公司 Supervision-free image matting method and apparatus
CN106056606A (en) * 2016-05-30 2016-10-26 乐视控股(北京)有限公司 Image processing method and device
CN110400323A (en) * 2019-07-30 2019-11-01 上海艾麒信息科技有限公司 It is a kind of to scratch drawing system, method and device automatically
CN110503704A (en) * 2019-08-27 2019-11-26 北京迈格威科技有限公司 Building method, device and the electronic equipment of three components


Non-Patent Citations (2)

Title
Arunava De et al., "Masking Based Segmentation of Diseased MRI Images", 2010 International Conference on Information Science and Applications, 2010, pp. 1-7. *
Yan Xueming, "Research on Heuristic Algorithms Based on Bipartite Graph Structure Information", China Master's Theses Full-text Database, Information Science and Technology, 2019, pp. 1-52. *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant