CN111738934B - Automatic red eye repairing method based on MTCNN - Google Patents

Automatic red eye repairing method based on MTCNN

Info

Publication number: CN111738934B
Application number: CN202010413910.8A
Authority: CN (China)
Prior art keywords: red, face, eye, eyes, image
Legal status: Active (granted)
Other versions: CN111738934A
Other languages: Chinese (zh)
Inventors: 苏雪平; 高蒙; 陈宁; 任劼; 李云红; 朱丹尧; 段嘉伟
Current assignee: Xian Polytechnic University
Original assignee: Xian Polytechnic University
Application filed by Xian Polytechnic University
Priority to CN202010413910.8A
Publication of CN111738934A
Application granted; publication of CN111738934B

Classifications

    • G06T 5/77: Image enhancement or restoration; retouching; inpainting; scratch removal
    • G06T 5/70: Image enhancement or restoration; denoising; smoothing
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 3/08: Neural networks; learning methods
    • G06V 10/25: Image preprocessing; determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 40/171: Human faces; local features and components; facial parts; occluding parts, e.g. glasses; geometrical relationships
    • G06V 40/193: Eye characteristics, e.g. of the iris; preprocessing; feature extraction
    • G06T 2207/20028: Special algorithmic details; filtering details; bilateral filtering
    • G06T 2207/30201: Subject of image; human being; face


Abstract

The invention discloses an automatic red eye repair method based on MTCNN, implemented according to the following steps: Step 1, input a red eye image into an MTCNN network, which detects the human face and returns the face position together with the horizontal and vertical coordinates of the pupils of the two eyes, the nose tip, and the left and right mouth corners; Step 2, calculate the pupil distance of the two eyes from the pupil coordinates obtained in Step 1, then expand it by a set proportion and adjust the parameters to obtain the ROI; Step 3, perform red eye masking, pupil mask cleaning and red eye repair on the ROI obtained in Step 2, and finally copy the processed image back to the eye region of the original image to obtain the repaired face image. The method is fully automatic, has a low false detection rate, and repairs red eye quickly.

Description

Automatic red eye repairing method based on MTCNN
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an automatic red eye repairing method based on MTCNN.
Background
Red eye is the red spot that appears at the pupil of a human eye in a photograph, caused by the camera flash. In dim ambient light the pupil dilates, and when the eye suddenly receives intense flash light, the light reflects off the blood-rich tissue at the back of the eye, so the pupil appears red in the photo. Red eye contrasts strongly with a person's natural eye color, which lowers the quality of the photo. Red eye is a common discordant factor in photography, and many scholars have proposed red eye repair methods, falling mainly into two categories: fully automatic repair and semi-automatic repair. The principle of a semi-automatic red eye repair algorithm is as follows: the ROI (Region of Interest) containing the red eye is first selected manually, then the eye position is determined with a corresponding algorithm, and finally the eye pixels are adjusted to repair the red eye. Although semi-automatic red eye repair is accurate, it requires manual processing and cannot scale to large amounts of data. The basic principle of an automatic red eye repair algorithm is as follows: certain characteristics of the eyes are used to locate the red eye automatically with a corresponding method, and the red eye is then repaired. Although automatic red eye repair needs no manual processing, existing methods are inefficient, slow, easily disturbed by noise, and not robust. Overall, existing red eye repair methods therefore suffer from relatively low repair speed and a relatively high false detection rate.
Disclosure of Invention
The invention aims to provide an automatic red eye repairing method based on MTCNN, which solves the problems of relatively low red eye repairing speed and relatively high false detection rate in the red eye repairing method in the prior art.
The technical solution adopted by the invention is as follows.
the automatic red eye repairing method based on the MTCNN is implemented according to the following steps:
step1, inputting a red eye image into an MTCNN network, wherein the MTCNN network detects a human face and returns the position of the human face and the horizontal and vertical coordinates of the pupils, the nasal tips, the left mouth corner and the right mouth corner of the human face;
step2, calculating the pupil distance of the eyes according to the pupil coordinates of the eyes of the face obtained in the step1, and then expanding the proportion to obtain the ROI after parameter adjustment;
and 3, performing operations of shielding red eyes, cleaning pupil masks and repairing the red eyes on the ROI obtained in the step2, and finally copying the processed image to an eye area of the original image to obtain a repaired face image.
The present invention is also characterized in that,
the step1 is specifically implemented according to the following steps:
step 1.1, creating an image pyramid according to the set size of an input red eye image, and performing multi-stage scaling on the red eye image to obtain a group of input images with different sizes;
step 1.2, inputting a group of images with different sizes into a P-Net, generating a feature map through a convolution layer and a pooling layer with different sizes in sequence, judging face contour points through the feature map, generating face candidate frames and frame regression vectors after the images are analyzed and processed by the P-Net, and obtaining a plurality of face candidate frames after recalibration;
step 1.3, inputting the plurality of face candidate boxes obtained in the step 1.2 into R-Net for further training; continuously removing the face candidate frames which do not reach the standard through the set threshold value, and inhibiting and removing the face candidate frames with high overlapping by using a non-maximum value to obtain a plurality of face candidate frames after further training;
and 1.4, inputting the plurality of face candidate boxes obtained in the step 1.3 after further training into an O-Net network, and finally outputting the face position and characteristic points of the horizontal and vertical coordinates of the pupils, the nasal tips, the left and right mouth corners of the eyes of the face after the O-Net network further accurately positions the face position.
In Step 2, the pupil distance of the two eyes is calculated specifically as follows:
from the binocular pupil coordinates returned by face detection, the distance between the pupils is calculated with formula (6):

$D_{lr} = \sqrt{(x_r - x_l)^2 + (y_r - y_l)^2}$    (6)

where $D_{lr}$ is the distance between the pupils of the left and right eyes of the face, $(x_l, y_l)$ are the horizontal and vertical coordinates of the left pupil, and $(x_r, y_r)$ are the horizontal and vertical coordinates of the right pupil.
In Step 2, the proportional expansion is specifically implemented as follows:
using the pupil distance of the two eyes, the ROIs of the left and right eyes of the face are marked with rectangular boxes at a set proportion, the corner coordinates being computed from the pupil coordinates and the pupil distance,
where $LEL_{x,y}$ are the upper-left corner coordinates of the left-eye rectangular box, $LER_{x,y}$ the lower-right corner coordinates of the left-eye rectangular box, $REL_{x,y}$ the upper-left corner coordinates of the right-eye rectangular box, and $RER_{x,y}$ the lower-right corner coordinates of the right-eye rectangular box; imw and imh denote the width and height of the face image, respectively.
Step 3 is specifically implemented according to the following steps:
Step 3.1, splitting the ROI into its red, green and blue channels, then creating a mask so that only the red pupil region is processed; finally, setting the extracted red pupil region to white and all other regions to black;
Step 3.2, performing contour detection on the created mask, extracting the white regions that may be red eye, computing the area enclosed by each white contour, and keeping the contour with the largest area and its pixels, thereby locating the red eye precisely; then performing a closing operation on the red eye region to remove the noise points inside it;
Step 3.3, creating an average channel by averaging the green and blue channels, replacing the pixel values of all three channels inside the red eye region with the average-channel values, merging the red, green and blue channels, smoothing and denoising the repaired region with bilateral filtering, and finally obtaining the repaired face image.
In Step 3.3, the smoothing and denoising of the repaired region by bilateral filtering is specifically performed according to formula (13):

$g(k,l) = \dfrac{\sum_{i,j} f(i,j)\, w(i,j,k,l)}{\sum_{i,j} w(i,j,k,l)}$    (13)

where the weight $w(i,j,k,l)$ is the product of the spatial-domain kernel $w_d(i,j,k,l)$ and the range kernel $w_r(i,j,k,l)$, as in formula (14):

$w(i,j,k,l) = w_d(i,j,k,l) \cdot w_r(i,j,k,l)$, with $w_d(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2}\right)$ and $w_r(i,j,k,l) = \exp\left(-\frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$    (14)

where q(i,j) is the coordinate of another coefficient of the template window; p(k,l) is the center coordinate of the template window; $\sigma_d$ and $\sigma_r$ are the standard deviations of the Gaussian functions; f(i,j) denotes the pixel value of the image at point q(i,j); and f(k,l) denotes the pixel value of the image at point p(k,l).
The beneficial effects of the invention are as follows: the automatic red eye repair method based on MTCNN builds on recent face detection research with convolutional neural networks and combines the advantages of MTCNN, improving both the face detection rate and the detection speed; it removes the discordant red eye factor from images and repairs red eyes in face images. The method is fully automatic and achieves a human eye detection rate of 94.74%, a human eye false detection rate of 3.57%, a red eye repair rate of 84.11%, and an average repair time of 347.51 milliseconds per red eye image.
Drawings
Fig. 1 is a schematic diagram of an automatic red eye repair method based on MTCNN according to the present invention;
FIG. 2 is a P-Net network diagram of the automatic red eye repair method based on the MTCNN of the present invention;
FIG. 3 is an R-Net network diagram of the automatic red eye repair method based on the MTCNN of the present invention;
fig. 4 is an O-Net network diagram of the automatic red eye repair method based on MTCNN of the present invention.
Detailed Description
The invention relates to an automatic red eye repairing method based on MTCNN, which is described in detail below with reference to the accompanying drawings and the detailed description.
As shown in fig. 1, the automatic red eye repair method based on MTCNN is specifically implemented according to the following steps:
Step 1, inputting a red eye image into an MTCNN network, wherein the MTCNN network detects the human face and returns the face position and the horizontal and vertical coordinates of the pupils of the two eyes, the nose tip, and the left and right mouth corners of the face;
Step 2, calculating the pupil distance of the two eyes according to the pupil coordinates of the face obtained in Step 1, then expanding it by a set proportion and adjusting the parameters to obtain the ROI (Region of Interest);
Step 3, performing red eye masking, pupil mask cleaning and red eye repair on the ROI obtained in Step 2, and finally copying the processed image back to the eye region of the original image to obtain the repaired face image.
Step 1.1, creating an image pyramid according to the set size of the input red eye image, and scaling the red eye image over multiple levels to obtain a group of input images of different sizes;
Step 1.2, inputting the group of images of different sizes into a fully convolutional network (P-Net), where they pass in turn through convolution and pooling layers of different sizes to generate feature maps from which face contour points are judged; after the images are analyzed and processed by P-Net, face candidate boxes and box regression vectors are generated, and a number of face candidate boxes are obtained after recalibration;
Step 1.3, inputting the face candidate boxes obtained in Step 1.2 into R-Net for further refinement; candidate boxes that do not reach the set threshold are removed, and highly overlapping candidate boxes are removed by non-maximum suppression (NMS), yielding a refined set of face candidate boxes;
Step 1.4, inputting the refined face candidate boxes obtained in Step 1.3 into the O-Net network, which further localizes the face position precisely and finally outputs the face position and the feature points, namely the horizontal and vertical coordinates of the pupils of the two eyes, the nose tip, and the left and right mouth corners.
Further, in Step 2, the pupil distance of the two eyes is calculated specifically as follows:
from the binocular pupil coordinates returned by face detection, the distance between the pupils is calculated with formula (6):

$D_{lr} = \sqrt{(x_r - x_l)^2 + (y_r - y_l)^2}$    (6)

where $D_{lr}$ is the distance between the pupils of the left and right eyes of the face, $(x_l, y_l)$ are the horizontal and vertical coordinates of the left pupil, and $(x_r, y_r)$ are the horizontal and vertical coordinates of the right pupil.
Further, in Step 2, the proportional expansion is specifically implemented as follows:
using the pupil distance of the two eyes, the ROIs of the left and right eyes of the face are marked with rectangular boxes at a set proportion, the corner coordinates being computed from the pupil coordinates and the pupil distance,
where $LEL_{x,y}$ are the upper-left corner coordinates of the left-eye rectangular box, $LER_{x,y}$ the lower-right corner coordinates of the left-eye rectangular box, $REL_{x,y}$ the upper-left corner coordinates of the right-eye rectangular box, and $RER_{x,y}$ the lower-right corner coordinates of the right-eye rectangular box; imw and imh denote the width and height of the face image, respectively.
Further, Step 3 is specifically implemented according to the following steps:
Step 3.1, splitting the ROI into its red, green and blue channels, then creating a mask so that only the red pupil region is processed; finally, setting the extracted red pupil region to white and all other regions to black;
Step 3.2, performing contour detection on the created mask, extracting the white regions that may be red eye, computing the area enclosed by each white contour, and keeping the contour with the largest area and its pixels, thereby locating the red eye precisely; then performing a closing operation on the red eye region to remove the noise points inside it;
Step 3.3, creating an average channel by averaging the green and blue channels, replacing the pixel values of all three channels inside the red eye region with the average-channel values, merging the red, green and blue channels, smoothing and denoising the repaired region with bilateral filtering, and finally obtaining the repaired face image.
In Step 3.3, the smoothing and denoising of the repaired region by bilateral filtering is specifically performed according to formula (13):

$g(k,l) = \dfrac{\sum_{i,j} f(i,j)\, w(i,j,k,l)}{\sum_{i,j} w(i,j,k,l)}$    (13)

where the weight $w(i,j,k,l)$ is the product of the spatial-domain kernel $w_d(i,j,k,l)$ and the range kernel $w_r(i,j,k,l)$, as in formula (14):

$w(i,j,k,l) = w_d(i,j,k,l) \cdot w_r(i,j,k,l)$, with $w_d(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2}\right)$ and $w_r(i,j,k,l) = \exp\left(-\frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$    (14)

where q(i,j) is the coordinate of another coefficient of the template window; p(k,l) is the center coordinate of the template window; $\sigma_d$ and $\sigma_r$ are the standard deviations of the Gaussian functions; f(i,j) denotes the pixel value of the image at point q(i,j); and f(k,l) denotes the pixel value of the image at point p(k,l).
The automatic red eye repair method based on the MTCNN is further described in detail through specific examples.
Examples
The automatic red eye repair method based on MTCNN of the invention comprises the following parts:
(1) MTCNN-based face detection
An input red eye image is first fed into the MTCNN network, which detects the human face and returns the face position and the key point coordinates of the face. The specific steps are as follows:
step1: for a given input Image, an Image pyramid (image_pychlamid) is first created according to a set size (minsize), and the Image is subjected to a multi-level scaling (scale) operation, so as to obtain a set of input images with different sizes. Scale=0.7 and minsize=12, as chosen herein.
Step2: the set of images of different sizes from the image pyramid in Step1 are input into a full convolutional neural network (P-Net), as shown in fig. 2. The input layer size of the P-Net network is 12 x 3, the first convolution layer size is 3 x 10, and the maximum pooling layer size is 2 x 2, so as to generate 10 5*5 feature maps; the second convolution layer has a size of 3 x 16, and generates 16 3*3 feature maps; the third convolution layer has a size of 3 x 32, generating 32 signature graphs of 1*1. Finally, for 32 feature maps 1*1, firstly, generating 2 feature maps 1*1 for face classification through 2 convolution kernels of 1 x 32; secondly, generating 4 1*1 feature maps for judging a regression frame through 4 convolution kernels of 1 x 32; finally, 10 characteristic graphs 1*1 are generated through 10 convolution kernels of 1 x 32 and are used for judging the face contour points. The image is analyzed and processed by P-Net to generate face candidate frames and frame regression vectors, the layer network is firstly calibrated according to a set threshold (threshold), the face candidate frames which do not reach standards are removed, and Non-maximum suppression (Non-Maximum Suppression, NMS) is used for removing the face candidate frames which are highly overlapped.
Step3: inputting the candidate frames generated in Step2 into R-Net for further training, continuously removing the non-standard face candidate frames through the set threshold value, and removing the highly overlapped face candidate frames by NMS. As shown in fig. 3, the R-Net network has an R-Net input layer size of 24×24×3, a first convolution layer size of 3×3×28, and a maximum pooling layer size of 3*3, so as to generate 28 feature maps of 11×11. The second convolution layer size was 3 x 48, and the largest pooling layer size was 3*3, yielding 48 4*4 feature maps. The third convolution layer size is 2 x 64, generating 64 3*3 feature maps. The 64 3*3 feature maps are input to a 128-dimensional fully connected layer. Unlike step2, finally, face classification is performed using a full-connection layer with dimension 2, bounding box regression is performed using a full-connection layer with dimension 4, and face key point positioning is performed using a full-connection layer with dimension 10.
Step4: the several candidate boxes generated in Step3 are input into the O-Net network as shown in fig. 4. The size of the O-Net input layer is 48 x 3, the size of the first convolution layer is 3 x 32, and 32 feature maps of 23 x 23 are generated by adopting a maximum pooling layer of 3*3 size; the second convolution layer is 3 x 63, and the maximum pooling layer of 3*3 is adopted to generate 64 10 x 10 feature maps; the third convolution layer is 3 x 64, and the largest pooling layer with the size of 2 x 2 is adopted to generate 64 4*4 characteristic diagrams; the fourth convolution layer size is 2 x 128, and 128 feature maps of 3*3 size are generated; finally, 128 3*3-sized feature maps are connected to a 256-dimensional full connection layer. And finally, respectively carrying out face classification, bounding box regression and face key point positioning by using full-connection layers with dimensions of 2, 4 and 10. The O-Net is similar to the former two steps in removing the face candidate frame, and the face candidate frame is different from the two networks in further precisely positioning the face position, and finally outputting 5 characteristic points (pupil of eyes, nose tip, left and right mouth corners) of the face.
The thresholds selected for the three networks are 0.6, 0.7 and 0.7, respectively. The convolution layers use a sliding stride of 1 without zero padding, and the pooling layers use a sliding stride of 2 with zero padding. The activation function is PReLU, whose expression is

$\mathrm{PReLU}(x) = \begin{cases} x, & x > 0 \\ a x, & x \le 0 \end{cases}$

where a is a learnable slope for negative inputs.
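As a quick illustration, PReLU can be written in a couple of lines of numpy; the fixed default slope `a = 0.25` below stands in for the learned per-channel parameter:

```python
import numpy as np

def prelu(x, a=0.25):
    """Parametric ReLU: identity for positive inputs, slope `a` otherwise."""
    return np.where(x > 0, x, a * x)
```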
for sample x i The judgment cross entropy loss function of the face is as follows:
wherein the method comprises the steps ofA true class label representing a face, 0 represents a non-face,1 represents a human face, p i Represents x i Probability of being a human face.
The face candidate box regression adopts a Euclidean distance loss:

$L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2$

where $y_i^{box}$ denotes the real coordinates of the face candidate box and $\hat{y}_i^{box}$ denotes the candidate box predicted by the network; the box coordinates comprise the abscissa and ordinate of the upper-left corner and the height and width of the candidate box.
The face feature point localization adopts a Euclidean distance loss:

$L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2$

where $y_i^{landmark}$ denotes the real coordinates of the 5 feature points of the face and $\hat{y}_i^{landmark}$ denotes the coordinates predicted by the network; they comprise the horizontal and vertical coordinates of the pupils of the two eyes, of the nose tip, and of the left and right mouth corners.
The final objective function of the MTCNN network is:

$\min \sum_{i=1}^{N} \sum_{j \in \{det,\, box,\, landmark\}} \alpha_j \, \beta_i^{j} \, L_i^{j}$

where N denotes the total number of samples, $\alpha_j$ denotes the weight of face classification, candidate box regression and feature point localization in the current stage network, and $\beta_i^{j} \in \{0, 1\}$ denotes the sample type indicator. In the P-Net and R-Net networks, the α values for face, box and point are 1, 0.5 and 0.5, respectively, while in the O-Net network the α values for face, box and point are 1, 0.5 and 1, respectively.
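The following numpy sketch assembles the three per-sample losses into the weighted objective described above; the array shapes and dict-based interface are assumptions for illustration, not the patent's implementation:

```python
import numpy as np

def det_loss(p, y_det):
    """Cross-entropy loss for face/non-face classification."""
    return -(y_det * np.log(p) + (1 - y_det) * np.log(1 - p))

def l2_loss(pred, target):
    """Squared Euclidean loss for box regression or landmark localization."""
    return np.sum((pred - target) ** 2, axis=-1)

def mtcnn_objective(losses, alphas, betas):
    """losses/alphas/betas are dicts keyed by 'det', 'box', 'landmark';
    losses[k] is an (N,) array and betas[k] an (N,) 0/1 sample-type
    indicator, so each sample contributes only the losses that apply."""
    return sum(alphas[k] * np.sum(betas[k] * losses[k])
               for k in ('det', 'box', 'landmark'))
```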
(2) Human eye positioning
Using the pupil coordinates of the two eyes obtained by the face detection in the previous step, the pupil distance is calculated and then expanded by a certain proportion; with suitable parameter adjustment a good eye region (namely the ROI for red eye repair) is obtained, which reduces the amount of computation and improves robustness. The specific steps are as follows:
step1: the binocular coordinates returned by face detection calculate the distance of the pupils of the eyes using the following formula (6):
wherein D is lr Is the distance between the pupils of the left eye and the right eye of the human face,and->Is the abscissa of the left eye, +.>Andthe abscissa of the right eye.
Step2: the pupil distance of the eyes of the face calculated in Step1 is adjusted according to a certain proportion, and the ROIs of the left eye and the right eye of the face are marked by rectangular frames respectively, and the calculation formula is as follows:
wherein LEL x,y For the upper left corner of the left-eye rectangular frame, LER x,y REL is the right lower corner coordinate of the left-eye rectangular frame x,y Right eye rectangular frame upper left corner coordinates, RER x,y The lower right corner coordinates of the right-eye rectangular box imw and imh represent the width and height of the face image, respectively.
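A sketch of this human eye positioning step; since the patent's exact expansion ratios are not reproduced here, the half-width of 0.4 · $D_{lr}$ used below is an assumed placeholder, clamped to the image bounds imw × imh:

```python
import math

def pupil_distance(left, right):
    """Euclidean pupil distance of formula (6); points are (x, y)."""
    return math.hypot(right[0] - left[0], right[1] - left[1])

def eye_roi(pupil, d_lr, imw, imh, ratio=0.4):  # `ratio` is an assumption
    """Rectangular eye ROI around one pupil, expanded in proportion to
    the pupil distance and clipped to the image bounds."""
    half = ratio * d_lr
    x0, y0 = max(int(pupil[0] - half), 0), max(int(pupil[1] - half), 0)
    x1 = min(int(pupil[0] + half), imw - 1)
    y1 = min(int(pupil[1] + half), imh - 1)
    return (x0, y0), (x1, y1)  # upper-left and lower-right corners
```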
(3) Red eye repair
The red eye repair method of the invention comprises 3 steps: red eye masking, pupil mask cleaning and red eye repair. The specific steps are as follows:
step1: firstly, dividing a human eye ROI marked by a rectangular frame into R, G, B channels (namely red, green and blue channels); secondly, creating a red eye detector, namely creating a mask with a red channel pixel value larger than 50 and larger than the sum of the blue channel pixel value and the green channel pixel value, wherein the purpose is to use the mask as shielding, and only process the red pupil area; finally, the extracted red pupil area is set to be white, and other areas are set to be black. The calculation formula is as follows:
where mask represents a mask, N represents an image size, r i 、b i And g i The pixel values of pixel i in the red, blue and green channels are represented, respectively. This step may initially locate the red eye region, but noise interference points may exist around or within the red eye region, so further accurate location and denoising are required.
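This masking rule translates directly into a few lines of OpenCV/numpy; a minimal sketch (the int casts avoid uint8 overflow when summing the blue and green channels):

```python
import cv2
import numpy as np

def redeye_mask(roi_bgr):
    """White (255) where red > 50 and red > blue + green, black elsewhere."""
    b, g, r = cv2.split(roi_bgr)
    cond = (r > 50) & (r.astype(int) > b.astype(int) + g.astype(int))
    return cond.astype(np.uint8) * 255
```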
Step2: it is known from Step1 that the red eye region is set to white and the other regions are set to black, so that the red eye region is positioned for further accuracy. Firstly, performing contour detection on the created mask, extracting white areas which are possibly red eyes in the mask, then calculating the area formed by the contour of each white area, and storing the contour area with the largest area and pixel points, so that the red eyes area can be accurately positioned. Since there may be interference of noise points inside and outside the precisely located red eye region, denoising processing is required. And a cross structure with the size of 5*5 is adopted to perform closed operation on the red eye region, so that noise points in the red eye region are removed, and meanwhile, the pupil region is more round.
Step3: through the above steps, each eye has a mask containing red portions, since red eyes fill the red channel in the image, saturate it, and red eyes break the texture only in the red channel, and still perform well in the green and blue channels, a reasonable texture should be found to repair it. The average channel is first created by averaging the green and blue channels, the formula:
All pixel values of the three channels inside the red eye region are then replaced by the average channel pixel values, and finally the R, G and B channels are merged. After this operation the boundary of the repaired eye region differs noticeably from the surrounding pixels, so to make the repaired eye look more natural, the repaired region is smoothed and denoised with bilateral filtering, whose calculation formula is:

$g(k,l) = \dfrac{\sum_{i,j} f(i,j)\, w(i,j,k,l)}{\sum_{i,j} w(i,j,k,l)}$    (13)
where the weight $w(i,j,k,l)$ is the product of the spatial-domain kernel $w_d(i,j,k,l)$ and the range kernel $w_r(i,j,k,l)$, as follows:

$w(i,j,k,l) = w_d(i,j,k,l) \cdot w_r(i,j,k,l)$, with $w_d(i,j,k,l) = \exp\left(-\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2}\right)$ and $w_r(i,j,k,l) = \exp\left(-\frac{\|f(i,j) - f(k,l)\|^2}{2\sigma_r^2}\right)$    (14)

where q(i,j) is the coordinate of another coefficient of the template window; p(k,l) is the center coordinate of the template window; $\sigma_d$ and $\sigma_r$ are the standard deviations of the Gaussian functions; f(i,j) denotes the pixel value of the image at point q(i,j); and f(k,l) denotes the pixel value of the image at point p(k,l).
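A sketch of this repair step under the description above; the bilateral filter parameters (diameter 9, sigmaColor = sigmaSpace = 75) are assumed values, as the patent does not state them:

```python
import cv2
import numpy as np

def repair_redeye(roi_bgr, mask):
    """Replace red-eye pixels with the blue/green average, then smooth."""
    b, g, r = cv2.split(roi_bgr)
    mean = ((b.astype(np.float32) + g.astype(np.float32)) / 2).astype(np.uint8)
    sel = mask > 0
    for ch in (b, g, r):
        ch[sel] = mean[sel]  # neutral texture recovered from G and B
    repaired = cv2.merge([b, g, r])
    return cv2.bilateralFilter(repaired, 9, 75, 75)  # smooth the boundary
```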
Finally, the processed image is copied back to the eye region of the original image, and the repaired face image is output and saved.
The automatic red eye repair method based on MTCNN benefits from the high face detection speed of MTCNN, its good robustness under unconstrained conditions, and its ability to obtain the coordinates of the eye key points by regression; the method is fully automatic, has a low false detection rate, and repairs red eye quickly.
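Putting the pieces together, the sketch below runs the whole pipeline end to end. The facenet-pytorch MTCNN is used here only as a stand-in for the trained detector described above, and the helper functions sketched earlier (pupil_distance, eye_roi, redeye_mask, clean_mask, repair_redeye) are assumed to be in scope:

```python
import cv2
from facenet_pytorch import MTCNN  # stand-in MTCNN implementation

def remove_redeye(path_in, path_out):
    img = cv2.imread(path_in)
    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # detect() with landmarks=True returns boxes, scores and 5 points
    # per face (left eye, right eye, nose, mouth corners)
    boxes, probs, points = MTCNN(keep_all=True).detect(rgb, landmarks=True)
    if points is None:
        return
    for lm in points:
        d = pupil_distance(lm[0], lm[1])  # lm[0]/lm[1] are the two pupils
        for pupil in (lm[0], lm[1]):
            (x0, y0), (x1, y1) = eye_roi(pupil, d, img.shape[1], img.shape[0])
            roi = img[y0:y1, x0:x1]
            mask = clean_mask(redeye_mask(roi))
            img[y0:y1, x0:x1] = repair_redeye(roi, mask)  # copy back
    cv2.imwrite(path_out, img)
```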

Claims (4)

1. The automatic red eye repair method based on MTCNN is characterized by being implemented according to the following steps:
Step 1, inputting a red eye image into an MTCNN network, wherein the MTCNN network detects the human face and returns the face position and the horizontal and vertical coordinates of the pupils of the two eyes, the nose tip, and the left and right mouth corners of the face;
Step 2, calculating the pupil distance of the two eyes according to the pupil coordinates of the face obtained in Step 1, then expanding it by a set proportion and adjusting the parameters to obtain the ROI;
Step 3, performing red eye masking, pupil mask cleaning and red eye repair on the ROI obtained in Step 2, and finally copying the processed image back to the eye region of the original image to obtain the repaired face image;
the step3 is specifically implemented according to the following steps:
step 3.1, dividing the ROI into three channels of red, green and blue, then creating a mask, and only processing the red pupil area; finally, the extracted red pupil area is set to be white, and other areas are set to be black;
where mask represents a mask, N represents an image size, r i 、b i And g i Respectively representing pixel values of a pixel point i in a red channel, a blue channel and a green channel;
step 3.2, performing contour detection on the created mask, extracting white areas which are possibly red eyes in the mask, calculating the area formed by the contour of each white area, storing the contour area with the largest area and pixel points, accurately positioning the red eyes, performing closed operation on the red eyes, and removing noise points in the red eyes;
wherein mean represents the average value of pixel values of pixel points i in the blue channel and the blue channel;
step 3.3, creating an average channel through the average green channel and the blue channel, replacing all pixel values of the red, green and blue channels in the red eye region with the pixel values of the average channel, merging the red, green and blue channels, smoothing and denoising a repaired region by adopting bilateral filtering, and finally obtaining a repaired face image;
in the step 3.3, the smoothing denoising treatment of the repair area by the bilateral filtering is specifically performed according to the following formula (13):
wherein w (i, j, k, l) is defined by a spatial domain kernel w d (i, j, k, l) and value range kernel w r (i, j, k, l) by multiplying, specifically, the following formula (14):
w(i,j,k,l)=w d (i,j,k,l)*w r (i,j,k,l)
where q (i, j) is the coordinates of the other coefficients of the template window; p (k, l) is the center coordinate point of the template window; sigma (sigma) d Sum sigma r Standard deviation as gaussian function; f (i, j) represents the pixel value of the image at point q (i, j); f (k, l) represents the pixel value of the image at point p (k, l).
2. The automatic red eye repair method based on MTCNN according to claim 1, characterized in that Step 1 is specifically implemented according to the following steps:
Step 1.1, creating an image pyramid according to the set size of the input red eye image, and scaling the red eye image over multiple levels to obtain a group of input images of different sizes;
Step 1.2, inputting the group of images of different sizes into P-Net, where they pass in turn through convolution and pooling layers of different sizes to generate feature maps from which face contour points are judged; after the images are analyzed and processed by P-Net, face candidate boxes and box regression vectors are generated, and a number of face candidate boxes are obtained after recalibration;
Step 1.3, inputting the face candidate boxes obtained in Step 1.2 into R-Net for further refinement; candidate boxes that do not reach the set threshold are removed, and highly overlapping candidate boxes are removed by non-maximum suppression, yielding a refined set of face candidate boxes;
Step 1.4, inputting the refined face candidate boxes obtained in Step 1.3 into the O-Net network, which further localizes the face position precisely and finally outputs the face position and the feature points, namely the horizontal and vertical coordinates of the pupils of the two eyes, the nose tip, and the left and right mouth corners.
3. The automatic red eye repair method based on MTCNN according to claim 1, characterized in that in Step 2 the pupil distance of the two eyes is calculated specifically as follows:
from the binocular pupil coordinates returned by face detection, the distance between the pupils is calculated with formula (6):

$D_{lr} = \sqrt{(x_r - x_l)^2 + (y_r - y_l)^2}$    (6)

where $D_{lr}$ is the distance between the pupils of the left and right eyes of the face, $(x_l, y_l)$ are the horizontal and vertical coordinates of the left pupil, and $(x_r, y_r)$ are the horizontal and vertical coordinates of the right pupil.
4. The automatic red eye repair method based on MTCNN according to claim 3, characterized in that in Step 2 the proportional expansion is specifically implemented as follows:
using the pupil distance of the two eyes and a set proportion, the ROIs of the left and right eyes of the face are marked with rectangular boxes whose corner coordinates are computed from the pupil coordinates and the pupil distance,
where $LEL_{x,y}$ are the upper-left corner coordinates of the left-eye rectangular box, $LER_{x,y}$ the lower-right corner coordinates of the left-eye rectangular box, $REL_{x,y}$ the upper-left corner coordinates of the right-eye rectangular box, and $RER_{x,y}$ the lower-right corner coordinates of the right-eye rectangular box; imw and imh denote the width and height of the face image, respectively.
CN202010413910.8A 2020-05-15 2020-05-15 Automatic red eye repairing method based on MTCNN Active CN111738934B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010413910.8A CN111738934B (en) 2020-05-15 2020-05-15 Automatic red eye repairing method based on MTCNN


Publications (2)

Publication Number Publication Date
CN111738934A CN111738934A (en) 2020-10-02
CN111738934B true CN111738934B (en) 2024-04-02

Family

ID=72647320

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010413910.8A Active CN111738934B (en) 2020-05-15 2020-05-15 Automatic red eye repairing method based on MTCNN

Country Status (1)

Country Link
CN (1) CN111738934B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113989884B (en) * 2021-10-21 2024-05-14 武汉博视电子有限公司 Facial skin image based ultraviolet deep and shallow color spot identification method


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7155058B2 (en) * 2002-04-24 2006-12-26 Hewlett-Packard Development Company, L.P. System and method for automatically detecting and correcting red eye
US7567707B2 (en) * 2005-12-20 2009-07-28 Xerox Corporation Red eye detection and correction
US8811683B2 (en) * 2011-06-02 2014-08-19 Apple Inc. Automatic red-eye repair using multiple recognition channels
US11074495B2 (en) * 2013-02-28 2021-07-27 Z Advanced Computing, Inc. (Zac) System and method for extremely efficient image and pattern recognition and artificial intelligence platform

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1750017A (en) * 2005-09-29 2006-03-22 上海交通大学 Red eye moving method based on human face detection
EP3531377A1 (en) * 2018-02-23 2019-08-28 Samsung Electronics Co., Ltd. Electronic device for generating an image including a 3d avatar reflecting face motion through a 3d avatar corresponding to a face
DE102019114666A1 (en) * 2018-06-01 2019-12-05 Apple Inc. RED-EYE CORRECTION TECHNIQUES
CN109389562A (en) * 2018-09-29 2019-02-26 深圳市商汤科技有限公司 Image repair method and device
CN109409303A (en) * 2018-10-31 2019-03-01 南京信息工程大学 A kind of cascade multitask Face datection and method for registering based on depth
CN110175504A (en) * 2019-04-08 2019-08-27 杭州电子科技大学 A kind of target detection and alignment schemes based on multitask concatenated convolutional network
CN110619319A (en) * 2019-09-27 2019-12-27 北京紫睛科技有限公司 Improved MTCNN model-based face detection method and system
CN110969109A (en) * 2019-11-26 2020-04-07 华中科技大学 Blink detection model under non-limited condition and construction method and application thereof

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
MTCNN: explanations of some common issues; 薛定谔的炼丹炉!; CSDN Blog; full text *
Application of an artificial intelligence multi-task deep learning model of the optic disc region in glaucoma classification; 张悦; 余双; 马锴; 初春燕; 张莉; 庞睿奇; 王宁利; 刘含若; Chinese Journal of Ophthalmologic Medicine (Electronic Edition); 2020-04-28 (02); full text *
Research on the Levy-DNA-ACO algorithm for edge detection in medical images; 张经宇 et al.; Computer Engineering and Applications; 2018-12-15 (24); full text *
Face recognition based on MTCNN and Facenet; 刘长伟; Designing Techniques of Posts and Telecommunications; (02); full text *
Face detection and facial key point localization based on an improved MTCNN model; 陈雨薇; China Masters' Theses Full-text Database (Electronic Journal); (01); full text *

Also Published As

Publication number Publication date
CN111738934A (en) 2020-10-02


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant