WO2022206202A1 - Image beautification processing method and apparatus, storage medium, and electronic device - Google Patents


Info

Publication number: WO2022206202A1
Authority: WIPO (PCT)
Application number: PCT/CN2022/076470
Other languages: English (en), French (fr)
Inventor: 朱家成
Original Assignee: Oppo广东移动通信有限公司

Classifications

    • G (Physics); G06 (Computing; Calculating or Counting); G06T (Image Data Processing or Generation, in General); G06N (Computing Arrangements Based on Specific Computational Models)
    • G06T 5/77: Image enhancement or restoration; Retouching; Inpainting; Scratch removal
    • G06N 3/04: Neural networks; Architecture, e.g. interconnection topology
    • G06N 3/08: Neural networks; Learning methods
    • G06T 3/04: Geometric image transformations in the plane of the image; Context-preserving transformations, e.g. by using an importance map
    • G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 2207/20081: Special algorithmic details; Training; Learning
    • G06T 2207/20084: Special algorithmic details; Artificial neural networks [ANN]
    • G06T 2207/20221: Image combination; Image fusion; Image merging

Definitions

  • The present disclosure relates to the technical field of image processing, and in particular to an image beautification processing method, an image beautification processing apparatus, a computer-readable storage medium, and an electronic device.
  • Beautification refers to the use of image processing technology to beautify the portraits in images or videos so as to better meet users' aesthetic needs.
  • In the related art, image beautification processing usually consists of multiple fixed algorithm stages, such as image feature calculation based on manual design, spatial filtering, and layer fusion.
  • The present disclosure provides an image beautification processing method, an image beautification processing apparatus, a computer-readable storage medium, and an electronic device.
  • According to a first aspect of the present disclosure, an image beautification processing method is provided, comprising: extracting one or more original face sub-images from an image to be processed; combining the one or more original face sub-images based on an input image size of a deep neural network to generate an original face combined image; processing the original face combined image by using the deep neural network and outputting a beautified face combined image; and obtaining a target beautified image corresponding to the image to be processed according to the beautified face combined image and the image to be processed.
  • According to a second aspect, an image beautification processing apparatus is provided, including a processor and a memory, the processor being configured to execute the following program modules stored in the memory: a face extraction module, configured to extract one or more original face sub-images from the image to be processed; an image combination module, configured to combine the one or more original face sub-images based on the input image size of a deep neural network to generate an original face combined image; a beautification processing module, configured to process the original face combined image by using the deep neural network and output a beautified face combined image; and an image fusion module, configured to obtain a target beautified image corresponding to the image to be processed according to the beautified face combined image and the image to be processed.
  • According to a third aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, it implements the image beautification processing method of the first aspect and possible implementations thereof.
  • According to a fourth aspect, an electronic device is provided, comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute the executable instructions to perform the image beautification processing method of the first aspect and possible implementations thereof.
  • FIG. 1 shows a schematic structural diagram of an electronic device in this exemplary embodiment;
  • FIG. 2 shows a flowchart of an image beautification processing method in this exemplary embodiment;
  • FIG. 3 shows a flowchart of combining original face sub-images in this exemplary embodiment;
  • FIG. 4 shows a schematic diagram of combining original face sub-images in this exemplary embodiment;
  • FIG. 5 shows a schematic structural diagram of a deep neural network in this exemplary embodiment;
  • FIG. 6 shows a flowchart of processing an image using a deep neural network in this exemplary embodiment;
  • FIG. 7 shows a flowchart of obtaining a target beautified image in this exemplary embodiment;
  • FIG. 8 shows a flowchart of adjusting pixel values of a high-frequency image in this exemplary embodiment;
  • FIG. 9 shows a schematic diagram of boundary area gradient processing in this exemplary embodiment;
  • FIG. 10 shows a schematic flowchart of an image beautification processing method in this exemplary embodiment;
  • FIG. 11 shows a schematic structural diagram of an image beautification processing apparatus in this exemplary embodiment;
  • FIG. 12 shows a schematic structural diagram of another image beautification processing apparatus in this exemplary embodiment.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments can be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
  • the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
  • numerous specific details are provided in order to give a thorough understanding of the embodiments of the present disclosure.
  • those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced without one or more of the specific details, or other methods, components, devices, steps, etc. may be employed.
  • well-known solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
  • Portrait blemish removal is a part of image beautification and is usually the first stage of beautification processing.
  • Portrait blemish removal includes, but is not limited to, removing freckles and acne, removing eye bags, treating dirty mouth corners, smoothing light and shadow, and treating dry lip wrinkles. After blemishes are removed from a portrait, processing such as skin resurfacing, skin tone adjustment, facial feature deformation, and brightness adjustment can follow.
  • In the related art, the effect of removing portrait blemishes relies on manually designed image feature calculation.
  • However, manually designed image feature calculation can hardly cover all situations in practical applications, and it is usually difficult to accurately and fully detect blemishes on the skin, resulting in incomplete removal of portrait blemishes.
  • The related art also has the problem that the portrait skin looks unreal after blemishes are removed; for example, after a mole on a face is removed, the treated area contrasts with the surrounding skin, resulting in an unnatural appearance.
  • In view of this, the exemplary embodiments of the present disclosure first provide an image beautification processing method. Its application scenarios include but are not limited to: a terminal device is installed with an image beautification App (Application), with which the user selects an image in a local album, or the currently captured image, for beautification processing; the terminal device executes the image beautification processing method of this exemplary embodiment, or the terminal device sends the image to a server and the server executes the method, so that beautification processing is performed on the image.
  • Beautification processing may also be performed on a video selected by the user or a currently shot video, specifically on the frames containing portraits, for example on a real-time video stream in a live broadcast scenario.
  • Exemplary embodiments of the present disclosure also provide an electronic device for executing the above image beautification processing method.
  • the electronic device may be the above-mentioned terminal device or server, including but not limited to a smart phone, a tablet computer, a wearable device, a computer, and the like.
  • an electronic device includes a processor and a memory.
  • The memory is used to store executable instructions of the processor, and may also store application data, such as image data and game data; the processor is configured to perform the image beautification processing method in this exemplary embodiment by executing the executable instructions.
  • the following takes the mobile terminal 100 in FIG. 1 as an example to illustrate the structure of the above electronic device. It will be understood by those skilled in the art that the configuration in Figure 1 can also be applied to stationary type devices, in addition to components specifically for mobile purposes.
  • The mobile terminal 100 may specifically include: a processor 110, an internal memory 121, an external memory interface 122, a USB (Universal Serial Bus) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 171, a receiver 172, a microphone 173, a headphone jack 174, a sensor module 180, a display screen 190, a camera module 191, an indicator 192, a motor 193, a button 194, a SIM (Subscriber Identification Module) card interface 195, and the like.
  • the processor 110 may include one or more processing units, for example, the processor 110 may include an AP (Application Processor, application processor), a modem processor, a GPU (Graphics Processing Unit, graphics processor), an ISP (Image Signal Processor, image signal processor), controller, encoder, decoder, DSP (Digital Signal Processor, digital signal processor), baseband processor and/or NPU (Neural-Network Processing Unit, neural network processor), etc.
  • The encoder can encode (i.e., compress) image or video data, for example encoding the image obtained after beautification processing to form corresponding code stream data, so as to reduce the bandwidth occupied by data transmission; the decoder can decode (i.e., decompress) the code stream data to restore the image or video data, for example decoding a video to be beautified to obtain the image data of each frame in the video and extracting one or more frames for beautification processing.
  • the mobile terminal 100 may support one or more encoders and decoders.
  • The mobile terminal 100 can process images or videos in various encoding formats, such as image formats like JPEG (Joint Photographic Experts Group), PNG (Portable Network Graphics), and BMP (Bitmap), and video formats like MPEG (Moving Picture Experts Group) 1, MPEG2, H.263, H.264, and HEVC (High Efficiency Video Coding).
  • the processor 110 may include one or more interfaces through which connections are formed with other components of the mobile terminal 100 .
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the internal memory 121 may include volatile memory and nonvolatile memory.
  • the processor 110 executes various functional applications and data processing of the mobile terminal 100 by executing the instructions stored in the internal memory 121 .
  • the external memory interface 122 can be used to connect an external memory, such as a Micro SD card, so as to expand the storage capacity of the mobile terminal 100.
  • the external memory communicates with the processor 110 through the external memory interface 122 to implement data storage functions, such as storing images, videos and other files.
  • the USB interface 130 is an interface conforming to the USB standard specification, and can be used to connect a charger to charge the mobile terminal 100, and can also be connected to an earphone or other electronic devices.
  • the charging management module 140 is used to receive charging input from the charger. While charging the battery 142, the charging management module 140 can also supply power to the device through the power management module 141; the power management module 141 can also monitor the state of the battery.
  • the wireless communication function of the mobile terminal 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modulation and demodulation processor, the baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • the mobile communication module 150 may provide wireless communication solutions including 2G/3G/4G/5G etc. applied on the mobile terminal 100 .
  • The wireless communication module 160 can provide wireless communication solutions applied on the mobile terminal 100, including WLAN (Wireless Local Area Network) (such as Wi-Fi (Wireless Fidelity) networks), BT (Bluetooth), GNSS (Global Navigation Satellite System), FM (Frequency Modulation), NFC (Near Field Communication), and IR (Infrared).
  • the mobile terminal 100 may implement a display function through the GPU, the display screen 190 and the AP, and display a user interface.
  • The mobile terminal 100 may display an interface of an image beautification App on the display screen 190, where the user may select the image to be processed and perform beautification-related settings.
  • the mobile terminal 100 can realize the shooting function through the ISP, the camera module 191, the encoder, the decoder, the GPU, the display screen 190, the AP, and the like.
  • The user can activate the photographing function in the image beautification App, triggering the camera module 191 to take a photograph, and perform beautification processing on the photographed image.
  • the mobile terminal 100 may implement audio functions through an audio module 170, a speaker 171, a receiver 172, a microphone 173, an earphone interface 174, an AP, and the like.
  • the sensor module 180 may include a depth sensor 1801, a pressure sensor 1802, a gyro sensor 1803, an air pressure sensor 1804, etc., to realize different sensing detection functions.
  • The indicator 192 can be an indicator light, which can be used to indicate the charging state and changes in battery level, and can also be used to indicate messages, missed calls, notifications, and the like.
  • the motor 193 can generate vibration prompts, and can also be used for touch vibration feedback and the like.
  • the keys 194 include a power-on key, a volume key, and the like.
  • the mobile terminal 100 may support one or more SIM card interfaces 195 for connecting the SIM cards to realize functions such as calling and data communication.
  • FIG. 2 shows an exemplary flow of the image beautifying processing method, which may include:
  • Step S210: extract one or more original face sub-images from the image to be processed;
  • Step S220: combine the above one or more original face sub-images based on the input image size of a deep neural network to generate an original face combined image;
  • Step S230: process the original face combined image using the deep neural network, and output a beautified face combined image;
  • Step S240: obtain a target beautified image corresponding to the image to be processed according to the beautified face combined image and the image to be processed.
  • the image to be processed may be an image selected by the user, such as an image selected by the user in an album, or an image automatically designated by the system, such as an image currently captured.
  • The target beautified image is the image obtained after the image to be processed is processed through the above steps; it may be an intermediate image in the entire beautification flow, or the final output beautified image.
  • A deep neural network is a neural network with a large number of layers; by increasing the number of network layers (i.e., the network depth) so as to reduce the total amount of parameters, it can learn deep features of the image and achieve pixel-level processing.
  • In this exemplary embodiment, the deep neural network can be trained to perform any one or a combination of beautification processes on an image. For example, a large number of un-beautified images are obtained as sample images to be beautified; sample blemish-removed images are obtained by manually removing blemishes from the sample images to be beautified; a dataset is constructed from the sample images to be beautified and the sample blemish-removed images to train a deep neural network, and the resulting deep neural network can be used for blemish-removal processing.
  • For another example, if sample beautified images are obtained by manually removing blemishes from, and resurfacing the skin of, the sample images to be beautified, and a dataset is constructed to train a deep neural network, the resulting network can be used for simultaneous blemish removal and skin resurfacing. Therefore, according to actual application requirements, sample images that have undergone a specific beautification treatment can be obtained and a dataset built, so as to train a deep neural network capable of implementing that specific beautification function.
  • In this way, the deep neural network can integrate a variety of different beautification functions; compared with setting up multiple algorithm modules, this makes the scheme easier to implement and more efficient.
  • The image beautification processing method in FIG. 2 may serve as one stage of beautification, and other beautification stages may be added before or after it.
  • In an optional implementation, the deep neural network described above is used to remove blemishes from images. After the image to be processed is handled by the image beautification processing method shown in FIG. 2, the obtained target beautified image is a blemish-removed beautified image.
  • Blemish removal is generally a necessary part of image beautification, and users' demand for it is relatively fixed; the generalized blemish-removal beautification process can therefore be realized through the image beautification processing method shown in FIG. 2.
  • Treatments such as skin resurfacing, deformation, three-dimensional enhancement, skin color adjustment, and light and shadow adjustment are not essential, and users' specific needs for them are personalized.
  • These treatments can be called personalized beautification treatments.
  • For these treatments, the user needs to make specific settings before processing; for example, the user selects one or more beautification functions and sets parameters such as skin resurfacing strength and deformation degree, and the terminal device then performs processing according to the user's settings.
  • The present disclosure does not limit the order of the image beautification processing of FIG. 2 relative to other beautification processing.
  • For example, the original image can first undergo personalized beautification to obtain an intermediate beautified image; the intermediate beautified image is then used as the image to be processed for the image beautification processing shown in FIG. 2, and the obtained target beautified image is the final output beautified image.
  • Based on the above, processing by a deep neural network realizes blemish removal or other beautification functions in place of the multiple fixed algorithm stages of the related art; this increases the flexibility of image beautification processing, adapts to various lighting or skin conditions, improves the beautification effect, and reduces processing time and memory usage.
  • Moreover, when the image to be processed includes multiple faces, the multiple faces can be beautified in a single pass after being combined, without performing beautification multiple times, which improves processing efficiency.
  • In step S210, one or more original face sub-images are extracted from the image to be processed.
  • the original face sub-image is a sub-image obtained by cutting out the face part in the image to be processed.
  • This exemplary embodiment mainly performs beautification on the faces in the image to be processed, and the number of faces is not limited; for example, when the image to be processed includes multiple faces, multiple original face sub-images can be extracted, and through the subsequent steps, the multiple faces can be beautified at the same time.
  • In an optional implementation, the above extraction of one or more original face sub-images from the image to be processed may include: identifying face key points in the image to be processed, and generating a face frame for each face according to its face key points; then retaining the face frames with an area greater than or equal to a face area threshold, and intercepting the image in each retained face frame to obtain one or more original face sub-images.
  • the key points of the face may include key parts of the face and points on the edge of the face.
  • the face frame may be a rectangular frame, and the face key points of each face are within the face frame.
  • the face frame may be the smallest rectangular frame including key points of the face.
  • All faces in the image to be processed can be detected by the face detection algorithm, which may include faces that do not require beautification (such as the faces of distant passers-by).
  • the face frame can be filtered through the face area threshold.
  • the face area threshold may be set according to experience or the size of the image to be processed.
  • For example, the face area threshold may be 0.05 times the size of the image to be processed; if the area of a face frame is greater than or equal to the face area threshold, it corresponds to a face that needs beautification and the face frame is retained; if the area is less than the threshold, it corresponds to a face that does not need beautification and the face frame is deleted.
  • the retained face frame is the face frame of the valid face.
  • The image in each retained face frame is intercepted, yielding as many original face sub-images as there are face frames.
  • In addition, an upper limit may be set on the number of original face sub-images, i.e., on the number of face frames, for example 4. If more than 4 face frames remain after filtering by the face area threshold, 4 of them can be selected, for example the 4 with the largest area, or the 4 closest to the center of the image to be processed, yielding 4 original face sub-images, and the faces in the other frames are not beautified. Alternatively, 4 face frames are selected and their original face sub-images intercepted and beautified in the current pass, and other face frames are selected and processed in subsequent passes, until the faces in all face frames whose area exceeds the face area threshold have been beautified.
  • In an optional implementation, before the image in the face frame is intercepted, the face frame can also be enlarged so that it includes a small area beyond the face, which facilitates the gradient processing during subsequent image fusion.
  • the face frame can be enlarged in one or more directions according to a preset ratio.
  • For example, if the preset ratio is 1.1, the face frame is evenly expanded on all sides so that the enlarged face frame is 1.1 times its original size. It should be noted that, when the face frame is enlarged, if one or more of its boundaries reach the boundary of the image to be processed, those boundaries stay at the boundary of the image to be processed.
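  • As an illustration, the filtering and enlargement described above might be sketched as follows in Python, assuming axis-aligned face frames given as (left, top, right, bottom) tuples and the example values of 0.05 for the area threshold, 1.1 for the enlargement ratio, and 4 for the upper limit; the function and parameter names are illustrative, not from the patent:

```python
def filter_and_expand_face_boxes(boxes, image_w, image_h,
                                 area_ratio=0.05, expand_ratio=1.1, max_faces=4):
    """Keep face frames whose area passes the threshold, then enlarge them.

    boxes: iterable of (left, top, right, bottom) face frames.
    Returns at most max_faces frames, largest first, clamped to the image.
    """
    area_threshold = image_w * image_h * area_ratio
    kept = [b for b in boxes
            if (b[2] - b[0]) * (b[3] - b[1]) >= area_threshold]
    # When more faces remain than the upper limit, keep the largest ones
    # (selecting those nearest the image center would also be valid).
    kept.sort(key=lambda b: (b[2] - b[0]) * (b[3] - b[1]), reverse=True)
    kept = kept[:max_faces]

    expanded = []
    for left, top, right, bottom in kept:
        cx, cy = (left + right) / 2.0, (top + bottom) / 2.0
        half_w = (right - left) * expand_ratio / 2.0
        half_h = (bottom - top) * expand_ratio / 2.0
        # A boundary that would leave the image stays at the image border.
        expanded.append((int(max(cx - half_w, 0)), int(max(cy - half_h, 0)),
                         int(min(cx + half_w, image_w)),
                         int(min(cy + half_h, image_h))))
    return expanded
```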
  • In step S220, the above one or more original face sub-images are combined based on the input image size of the deep neural network to generate an original face combined image.
  • the input image size is the image size that matches the input layer of the deep neural network.
  • This exemplary embodiment combines the original face sub-images into one original face combined image, and the size of the original face combined image is the size of the input image.
  • This exemplary embodiment does not limit the size and aspect ratio of the input image size. Exemplarily, the ratio of the long side to the short side of the input image size can be set to be close to
  • the deep neural network can be a fully convolutional network, which can process images of different sizes.
  • In theory, the deep neural network has no requirement on the input image size, but the size affects the amount of computation, the memory usage, and the beautification fineness.
  • the input image size can be determined according to the beauty fineness set by the user or the performance of the terminal device.
  • Thus the deep neural network can be deployed on devices of high, medium, or low performance and has a wide range of applications; there is no need to deploy different deep neural networks for different devices, which reduces network training costs.
  • the input image size may be determined as a small value, for example, width 640*height 448.
  • step S220 may specifically include:
  • Step S310: according to the number of original face sub-images, divide the input image size into one or more sub-image sizes corresponding to the one or more original face sub-images;
  • Step S320: transform each original face sub-image based on its corresponding sub-image size;
  • Step S330: combine the transformed original face sub-images to generate the original face combined image.
  • FIG. 4 shows exemplary manners of input image size division and image combination when the number of original face sub-images, denoted Q, is 1 to 4.
  • Assuming the input image size is width 640 * height 448: when Q is 1, the sub-image size is also width 640 * height 448; when Q is 2, each sub-image size is half of the input image size, i.e., width 320 * height 448; when Q is 3, the sub-image sizes are 0.5, 0.25, and 0.25 of the input image size, i.e., width 320 * height 448, width 320 * height 224, and width 320 * height 224; when Q is 4, each sub-image size is 0.25 of the input image size, i.e., width 320 * height 224.
  • Each original face sub-image corresponds to one sub-image size. It should be noted that when the sub-image sizes are not all equal, as when Q is 3, the original face sub-images and the sub-image sizes can be matched one-to-one in order of size, that is, the largest original face sub-image corresponds to the largest sub-image size and the smallest to the smallest. After the original face sub-images are transformed, the transformed sub-images are combined in the manner shown in FIG. 4 to generate the original face combined image.
  • In an optional implementation, when Q is an even number, the input image size can be divided into Q equal parts to obtain Q identical sub-image sizes; when Q is an odd number, the input image size is divided into Q+1 identical sub-image sizes, two of which are merged into one sub-image size while the remaining Q-1 stay unchanged, thereby obtaining Q sub-image sizes.
  • Alternatively, the size ratio (or area ratio) of the original face sub-images may be calculated first, such as S1 : S2 : S3 : ... : SQ, and the input image size is then divided into Q sub-image sizes according to this ratio.
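  • The division rules above can be sketched for the width 640 * height 448 example as follows; splitting widths before heights and merging two quarters into one half are assumptions of this sketch, chosen to match FIG. 4, and the upper limit of 4 faces is the example value from above:

```python
def divide_input_size(width, height, q):
    """Return q sub-image sizes (w, h) dividing the network input size.

    Even q: q equal parts. Odd q: q + 1 equal parts, two of which are
    merged into one larger sub-image size (here, two quarters into a half).
    """
    if q == 1:
        return [(width, height)]
    if q == 2:
        return [(width // 2, height)] * 2
    if q == 3:
        return [(width // 2, height)] + [(width // 2, height // 2)] * 2
    if q == 4:
        return [(width // 2, height // 2)] * 4
    raise ValueError("this sketch only covers the example upper limit of 4 faces")

# divide_input_size(640, 448, 3) -> [(320, 448), (320, 224), (320, 224)]
```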
  • After the sub-image sizes are determined, each original face sub-image can be transformed based on its corresponding sub-image size. In an optional implementation, the transformation may include any one or more of the following:
  • 1. If the size relationship between the width and height of the original face sub-image differs from that of the sub-image size, rotate the original face sub-image by 90 degrees.
  • Specifically, if both the original face sub-image and the sub-image size have width greater than height, or both have width less than height, their width-height size relationships are the same and the original face sub-image does not need to be rotated; otherwise, the size relationships differ, and the original face sub-image needs to be rotated by 90 degrees (either clockwise or counterclockwise).
  • For example, if the sub-image size is width 320 * height 448, i.e., width smaller than height, and the original face sub-image is wider than it is high, the original face sub-image is rotated by 90 degrees.
  • Alternatively, in order to maintain the angle of the face in the original face sub-image, the original face sub-image may be left unrotated.
  • 2. If the size of the original face sub-image is larger than the sub-image size, downsample the original face sub-image according to the sub-image size. Here, 'larger than the sub-image size' means that the width of the original face sub-image is larger than the width of the sub-image size, or its height is larger than the height of the sub-image size.
  • The image to be processed is generally a clear image captured by a terminal device and its size is relatively large, so it is common for the original face sub-image to be larger than the sub-image size and to require downsampling.
  • the down-sampling can be implemented by methods such as bilinear interpolation, nearest neighbor interpolation, etc., which is not limited in the present disclosure.
  • After downsampling, at least one of the width and height of the original face sub-image is aligned with the sub-image size, which includes the following situations:
  • the width and height of the original face sub-image are the same as the sub-image size
  • the width of the original face sub-image is the same as the width of the sub-image size, and the height is smaller than the height of the sub-image size;
  • the height of the original face sub-image is the same as the height of the sub-image size, and the width is smaller than the width of the sub-image size.
  • If the original face sub-image is not larger than the sub-image size, the downsampling step may be skipped.
  • 3. If the size of the original face sub-image is smaller than the sub-image size, fill the original face sub-image according to the difference between it and the sub-image size, so that the filled original face sub-image is equal to the sub-image size.
  • Here, 'smaller than the sub-image size' means that at least one of the width and height of the original face sub-image is smaller than that of the sub-image size while the other is not larger than it, which includes the following situations:
  • the width of the original face sub-image is smaller than the width of the sub-image size, and the height is also smaller than the height of the sub-image size;
  • the width of the original face sub-image is smaller than the width of the sub-image size, and the height is equal to the height of the sub-image size;
  • the height of the original face sub-image is less than the height of the sub-image size, and the width is equal to the width of the sub-image size.
  • Preset pixel values can be used for filling, usually pixel values quite different from face colors, such as (R:0, G:0, B:0) or (R:255, G:255, B:255).
  • When filling, the center of the original face sub-image can be aligned with the center of the sub-image size, and the difference around the original face sub-image is filled, so that the filled original face sub-image has the same size as the sub-image size.
  • Alternatively, the original face sub-image can be aligned with one edge of the sub-image size and the other sides filled; the present disclosure does not limit this.
  • If the original face sub-image has been processed by at least one of the above rotation and downsampling, then when the resulting image is smaller than the sub-image size, filling is performed according to the difference between its size and the sub-image size, in the same manner as above.
  • The above 1 to 3 are three commonly used transformation manners, and any one or more of them can be used according to actual needs. In an optional implementation, transformations 1, 2, and 3 are applied in sequence to each original face sub-image, and the processed original face sub-images are combined into the original face combined image.
  • The transformation changes the orientation, size, etc. of the original face sub-image in order to facilitate unified processing by the deep neural network.
  • When generating the original face combined image, the combination information can be saved, including but not limited to the size of each original face sub-image (i.e., its corresponding sub-image size), its position in the original face combined image, and the arrangement and order of the original face sub-images. The beautified face combined image can subsequently be split according to this combination information to obtain each individual beautified face sub-image.
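  • A minimal sketch of transformations 1 to 3 for one original face sub-image, assuming OpenCV and NumPy; the returned record is one possible encoding of the saved combination information, and all names here are illustrative:

```python
import cv2
import numpy as np

def fit_to_slot(face, slot_w, slot_h, pad_value=0):
    """Rotate, downsample, and fill one face sub-image into a slot.

    face: H x W x C image array. Returns the fitted slot image plus the
    record needed later to invert the transformation.
    """
    h, w = face.shape[:2]
    # 1. Rotate 90 degrees if the width/height size relationships differ.
    rotated = (w > h) != (slot_w > slot_h)
    if rotated:
        face = cv2.rotate(face, cv2.ROTATE_90_CLOCKWISE)
        h, w = face.shape[:2]
    # 2. Downsample (never enlarge) so the image fits within the slot.
    scale = min(slot_w / w, slot_h / h, 1.0)
    if scale < 1.0:
        face = cv2.resize(face, (int(w * scale), int(h * scale)),
                          interpolation=cv2.INTER_LINEAR)
        h, w = face.shape[:2]
    # 3. Center the image in the slot and fill the surrounding difference.
    top, left = (slot_h - h) // 2, (slot_w - w) // 2
    slot = np.full((slot_h, slot_w, face.shape[2]), pad_value, face.dtype)
    slot[top:top + h, left:left + w] = face
    info = {"rotated": rotated, "scale": scale,
            "offset": (top, left), "size": (h, w)}
    return slot, info
```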
  • In step S230, the above deep neural network is used to process the original face combined image and output a beautified face combined image.
  • a lightweight deep neural network can be used to reduce the amount of computation and realize the learning and processing of image depth features.
  • the deep neural network may adopt an end-to-end structure to realize pixel-level processing of the original face combined image.
  • Figure 5 shows an exemplary structure of a deep neural network.
  • the deep neural network may be a fully convolutional network, including: a first pixel rearrangement layer, at least one convolution layer, at least one transposed convolution layer, and a second pixel rearrangement layer.
  • In an optional implementation, step S230 can be implemented through steps S610 to S640 in FIG. 6:
  • Step S610: use the first pixel rearrangement layer to perform single-channel-to-multi-channel pixel rearrangement on the original face combined image to obtain a first feature image.
  • The original face combined image may be a single-channel image (e.g., a grayscale image) or a multi-channel image (e.g., an RGB image).
  • The first pixel rearrangement layer can rearrange each channel of the original face combined image into multiple channels.
  • In an optional implementation, step S610 includes: rearranging every n*n block of pixels in each channel of the original face combined image into n*n channels, so as to obtain a first feature image with a*n*n channels whose width and height are 1/n of those of the original face combined image; where a represents the number of channels of the original face combined image, a positive integer, and n represents the pixel rearrangement parameter, a positive integer not less than 2.
  • The first pixel rearrangement layer can be implemented by the space_to_depth function in TensorFlow (a machine learning framework), which converts spatial features of the original face combined image into depth features, or by a convolution operation with a stride of n.
  • the first pixel rearrangement layer can be regarded as a special convolutional layer.
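  • For instance, TensorFlow's space_to_depth turns an a-channel image into an image with a*n*n channels while dividing the width and height by n (the shapes below are illustrative):

```python
import tensorflow as tf

x = tf.random.normal([1, 448, 640, 3])     # an RGB original face combined image
y = tf.nn.space_to_depth(x, block_size=2)  # pixel rearrangement with n = 2
print(y.shape)                             # (1, 224, 320, 12), i.e. 3 * 2 * 2 channels
```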
  • Step S620: use the convolution layer(s) to perform convolution processing on the first feature image to obtain a second feature image.
  • the present disclosure does not limit the number of convolutional layers, the size of convolutional kernels, the specific structure of the convolutional layers, and the like.
  • Convolutional layers are used to extract image features from different scales and learn depth information.
  • the convolutional layer can include a matching pooling layer, which is used to downsample the convolved image to achieve information abstraction, increase the receptive field, and reduce parameter complexity.
  • For example, stepwise convolution and downsampling can be adopted, reducing the image by a factor of 2 at each step until the last convolutional layer outputs the second feature image.
  • Step S630: use the transposed convolution layer(s) to perform transposed convolution processing on the second feature image to obtain a third feature image.
  • the present disclosure does not limit the number of transposed convolution layers, the size of the transposed convolution kernels, the specific structure of the transposed convolution layers, and the like.
  • the transposed convolutional layer is used to upsample the second feature image, which can be viewed as the reverse process of convolution, thereby restoring the size of the image.
  • a step-by-step upsampling method can be adopted, for example, the image can be increased by a factor of 2 until the last transposed convolutional layer outputs the third feature image.
  • the convolutional layer and the transposed convolutional layer have completely symmetrical structures, and the third feature image has the same size and number of channels as the first feature image.
  • a direct connection can be established between the convolutional layer and the transposed convolutional layer.
  • Specifically, a direct connection is established between a convolutional layer and the transposed convolutional layer whose feature images have the same size, so that feature image information from the convolution link is passed directly to the corresponding feature image in the transposed convolution link, which helps obtain a third feature image with more comprehensive information.
  • Step S640: use the second pixel rearrangement layer to perform multi-channel-to-single-channel pixel rearrangement on the third feature image to obtain the beautified face combined image.
  • In an optional implementation, step S640 performs the inverse rearrangement of step S610: the second pixel rearrangement layer can be implemented by the depth_to_space function in TensorFlow, which converts the depth features in the third feature image into spatial features, or by a transposed convolution operation with a stride of n. In this sense, the second pixel rearrangement layer can be regarded as a special transposed convolutional layer.
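  • Putting steps S610 to S640 together, a minimal fully convolutional sketch in TensorFlow/Keras might look like the following; the layer counts, channel widths, and the single direct connection are illustrative choices under this embodiment's description, not the patent's exact architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_beauty_network(n=2, base_channels=32):
    # Input width and height must be divisible by 2 * n in this sketch.
    inp = layers.Input(shape=(None, None, 3))
    # First pixel rearrangement layer: spatial features to depth features.
    x = tf.nn.space_to_depth(inp, block_size=n)
    # Convolution link: stepwise downsampling by a factor of 2.
    c1 = layers.Conv2D(base_channels, 3, padding="same", activation="relu")(x)
    c2 = layers.Conv2D(base_channels * 2, 3, strides=2, padding="same",
                       activation="relu")(c1)
    # Transposed convolution link: symmetric stepwise upsampling.
    d1 = layers.Conv2DTranspose(base_channels, 3, strides=2, padding="same",
                                activation="relu")(c2)
    d1 = layers.Concatenate()([d1, c1])  # direct connection at equal size
    d0 = layers.Conv2D(3 * n * n, 3, padding="same")(d1)
    # Second pixel rearrangement layer: depth features back to spatial.
    out = tf.nn.depth_to_space(d0, block_size=n)
    return tf.keras.Model(inp, out)
```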
  • The processing by the deep neural network does not change the number of faces. For example, if the original face combined image is composed of 4 original face sub-images, the beautified face combined image also includes 4 faces, which are the beautified counterparts of the 4 original face sub-images.
  • When the deep neural network is used for blemish removal, its effect depends on the quality of the dataset and the training, rather than on manually designed image feature calculation; with a sufficient dataset it can cope with almost all situations in practical applications, including different lighting conditions and different skin conditions, alleviating the related-art problem of incomplete blemish removal.
  • In step S240, a target beautified image corresponding to the image to be processed is obtained according to the beautified face combined image and the image to be processed.
  • The beautified face combined image contains the beautified versions of the faces in the original face sub-images; replacing the original faces in the image to be processed with the beautified faces yields the target beautified image.
  • step S240 may include:
  • Step S710: split the beautified face sub-images corresponding to the original face sub-images from the beautified face combined image;
  • Step S720: replace the original face sub-images in the image to be processed with the corresponding beautified face sub-images to obtain the target beautified image.
  • Specifically, the saved combination information can be used to split sub-images with specific positions and sizes from the beautified face combined image, namely the beautified face sub-images, which correspond one-to-one to the original face sub-images.
  • If the original face sub-images were transformed during combination, the split beautified face sub-images can be correspondingly inversely transformed, including removing filled pixels, upsampling, reverse 90-degree rotation, and the like, so that each inversely transformed beautified face sub-image is consistent in direction and size with its original face sub-image; a one-to-one replacement can then be performed in the image to be processed to obtain the target beautified image.
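  • A sketch of the splitting and inverse transformation, reusing the record produced by the fit_to_slot sketch above (its field names are assumptions of that sketch):

```python
import cv2

def split_from_slot(slot_img, info, orig_w, orig_h):
    """Invert fit_to_slot: remove filled pixels, upsample, rotate back."""
    top, left = info["offset"]
    h, w = info["size"]
    face = slot_img[top:top + h, left:left + w]      # remove filled pixels
    if info["scale"] < 1.0:                          # restore the resolution
        target = (orig_h, orig_w) if info["rotated"] else (orig_w, orig_h)
        face = cv2.resize(face, target, interpolation=cv2.INTER_LINEAR)
    if info["rotated"]:                              # reverse the 90-degree rotation
        face = cv2.rotate(face, cv2.ROTATE_90_COUNTERCLOCKWISE)
    return face
```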
  • The beautified face sub-image is the face sub-image after beautification by the deep neural network, and usually has a high degree of beautification.
  • In an optional implementation, the original face sub-image may be used to perform beautification weakening processing on the beautified face sub-image.
  • Beautification weakening refers to reducing the beautification level of the beautified face sub-image. Two exemplary ways of beautification weakening are provided below:
  • Method one: fuse the original face sub-image into the beautified face sub-image according to a beautification level parameter.
  • The beautification level parameter may be a beautification strength parameter under a specific beautification function, such as the degree of blemish removal.
  • The beautification level parameter may be the currently set parameter, a system default parameter, or the parameter used in the last beautification treatment.
  • When performing beautification weakening, the original face sub-image and the beautified face sub-image can be fused with the beautification level parameter as the proportion. For example, assuming the degree of blemish removal ranges from 0 to 100 and the currently set value is a, refer to the following formula:
  • image_blend = (1 - a/100) * image_ori + (a/100) * image_deblemish (1)
  • where image_blend represents the fused image, image_ori represents the original face sub-image, and image_deblemish represents the beautified face sub-image.
  • It should be noted that if the original face sub-image was transformed when being combined into the original face combined image, inverse transformation can be performed on the split beautified face sub-image. In that case, the original face sub-image and the beautified face sub-image have the following relationship: the original face sub-image before transformation is consistent in direction, size, etc. with the beautified face sub-image after inverse transformation, and the transformed original face sub-image is consistent with the beautified face sub-image before inverse transformation. Therefore, when fusing with formula (1), either the pre-transformation original face sub-image can be fused with the post-inverse-transformation beautified face sub-image, or the transformed original face sub-image can be fused with the pre-inverse-transformation beautified face sub-image.
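  • In code form, formula (1) might be sketched as follows, assuming NumPy arrays and float arithmetic (the function name is illustrative):

```python
import numpy as np

def blend_beauty(image_ori, image_deblemish, a):
    """Beautification weakening per formula (1).

    a is the blemish-removal degree in [0, 100]: a = 100 keeps the full
    network output, a = 0 restores the original face sub-image.
    """
    ratio = a / 100.0
    return ((1.0 - ratio) * image_ori.astype(np.float32)
            + ratio * image_deblemish.astype(np.float32))
```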
  • Method two: fuse the high-frequency image of the original face sub-image into the beautified face sub-image.
  • the high-frequency image refers to an image containing high-frequency information such as detail texture in the original face sub-image.
  • high frequency images can be acquired by:
  • First, the original face sub-image is downsampled to obtain a down-sampled face sub-image, whose resolution is lower than that of the original face sub-image.
  • the down-sampled face sub-image is up-sampled, so that the obtained up-sampled face sub-image has the same resolution as the original face sub-image.
  • If the down-sampled face sub-image was rotated, reverse rotation can also be performed, so that the orientation of the obtained up-sampled face sub-image is the same as that of the original face sub-image.
  • Upsampling can use methods such as bilinear interpolation and nearest neighbor interpolation.
  • Although the resolution can be recovered by upsampling, the lost high-frequency information is difficult to recover; that is, the up-sampled face sub-image can be regarded as a low-frequency image of the original face sub-image.
  • Then, the difference between the original face sub-image and the up-sampled face sub-image is determined: the two images can be subtracted, and the result is the high-frequency information of the original face sub-image, i.e., the high-frequency image of the original face sub-image.
  • the high-frequency image can also be obtained by filtering the original face sub-image to extract high-frequency information.
  • When performing beautification weakening, the high-frequency image can be superimposed onto the beautified face sub-image by direct addition, so that high-frequency information such as details and textures is added to the beautified face sub-image, making it more realistic.
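  • A sketch of this high-frequency extraction and superimposition, assuming 8-bit images and bilinear interpolation via OpenCV; the down-sampled size would in practice be the sub-image size used during combination:

```python
import cv2
import numpy as np

def add_high_frequency(original, beautified, down_w, down_h):
    """Superimpose the original face's detail texture onto the beautified face."""
    h, w = original.shape[:2]
    low = cv2.resize(original, (down_w, down_h), interpolation=cv2.INTER_LINEAR)
    low = cv2.resize(low, (w, h), interpolation=cv2.INTER_LINEAR)  # low-frequency image
    high = original.astype(np.int16) - low.astype(np.int16)       # high-frequency image
    out = beautified.astype(np.int16) + high                      # direct addition
    return np.clip(out, 0, 255).astype(np.uint8), high
```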
  • In the high-frequency image, pixel values are generally small; for example, the value of each RGB channel typically does not exceed 4.
  • However, mutation positions in the original face sub-image, such as a small black mole on the face, carry strong high-frequency information, so the pixel values at the corresponding positions in the high-frequency image may be relatively large.
  • When superimposed, the pixel values at these positions may have adverse effects, such as producing sharp edges like 'mole marks', resulting in an unnatural visual appearance.
  • the image beauty processing method may further include the following steps S810 and S820:
  • In step S810, defect points are determined in the high-frequency image.
  • A defect point is a pixel with strong high-frequency information; points with larger pixel values in the high-frequency image can be determined as defect points.
  • In an optional implementation, defect points can be determined as follows: calculate the difference value of each pixel between the beautified face sub-image and the original face sub-image; when the difference value of a pixel satisfies a preset defect condition, determine the pixel at the corresponding position in the high-frequency image as a defect point.
  • the preset defect condition is used to measure the difference between the beauty face sub-image and the original face sub-image, so as to determine whether each pixel is a removed defect.
  • In the process of blemish removal, small black moles, pox marks, etc. on the face are usually removed and filled with the face's skin color; at such positions the beautified face sub-image differs greatly from the original face sub-image, so removed defects can be identified by setting the preset defect condition.
  • the preset defect condition may include: the difference values of each color channel are all greater than a first color difference threshold, and at least one of the difference values of each color channel is greater than a second color difference threshold.
  • the first color difference threshold and the second color difference threshold may be empirical thresholds. For example, when the color channels include RGB, the first color difference threshold may be 20, and the second color difference threshold may be 40.
  • That is, for each pixel it is judged whether the difference values of all three RGB color channels are greater than 20 and whether the difference value of at least one color channel is greater than 40; when both conditions are met, the preset defect condition is satisfied, and the pixel at the corresponding position in the high-frequency image is determined as a defect point.
  • Step S820: adjust the pixel values in a preset area around the above defect points in the high-frequency image to be within a preset value range.
  • After the defect points are determined, the preset area around each defect point can be further determined in the high-frequency image, for example a 5*5 pixel area centered on the defect point; the specific size can be determined according to the size of the high-frequency image.
  • The pixel values in the preset area are then adjusted to be within a preset value range, generally a small range determined according to experience and actual needs; the adjustment usually reduces the pixel values.
  • For example, the preset value range may be -2 to 2; pixel values around a defect point may exceed -5 to 5, and adjusting them into -2 to 2 is in effect a limiting (clipping) operation. In this way, sharp edges such as 'mole marks' can be weakened, and the image looks more visually natural.
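  • Steps S810 and S820 can be sketched together as follows, using the example thresholds (20 and 40), a 5*5 preset area, and the -2 to 2 limit range; the clipping-based adjustment and all names are illustrative:

```python
import numpy as np

def suppress_mole_marks(high, original, beautified,
                        t1=20, t2=40, radius=2, limit=2):
    """Clamp high-frequency pixels around removed blemishes to a small range."""
    diff = np.abs(beautified.astype(np.int16) - original.astype(np.int16))
    # Defect points: every color channel differs by more than t1 and at
    # least one channel differs by more than t2.
    defects = (diff > t1).all(axis=-1) & (diff > t2).any(axis=-1)
    out = high.copy()
    for y, x in zip(*np.nonzero(defects)):
        y0, x0 = max(y - radius, 0), max(x - radius, 0)
        out[y0:y + radius + 1, x0:x + radius + 1] = np.clip(
            out[y0:y + radius + 1, x0:x + radius + 1], -limit, limit)
    return out
```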
  • This exemplary embodiment can adopt both beautification weakening methods at the same time: for example, the original face sub-image and the beautified face sub-image are first fused by method one, and the high-frequency image is then superimposed by method two, yielding a beautification-weakened face sub-image that has both a good beautification effect and a sense of realism.
  • In an optional implementation, gradient processing is performed on the boundary area between the unreplaced area in the image to be processed and the beautified face sub-image, so that the boundary area forms a smooth transition.
  • the unreplaced area in the image to be processed is the area other than the original face sub-image in the image to be processed.
  • The boundary area between the unreplaced area and the beautified face sub-image actually includes two parts: the boundary area in the unreplaced area adjacent to the beautified face sub-image, and the boundary area in the beautified face sub-image adjacent to the unreplaced area.
  • Gradient processing can be performed on either part, or on both parts at the same time.
  • For example, a certain proportion (e.g., 10%) of the beautified face sub-image, extending inward from its edge, may be determined as the boundary area.
  • The boundary area usually needs to avoid the face itself, so that gradient processing does not change the color of the face. For example, if the enlargement of the face frame described above leaves a certain distance between the face and the boundary in the original face sub-image, the face in the beautified face sub-image also keeps that distance from the boundary, and the face can be better avoided during processing.
  • After the boundary area is determined, the color at its inner edge is obtained and recorded as the first color, and the color at the inner edge of the unreplaced area is obtained and recorded as the second color; the boundary area is then rendered as a gradient from the first color to the second color. The boundary between the unreplaced area and the beautified face sub-image thus becomes a gradient-color area (the hatched area in FIG. 9), forming a smooth transition and preventing abrupt color changes that would cause visual disharmony.
  • In this way, each beautified face sub-image replaces the corresponding original face sub-image in the image to be processed, and gradient processing of the boundary areas yields a target beautified image with a natural and harmonious visual appearance.
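  • As one way to realize this, the following sketch alpha-blends the replaced face into the surrounding unreplaced area over a 10% border, which yields a per-pixel transition between the two colors at the boundary; linear alpha blending is an assumption of this sketch, not the patent's prescribed gradient:

```python
import numpy as np

def paste_with_gradient_border(background, face, left, top, border_ratio=0.1):
    """Paste face into background with a linear gradient at the border."""
    h, w = face.shape[:2]
    bh, bw = int(h * border_ratio), int(w * border_ratio)
    # Per-pixel weight: 1 deep inside the face, falling to 0 at its edge.
    yy = np.minimum(np.arange(h), np.arange(h)[::-1]) / max(bh, 1)
    xx = np.minimum(np.arange(w), np.arange(w)[::-1]) / max(bw, 1)
    alpha = np.clip(np.minimum.outer(yy, xx), 0.0, 1.0)[..., None]
    region = background[top:top + h, left:left + w].astype(np.float32)
    blended = alpha * face.astype(np.float32) + (1.0 - alpha) * region
    background[top:top + h, left:left + w] = blended.astype(background.dtype)
    return background
```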
  • FIG. 10 shows a schematic flow of an image beautification processing method, including:
  • Step S1001: generate multiple face frames in the image to be processed according to the identified face key points, and retain the face frames whose area is not less than the face area threshold. Assume the image to be processed includes two main faces, corresponding to the generated face frame 1 and face frame 2.
  • Step S1002: enlarge face frame 1 and face frame 2 respectively, for example to 1.1 times, and then intercept the images in the face frames to obtain original face sub-image 1 and original face sub-image 2.
  • Step S1003: divide the input image size of the deep neural network into two equal parts to obtain the sub-image sizes. Assume original face sub-image 1 and original face sub-image 2 are both larger than the sub-image size; the two sub-images are then downsampled, and further processing such as rotation and filling may be performed, to obtain down-sampled face sub-image 1 and down-sampled face sub-image 2.
  • Step S1004: up-sample the down-sampled face sub-images 1 and 2 to match the resolutions of original face sub-images 1 and 2. If rotation, padding, or similar processing was applied when obtaining the down-sampled face sub-images, reverse rotation, padding removal, and so on can also be performed, yielding up-sampled face sub-image 1 and up-sampled face sub-image 2.
  • Step S1005: subtract up-sampled face sub-image 1 from original face sub-image 1 to obtain high-frequency image 1, and subtract up-sampled face sub-image 2 from original face sub-image 2 to obtain high-frequency image 2; the up-sampled sub-images have the same resolution as the originals, so the subtraction is well defined.
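  • The residual computation of steps S1003–S1005 can be condensed as below (Python/OpenCV; bilinear interpolation and float32 input are assumptions of this sketch):

```python
import cv2

def high_frequency(original, down_w, down_h):
    """Down-sample then up-sample `original` and return the residual,
    which carries the detail (high-frequency) information lost by
    down-sampling. Assumes an HxWx3 float32 array."""
    h, w = original.shape[:2]
    down = cv2.resize(original, (down_w, down_h),
                      interpolation=cv2.INTER_LINEAR)  # down-sampled face sub-image
    up = cv2.resize(down, (w, h),
                    interpolation=cv2.INTER_LINEAR)    # up-sampled face sub-image
    return original - up                               # high-frequency image
```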
  • Step S1006: combine down-sampled face sub-image 1 and down-sampled face sub-image 2 into one original face combined image.
  • Step S1007: input the original face combined image into the deep neural network, which after processing outputs the beautified face combined image.
  • Step S1008: split the beautified face combined image into beautified face sub-image 1 and beautified face sub-image 2, where beautified face sub-image 1 corresponds to original face sub-image 1 and beautified face sub-image 2 corresponds to original face sub-image 2.
  • Step S1009: fuse beautified face sub-image 1 with original face sub-image 1 according to the beautification degree parameter, then add high-frequency image 1 to obtain face sub-image 1 to be replaced; fuse beautified face sub-image 2 with original face sub-image 2 according to the beautification degree parameter, then add high-frequency image 2 to obtain face sub-image 2 to be replaced.
  • Step S1010: merge face sub-image 1 to be replaced and face sub-image 2 to be replaced into the image to be processed. Specifically, face sub-image 1 to be replaced takes the place of original face sub-image 1 in the image to be processed, and face sub-image 2 to be replaced takes the place of original face sub-image 2. Both main faces in the image to be processed are thus replaced with beautified faces, and the target beautified image is output. Personalized beautification processing may follow.
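  • Tying the steps of Figure 10 together, one possible orchestration is sketched below (Python). It reuses the sketches given above (`fit_to_slot`, `high_frequency`, `weaken_beautification`, `feather_replace`); `beauty_net` and the face boxes are stand-ins for the trained network and a face detector, neither of which is specified here, and the rotation bookkeeping of `fit_to_slot` is omitted for brevity (assume no rotation occurred):

```python
import cv2
import numpy as np

def beautify_two_faces(image, beauty_net, boxes, degree=80,
                       input_w=640, input_h=448):
    """Sketch of the Figure 10 flow for two detected faces.

    `boxes` is [(top, left, h, w), ...] from a face detector (assumed
    given); `beauty_net` is any callable mapping an input_h x input_w x 3
    float32 array to the same shape (stand-in for the trained network)."""
    slot_w, slot_h = input_w // 2, input_h             # S1003: bisect input size
    fitted, infos, originals, highs = [], [], [], []
    for (top, left, h, w) in boxes[:2]:                # S1001-S1002: two main faces
        face = image[top:top + h, left:left + w].astype(np.float32)
        fit, info = fit_to_slot(face, slot_w, slot_h)  # S1003: down-sample and pad
        pb, pr = info["padding"]
        highs.append(high_frequency(face, slot_w - pr, slot_h - pb))  # S1004-S1005
        fitted.append(fit); infos.append(info); originals.append(face)
    combined = np.concatenate(fitted, axis=1)          # S1006: combine side by side
    beautified = beauty_net(combined)                  # S1007: network inference
    for i, (top, left, h, w) in enumerate(boxes[:2]):  # S1008: split the output
        part = beautified[:, i * slot_w:(i + 1) * slot_w]
        pb, pr = infos[i]["padding"]                   # undo padding, restore size
        part = part[:slot_h - pb, :slot_w - pr]
        part = cv2.resize(part, (w, h), interpolation=cv2.INTER_LINEAR)
        # S1009: fuse by the degree parameter, then add high-frequency detail.
        part = weaken_beautification(originals[i], part, highs[i], degree)
        # S1010: merge back with a feathered boundary.
        image = feather_replace(image.astype(np.float32), part, top, left)
    return np.clip(image, 0, 255).astype(np.uint8)
```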
  • Exemplary embodiments of the present disclosure also provide an image beauty processing apparatus. Referring to FIG. 11, the image beauty processing apparatus 1100 may include a processor 1110 and a memory 1120. The memory 1120 stores the following program modules:
  • a face extraction module 1121, configured to extract one or more original face sub-images from the image to be processed;
  • an image combining module 1122, configured to combine the above one or more original face sub-images based on the input image size of the deep neural network to generate an original face combined image;
  • a beauty processing module 1123, configured to process the original face combined image by using the above deep neural network and output a beautified face combined image;
  • an image fusion module 1124, configured to obtain a target beautified image corresponding to the image to be processed according to the beautified face combined image and the image to be processed.
  • the processor 1110 is used to execute the above-mentioned program modules.
  • In one embodiment, the face extraction module 1121 is configured to:
  • generate one or more face frames in the image to be processed according to the face key points identified in the image to be processed;
  • retain the face frames whose area is greater than or equal to the face area threshold, and crop the images within the face frames to obtain one or more original face sub-images.
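  • The area filter itself can be very small; a sketch (Python; the 0.05 default mirrors the example threshold of image size × 0.05 used in the disclosure, and the tuple layout is an assumption):

```python
def filter_face_boxes(boxes, image_w, image_h, factor=0.05):
    """Keep only the face frames whose area reaches `factor` of the
    image area; `boxes` holds (top, left, height, width) tuples."""
    threshold = factor * image_w * image_h
    return [b for b in boxes if b[2] * b[3] >= threshold]
```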
  • In one embodiment, the image combining module 1122 is configured to:
  • divide the input image size into one or more sub-image sizes corresponding to the one or more original face sub-images according to the number of original face sub-images;
  • transform each corresponding original face sub-image based on its sub-image size;
  • combine the transformed original face sub-images to generate the original face combined image.
  • In one embodiment, the image combining module 1122 is configured to perform any one or more of the following:
  • when the width-height relationship of the original face sub-image differs from the width-height relationship of the sub-image size, rotate the original face sub-image by 90 degrees;
  • when the size of the original face sub-image, or of the rotated original face sub-image, is larger than the sub-image size, down-sample it according to the sub-image size;
  • when the size of the original face sub-image, or of the original face sub-image processed by at least one of rotation and down-sampling, is smaller than the sub-image size, fill it according to the difference between its size and the sub-image size.
  • the deep neural network is a fully convolutional network, including: a first pixel rearrangement layer, at least one convolutional layer, at least one transposed convolutional layer, and a second pixel rearrangement layer.
  • The beauty processing module 1123 includes:
  • a first rearrangement sub-module, configured to use the first pixel rearrangement layer to perform single-channel-to-multi-channel pixel rearrangement on the original face combined image, obtaining a first feature image;
  • a convolution sub-module, configured to perform convolution processing on the first feature image by using the convolutional layer, obtaining a second feature image;
  • a transposed convolution sub-module, configured to perform transposed convolution processing on the second feature image by using the transposed convolutional layer, obtaining a third feature image;
  • a second rearrangement sub-module, configured to use the second pixel rearrangement layer to perform multi-channel-to-single-channel pixel rearrangement on the third feature image, obtaining the beautified face combined image.
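  • A network skeleton matching this layer list might look as follows (Python/TensorFlow Keras; the channel widths, kernel sizes, and number of stages are illustrative assumptions, not values from the disclosure):

```python
import tensorflow as tf

def build_beauty_net(n=2, base=32):
    """Minimal fully convolutional sketch of the described layer list:
    pixel rearrangement in, convolutions down, transposed convolutions
    up, pixel rearrangement out; works on any input size."""
    inp = tf.keras.Input(shape=(None, None, 3))
    # First pixel rearrangement layer (single- to multi-channel).
    x = tf.keras.layers.Lambda(lambda t: tf.nn.space_to_depth(t, n))(inp)
    # Convolutional layers (progressively down-sampling by 2).
    x = tf.keras.layers.Conv2D(base, 3, strides=2, padding="same",
                               activation="relu")(x)
    x = tf.keras.layers.Conv2D(base * 2, 3, strides=2, padding="same",
                               activation="relu")(x)
    # Transposed convolutional layers (restoring the spatial size).
    x = tf.keras.layers.Conv2DTranspose(base, 3, strides=2, padding="same",
                                        activation="relu")(x)
    x = tf.keras.layers.Conv2DTranspose(3 * n * n, 3, strides=2,
                                        padding="same")(x)
    # Second pixel rearrangement layer (multi- to single-channel).
    out = tf.keras.layers.Lambda(lambda t: tf.nn.depth_to_space(t, n))(x)
    return tf.keras.Model(inp, out)
```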
  • In one embodiment, the first rearrangement sub-module is configured to:
  • input the original face combined image with a channels into the first pixel rearrangement layer;
  • rearrange the pixels of each n*n neighborhood in every channel of the original face combined image to the same position in n*n channels respectively, and output a first feature image with a*n*n channels;
  • where a is a positive integer and n is a positive integer not less than 2.
  • In one embodiment, the second rearrangement sub-module is configured to:
  • input the third feature image with b*n*n channels into the second pixel rearrangement layer;
  • rearrange the pixels at the same position in every n*n channels of the third feature image into an n*n neighborhood within a single channel, and output a beautified face combined image with b channels;
  • where b is a positive integer and n is a positive integer not less than 2.
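  • In array terms the two rearrangements are mutually inverse reshape/transpose operations; a per-image sketch (Python/NumPy, HWC layout assumed, with H and W divisible by n, matching the behavior of TensorFlow's space_to_depth/depth_to_space):

```python
import numpy as np

def space_to_depth(x, n):
    """(H, W, C) -> (H/n, W/n, C*n*n): move each n*n neighborhood of
    every channel into depth, shrinking width and height by n."""
    h, w, c = x.shape
    x = x.reshape(h // n, n, w // n, n, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(h // n, w // n, c * n * n)

def depth_to_space(x, n):
    """(H, W, C*n*n) -> (H*n, W*n, C): the inverse rearrangement."""
    h, w, cnn = x.shape
    c = cnn // (n * n)
    x = x.reshape(h, w, n, n, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(h * n, w * n, c)
```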
  • In one embodiment, the image fusion module 1124 is configured to:
  • split the beautified face sub-images corresponding to the original face sub-images out of the beautified face combined image;
  • replace the original face sub-images in the image to be processed with the corresponding beautified face sub-images to obtain the target beautified image.
  • In one embodiment, the image fusion module 1124 is configured to:
  • before replacing the original face sub-image in the image to be processed with the corresponding beautified face sub-image, use the original face sub-image to perform beautification weakening processing on the beautified face sub-image.
  • In one embodiment, the image fusion module 1124 is configured to:
  • fuse the original face sub-image into the beautified face sub-image according to the set beautification degree parameter.
  • In one embodiment, the image fusion module 1124 is configured to:
  • fuse the high-frequency image of the original face sub-image into the beautified face sub-image.
  • In one embodiment, the image combining module 1122 is configured to:
  • when combining the one or more original face sub-images based on the input image size of the deep neural network, if an original face sub-image is down-sampled, up-sample the resulting down-sampled face sub-image to obtain an up-sampled face sub-image with the same resolution as the original face sub-image;
  • obtain the high-frequency image of the original face sub-image according to the difference between the original face sub-image and the up-sampled face sub-image.
  • In one embodiment, the image fusion module 1124 is configured to:
  • determine defect points in the high-frequency image;
  • adjust the pixel values within a preset area around the defect points in the high-frequency image to within a preset value range.
  • In one embodiment, the image fusion module 1124 is configured to:
  • subtract the beautified face sub-image and the corresponding original face sub-image to obtain the difference value of each pixel;
  • when the difference value of a pixel satisfies the preset defect condition, determine the corresponding pixel in the high-frequency image as a defect point.
  • the preset defect conditions include:
  • the difference values of each color channel are all greater than the first color difference threshold, and at least one of the difference values of each color channel is greater than the second color difference threshold.
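  • A sketch of this defect handling (Python/NumPy; the thresholds 20 and 40, the 5*5 neighborhood, and the [-2, 2] clamping range are the example values from the disclosure, not mandatory settings):

```python
import numpy as np

def suppress_defect_edges(high_freq, original, beautified,
                          t1=20, t2=40, radius=2, limit=2.0):
    """Find pixels where beautification changed the face strongly
    (removed blemishes) and clamp the high-frequency values in a
    (2*radius+1)^2 area around them, to weaken sharp 'mole marks'."""
    diff = np.abs(original.astype(np.int32) - beautified.astype(np.int32))
    # Preset defect condition: all RGB differences > t1, at least one > t2.
    mask = np.all(diff > t1, axis=-1) & np.any(diff > t2, axis=-1)
    out = high_freq.copy()
    for y, x in zip(*np.nonzero(mask)):
        y0, y1 = max(0, y - radius), y + radius + 1
        x0, x1 = max(0, x - radius), x + radius + 1
        out[y0:y1, x0:x1] = np.clip(out[y0:y1, x0:x1], -limit, limit)
    return out
```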
  • In one embodiment, the image fusion module 1124 is configured to:
  • after replacing the original face sub-image in the image to be processed with the corresponding beautified face sub-image, perform gradient processing on the boundary area between the unreplaced area of the image to be processed and the beautified face sub-image, so that the boundary forms a smooth transition.
  • In one embodiment, the target beautified image includes a blemish-removed beautified image.
  • The memory 1120 may also store the following program module:
  • a personalized beauty processing module, configured to perform personalized beautification processing on the blemish-removed beautified image after it is obtained, to produce the final beautified image.
  • Exemplary embodiments of the present disclosure also provide another image beauty processing apparatus. Referring to FIG. 12, the image beauty processing apparatus 1200 may include:
  • a face extraction module 1210, configured to extract one or more original face sub-images from the image to be processed;
  • an image combining module 1220, configured to combine the above one or more original face sub-images based on the input image size of the deep neural network to generate an original face combined image;
  • a beauty processing module 1230, configured to process the original face combined image by using the above deep neural network and output a beautified face combined image;
  • an image fusion module 1240, configured to obtain a target beautified image corresponding to the image to be processed according to the beautified face combined image and the image to be processed.
  • In one embodiment, the face extraction module 1210 is configured to:
  • generate one or more face frames in the image to be processed according to the face key points identified in the image to be processed;
  • retain the face frames whose area is greater than or equal to the face area threshold, and crop the images within the face frames to obtain one or more original face sub-images.
  • In one embodiment, the image combining module 1220 is configured to:
  • divide the input image size into one or more sub-image sizes corresponding to the one or more original face sub-images according to the number of original face sub-images;
  • transform each corresponding original face sub-image based on its sub-image size;
  • combine the transformed original face sub-images to generate the original face combined image.
  • In one embodiment, the image combining module 1220 is configured to perform any one or more of the following:
  • when the width-height relationship of the original face sub-image differs from the width-height relationship of the sub-image size, rotate the original face sub-image by 90 degrees;
  • when the size of the original face sub-image, or of the rotated original face sub-image, is larger than the sub-image size, down-sample it according to the sub-image size;
  • when the size of the original face sub-image, or of the original face sub-image processed by at least one of rotation and down-sampling, is smaller than the sub-image size, fill it according to the difference between its size and the sub-image size.
  • the deep neural network is a fully convolutional network, including: a first pixel rearrangement layer, at least one convolutional layer, at least one transposed convolutional layer, and a second pixel rearrangement layer.
  • The beauty processing module 1230 includes:
  • a first rearrangement sub-module, configured to use the first pixel rearrangement layer to perform single-channel-to-multi-channel pixel rearrangement on the original face combined image, obtaining a first feature image;
  • a convolution sub-module, configured to perform convolution processing on the first feature image by using the convolutional layer, obtaining a second feature image;
  • a transposed convolution sub-module, configured to perform transposed convolution processing on the second feature image by using the transposed convolutional layer, obtaining a third feature image;
  • a second rearrangement sub-module, configured to use the second pixel rearrangement layer to perform multi-channel-to-single-channel pixel rearrangement on the third feature image, obtaining the beautified face combined image.
  • In one embodiment, the first rearrangement sub-module is configured to:
  • input the original face combined image with a channels into the first pixel rearrangement layer;
  • rearrange the pixels of each n*n neighborhood in every channel of the original face combined image to the same position in n*n channels respectively, and output a first feature image with a*n*n channels;
  • where a is a positive integer and n is a positive integer not less than 2.
  • In one embodiment, the second rearrangement sub-module is configured to:
  • input the third feature image with b*n*n channels into the second pixel rearrangement layer;
  • rearrange the pixels at the same position in every n*n channels of the third feature image into an n*n neighborhood within a single channel, and output a beautified face combined image with b channels;
  • where b is a positive integer and n is a positive integer not less than 2.
  • In one embodiment, the image fusion module 1240 is configured to:
  • split the beautified face sub-images corresponding to the original face sub-images out of the beautified face combined image;
  • replace the original face sub-images in the image to be processed with the corresponding beautified face sub-images to obtain the target beautified image.
  • In one embodiment, the image fusion module 1240 is configured to:
  • before replacing the original face sub-image in the image to be processed with the corresponding beautified face sub-image, use the original face sub-image to perform beautification weakening processing on the beautified face sub-image.
  • In one embodiment, the image fusion module 1240 is configured to:
  • fuse the original face sub-image into the beautified face sub-image according to the set beautification degree parameter.
  • In one embodiment, the image fusion module 1240 is configured to:
  • fuse the high-frequency image of the original face sub-image into the beautified face sub-image.
  • In one embodiment, the image combining module 1220 is configured to:
  • when combining the one or more original face sub-images based on the input image size of the deep neural network, if an original face sub-image is down-sampled, up-sample the resulting down-sampled face sub-image to obtain an up-sampled face sub-image with the same resolution as the original face sub-image;
  • obtain the high-frequency image of the original face sub-image according to the difference between the original face sub-image and the up-sampled face sub-image.
  • In one embodiment, the image fusion module 1240 is configured to:
  • determine defect points in the high-frequency image;
  • adjust the pixel values within a preset area around the defect points in the high-frequency image to within a preset value range.
  • In one embodiment, the image fusion module 1240 is configured to:
  • subtract the beautified face sub-image and the corresponding original face sub-image to obtain the difference value of each pixel;
  • when the difference value of a pixel satisfies the preset defect condition, determine the corresponding pixel in the high-frequency image as a defect point.
  • the preset defect conditions include:
  • the difference values of each color channel are all greater than the first color difference threshold, and at least one of the difference values of each color channel is greater than the second color difference threshold.
  • In one embodiment, the image fusion module 1240 is configured to:
  • after replacing the original face sub-image in the image to be processed with the corresponding beautified face sub-image, perform gradient processing on the boundary area between the unreplaced area of the image to be processed and the beautified face sub-image, so that the boundary forms a smooth transition.
  • In one embodiment, the target beautified image includes a blemish-removed beautified image.
  • The image beauty processing apparatus 1200 may further include:
  • a personalized beauty processing module, configured to perform personalized beautification processing on the blemish-removed beautified image after it is obtained, to produce the final beautified image.
  • Specific details of each part of the above apparatuses have been described in the method embodiments; details not disclosed here can be found in the method embodiments and are not repeated.
  • Exemplary embodiments of the present disclosure also provide a computer-readable storage medium, which may be implemented in the form of a program product including program code; when the program product runs on an electronic device, the program code causes the electronic device to perform the steps according to the various exemplary embodiments of the present disclosure described in the "Exemplary Method" section above.
  • In an optional embodiment, the program product may be implemented as a portable compact disc read-only memory (CD-ROM) including the program code, and may run on an electronic device such as a personal computer.
  • However, the program product of the present disclosure is not limited thereto. In this document, a readable storage medium may be any tangible medium that contains or stores a program which can be used by, or in combination with, an instruction execution system, apparatus, or device.
  • the program product may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer readable signal medium may include a propagated data signal in baseband or as part of a carrier wave with readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a readable signal medium can also be any readable medium, other than a readable storage medium, that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet by using an Internet service provider).
  • It should be noted that although several modules or units of the apparatus for performing actions are mentioned in the above detailed description, this division is not mandatory. Indeed, according to exemplary embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided into multiple modules or units.
  • Those skilled in the art will understand that various aspects of the present disclosure may be implemented as a system, a method, or a program product. Therefore, aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, and the like), or an embodiment combining hardware and software, which may be collectively referred to herein as a "circuit", "module", or "system". Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the claims.
  • It should be understood that the present disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

An image beauty processing method, an image beauty processing apparatus, a computer-readable storage medium, and an electronic device. The method comprises: extracting one or more original face sub-images from an image to be processed (S210); combining the one or more original face sub-images based on an input image size of a deep neural network to generate an original face combined image (S220); processing the original face combined image by using the deep neural network to output a beautified face combined image (S230); and obtaining, according to the beautified face combined image and the image to be processed, a target beautified image corresponding to the image to be processed (S240). The method adapts to diverse lighting conditions and skin conditions, improves the image beautification effect, and increases processing efficiency.


Claims (20)

  1. An image beauty processing method, characterized by comprising:
    extracting one or more original face sub-images from an image to be processed;
    combining the one or more original face sub-images based on an input image size of a deep neural network to generate an original face combined image;
    processing the original face combined image by using the deep neural network, and outputting a beautified face combined image;
    obtaining a target beautified image corresponding to the image to be processed according to the beautified face combined image and the image to be processed.
  2. The method according to claim 1, characterized in that extracting one or more original face sub-images from the image to be processed comprises:
    generating one or more face frames in the image to be processed according to face key points identified in the image to be processed;
    retaining the face frames whose area is greater than or equal to a face area threshold, and cropping the images within the face frames to obtain the one or more original face sub-images.
  3. The method according to claim 1, characterized in that combining the one or more original face sub-images based on the input image size of the deep neural network to generate the original face combined image comprises:
    dividing the input image size into one or more sub-image sizes corresponding to the one or more original face sub-images according to the number of original face sub-images;
    transforming each corresponding original face sub-image based on each sub-image size;
    combining the transformed original face sub-images to generate the original face combined image.
  4. The method according to claim 3, characterized in that transforming each corresponding original face sub-image based on each sub-image size comprises any one or more of the following:
    when the width-height relationship of the original face sub-image differs from the width-height relationship of the sub-image size, rotating the original face sub-image by 90 degrees;
    when the size of the original face sub-image, or of the rotated original face sub-image, is larger than the sub-image size, down-sampling the original face sub-image or the rotated original face sub-image according to the sub-image size;
    when the size of the original face sub-image, or of the original face sub-image processed by at least one of rotation and down-sampling, is smaller than the sub-image size, filling the original face sub-image according to the difference between its size and the sub-image size, or filling the original face sub-image processed by at least one of rotation and down-sampling according to the difference between its size and the sub-image size.
  5. The method according to claim 1, characterized in that the deep neural network is a fully convolutional network comprising a first pixel rearrangement layer, at least one convolutional layer, at least one transposed convolutional layer, and a second pixel rearrangement layer;
    processing the original face combined image by using the deep neural network and outputting the beautified face combined image comprises:
    performing single-channel-to-multi-channel pixel rearrangement on the original face combined image by using the first pixel rearrangement layer to obtain a first feature image;
    performing convolution processing on the first feature image by using the convolutional layer to obtain a second feature image;
    performing transposed convolution processing on the second feature image by using the transposed convolutional layer to obtain a third feature image;
    performing multi-channel-to-single-channel pixel rearrangement on the third feature image by using the second pixel rearrangement layer to obtain the beautified face combined image.
  6. The method according to claim 5, characterized in that performing single-channel-to-multi-channel pixel rearrangement on the original face combined image by using the first pixel rearrangement layer to obtain the first feature image comprises:
    inputting the original face combined image with a channels into the first pixel rearrangement layer;
    rearranging the pixels of each n*n neighborhood in every channel of the original face combined image to the same position in n*n channels respectively, and outputting the first feature image with a*n*n channels;
    where a is a positive integer and n is a positive integer not less than 2.
  7. The method according to claim 5, characterized in that performing multi-channel-to-single-channel pixel rearrangement on the third feature image by using the second pixel rearrangement layer to obtain the beautified face combined image comprises:
    inputting the third feature image with b*n*n channels into the second pixel rearrangement layer;
    rearranging the pixels at the same position in every n*n channels of the third feature image into an n*n neighborhood within a single channel, and outputting the beautified face combined image with b channels;
    where b is a positive integer and n is a positive integer not less than 2.
  8. The method according to claim 1, characterized in that obtaining the target beautified image corresponding to the image to be processed according to the beautified face combined image and the image to be processed comprises:
    splitting the beautified face sub-images corresponding to the original face sub-images out of the beautified face combined image;
    replacing the original face sub-images in the image to be processed with the corresponding beautified face sub-images to obtain the target beautified image.
  9. The method according to claim 8, characterized in that, before replacing the original face sub-image in the image to be processed with the corresponding beautified face sub-image, the method further comprises:
    performing beautification weakening processing on the beautified face sub-image by using the original face sub-image.
  10. The method according to claim 9, characterized in that performing beautification weakening processing on the beautified face sub-image by using the original face sub-image comprises:
    fusing the original face sub-image into the beautified face sub-image according to a set beautification degree parameter.
  11. The method according to claim 9, characterized in that performing beautification weakening processing on the beautified face sub-image by using the original face sub-image comprises:
    fusing a high-frequency image of the original face sub-image into the beautified face sub-image.
  12. The method according to claim 11, characterized in that the method further comprises:
    when combining the one or more original face sub-images based on the input image size of the deep neural network, if the original face sub-image is down-sampled, up-sampling the resulting down-sampled face sub-image to obtain an up-sampled face sub-image, the up-sampled face sub-image having the same resolution as the original face sub-image;
    obtaining the high-frequency image of the original face sub-image according to the difference between the original face sub-image and the up-sampled face sub-image.
  13. The method according to claim 11, characterized in that, before fusing the high-frequency image of the original face sub-image into the beautified face sub-image, the method further comprises:
    determining defect points in the high-frequency image;
    adjusting the pixel values within a preset area around the defect points in the high-frequency image to within a preset value range.
  14. The method according to claim 13, characterized in that determining defect points in the high-frequency image comprises:
    subtracting the beautified face sub-image and the corresponding original face sub-image to obtain a difference value of each pixel;
    when the difference value of a pixel is judged to satisfy a preset defect condition, determining the pixel corresponding to that pixel in the high-frequency image as a defect point.
  15. The method according to claim 14, characterized in that the preset defect condition comprises:
    the difference values of all color channels are greater than a first color difference threshold, and at least one of the difference values of the color channels is greater than a second color difference threshold.
  16. The method according to claim 8, characterized in that, when replacing the original face sub-image in the image to be processed with the corresponding beautified face sub-image, the method further comprises:
    performing gradient processing on the boundary area between the unreplaced area of the image to be processed and the beautified face sub-image, so that the boundary area forms a smooth transition.
  17. The method according to claim 1, characterized in that the target beautified image comprises a blemish-removed beautified image, and after obtaining the blemish-removed beautified image, the method further comprises:
    performing personalized beautification processing on the blemish-removed beautified image to obtain a final beautified image.
  18. An image beauty processing apparatus, characterized by comprising a processor and a memory, the processor being configured to execute the following program modules stored in the memory:
    a face extraction module, configured to extract one or more original face sub-images from an image to be processed;
    an image combining module, configured to combine the one or more original face sub-images based on an input image size of a deep neural network to generate an original face combined image;
    a beauty processing module, configured to process the original face combined image by using the deep neural network and output a beautified face combined image;
    an image fusion module, configured to obtain a target beautified image corresponding to the image to be processed according to the beautified face combined image and the image to be processed.
  19. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1 to 17.
  20. An electronic device, characterized by comprising:
    a processor; and
    a memory for storing executable instructions of the processor;
    wherein the processor is configured to perform the method according to any one of claims 1 to 17 by executing the executable instructions.
PCT/CN2022/076470 2021-03-29 2022-02-16 图像美颜处理方法、装置、存储介质与电子设备 WO2022206202A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110336102.0A CN113077397B (zh) 2021-03-29 2021-03-29 图像美颜处理方法、装置、存储介质与电子设备
CN202110336102.0 2021-03-29

Publications (1)

Publication Number Publication Date
WO2022206202A1 true WO2022206202A1 (zh) 2022-10-06

Family

ID=76611263

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/076470 WO2022206202A1 (zh) 2021-03-29 2022-02-16 图像美颜处理方法、装置、存储介质与电子设备

Country Status (2)

Country Link
CN (1) CN113077397B (zh)
WO (1) WO2022206202A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117333928A (zh) * 2023-12-01 2024-01-02 深圳市宗匠科技有限公司 一种人脸特征点检测方法、装置、电子设备及存储介质

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077397B (zh) * 2021-03-29 2024-05-17 Oppo广东移动通信有限公司 图像美颜处理方法、装置、存储介质与电子设备
CN113538274A (zh) * 2021-07-14 2021-10-22 Oppo广东移动通信有限公司 图像美颜处理方法、装置、存储介质与电子设备
CN113436245B (zh) * 2021-08-26 2021-12-03 武汉市聚芯微电子有限责任公司 图像处理方法、模型训练方法、相关装置及电子设备
CN115546858B (zh) * 2022-08-15 2023-08-25 荣耀终端有限公司 人脸图像处理方法和电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080267443A1 (en) * 2006-05-05 2008-10-30 Parham Aarabi Method, System and Computer Program Product for Automatic and Semi-Automatic Modification of Digital Images of Faces
CN112233041A (zh) * 2020-11-05 2021-01-15 Oppo广东移动通信有限公司 图像美颜处理方法、装置、存储介质与电子设备
CN113077397A (zh) * 2021-03-29 2021-07-06 Oppo广东移动通信有限公司 图像美颜处理方法、装置、存储介质与电子设备
CN113538274A (zh) * 2021-07-14 2021-10-22 Oppo广东移动通信有限公司 图像美颜处理方法、装置、存储介质与电子设备
CN114049278A (zh) * 2021-11-17 2022-02-15 Oppo广东移动通信有限公司 图像美颜处理方法、装置、存储介质与电子设备

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106412458A (zh) * 2015-07-31 2017-02-15 中兴通讯股份有限公司 一种图像处理方法和装置
CN106210521A (zh) * 2016-07-15 2016-12-07 深圳市金立通信设备有限公司 一种拍照方法及终端
CN107944414B (zh) * 2017-12-05 2021-03-02 Oppo广东移动通信有限公司 图像处理方法、装置、电子设备及计算机可读存储介质
CN108550117A (zh) * 2018-03-20 2018-09-18 维沃移动通信有限公司 一种图像处理方法、装置以及终端设备

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080267443A1 (en) * 2006-05-05 2008-10-30 Parham Aarabi Method, System and Computer Program Product for Automatic and Semi-Automatic Modification of Digital Images of Faces
CN112233041A (zh) * 2020-11-05 2021-01-15 Oppo广东移动通信有限公司 图像美颜处理方法、装置、存储介质与电子设备
CN113077397A (zh) * 2021-03-29 2021-07-06 Oppo广东移动通信有限公司 图像美颜处理方法、装置、存储介质与电子设备
CN113538274A (zh) * 2021-07-14 2021-10-22 Oppo广东移动通信有限公司 图像美颜处理方法、装置、存储介质与电子设备
CN114049278A (zh) * 2021-11-17 2022-02-15 Oppo广东移动通信有限公司 图像美颜处理方法、装置、存储介质与电子设备

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BOCHKOVSKIY ALEXEY, WANG CHIEN-YAO, LIAO HONG-YUAN MARK: "YOLOv4: Optimal Speed and Accuracy of Object Detection", 22 April 2020 (2020-04-22), XP055792857, Retrieved from the Internet <URL:https://arxiv.org/pdf/2004.10934.pdf> [retrieved on 20210406] *
WANG WEIJIE, YAO JIANTAO; ZHANG MINYAN; WANG MIN: "Intelligent garbage sorting and recycling robot based on YOLOV4", INTELLIGENT COMPUTER AND APPLICATIONS, vol. 10, no. 11, 1 November 2020 (2020-11-01), pages 182 - 186, XP055972411, ISSN: 2095-2163 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117333928A (zh) * 2023-12-01 2024-01-02 深圳市宗匠科技有限公司 一种人脸特征点检测方法、装置、电子设备及存储介质
CN117333928B (zh) * 2023-12-01 2024-03-22 深圳市宗匠科技有限公司 一种人脸特征点检测方法、装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN113077397B (zh) 2024-05-17
CN113077397A (zh) 2021-07-06

Similar Documents

Publication Publication Date Title
WO2022206202A1 (zh) 图像美颜处理方法、装置、存储介质与电子设备
CN112330574B (zh) 人像修复方法、装置、电子设备及计算机存储介质
CN111580765B (zh) 投屏方法、投屏装置、存储介质、被投屏设备与投屏设备
CN111598776B (zh) 图像处理方法、图像处理装置、存储介质与电子设备
WO2023284401A1 (zh) 图像美颜处理方法、装置、存储介质与电子设备
US9390478B2 (en) Real time skin smoothing image enhancement filter
CN111784614A (zh) 图像去噪方法及装置、存储介质和电子设备
CN114049278A (zh) 图像美颜处理方法、装置、存储介质与电子设备
CN111696039B (zh) 图像处理方法及装置、存储介质和电子设备
CN113409203A (zh) 图像模糊程度确定方法、数据集构建方法与去模糊方法
CN113902611A (zh) 图像美颜处理方法、装置、存储介质与电子设备
WO2021258530A1 (zh) 图像分辨率处理方法、装置、设备及可读存储介质
CN114331918A (zh) 图像增强模型的训练方法、图像增强方法及电子设备
WO2024032331A9 (zh) 图像处理方法及装置、电子设备、存储介质
CN115205164B (zh) 图像处理模型的训练方法、视频处理方法、装置及设备
CN114972096A (zh) 人脸图像优化方法及装置、存储介质及电子设备
CN114565532A (zh) 视频美颜处理方法、装置、存储介质与电子设备
CN115330633A (zh) 图像色调映射方法及装置、电子设备、存储介质
CN113781336B (zh) 图像处理的方法、装置、电子设备与存储介质
CN113364964B (zh) 图像处理方法、图像处理装置、存储介质与终端设备
CN114359100A (zh) 图像色彩增强方法、装置、存储介质与电子设备
CN112233041B (zh) 图像美颜处理方法、装置、存储介质与电子设备
CN113409209A (zh) 图像去模糊方法、装置、电子设备与存储介质
CN115082349A (zh) 图像美颜处理方法、装置、存储介质与电子设备
CN112233041A (zh) 图像美颜处理方法、装置、存储介质与电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22778387

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22778387

Country of ref document: EP

Kind code of ref document: A1