WO2023168667A1 - Image processing method and apparatus, neural network training method, and storage medium - Google Patents

Image processing method and apparatus, neural network training method, and storage medium Download PDF

Info

Publication number
WO2023168667A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
processed
video
target
mapping relationship
Prior art date
Application number
PCT/CN2022/080213
Other languages
English (en)
French (fr)
Inventor
应礼剑
李志强
徐斌
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 filed Critical 深圳市大疆创新科技有限公司
Priority to PCT/CN2022/080213 priority Critical patent/WO2023168667A1/zh
Publication of WO2023168667A1 publication Critical patent/WO2023168667A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/40 Analysis of texture

Definitions

  • The present application relates to the field of image processing technology, and specifically to an image processing method and apparatus, a neural network training method, and a storage medium.
  • In some scenarios, users need to convert a certain attribute of one image into the corresponding attribute of another image.
  • Taking style conversion as an example, after seeing images of a certain style taken by others, a user may wish to convert his or her own images into images of that style.
  • Some technologies can only convert an image into an image of a specific style and cannot perform arbitrary style conversion quickly and in real time.
  • Other technologies can use image pairs as input to train a neural network, so that the trained network converts the style of one frame of an input image pair into the style of the other frame, but such technologies place very high demands on device performance and cannot be used on ordinary terminal devices.
  • In view of this, the present application provides an image processing method, apparatus, and storage medium.
  • According to a first aspect, an image processing method is provided, including:
  • in response to a user's trigger instruction, acquiring an image to be processed and a reference image of the image to be processed; and processing the image to be processed according to at least one preset initial color mapping relationship and a weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on a user interaction interface, wherein a target attribute of the target image is consistent with the target attribute of the reference image, and the weights are determined based on the image to be processed and the reference image.
  • According to a second aspect, a training method for a generative adversarial network is provided, including:
  • acquiring a sample image pair, the sample image pair including a third image and a fourth image; inputting the third image and the fourth image into a generator of the generative adversarial network to obtain a weight corresponding to each preset initial color mapping relationship; and processing the third image based on the weights and each preset color mapping relationship to obtain a fifth image whose target attribute matches the target attribute of the fourth image;
  • constructing a target loss based on the discrimination results of a discriminator of the generative adversarial network for the fourth image and the fifth image, and training the generative adversarial network based on the target loss.
  • According to a third aspect, an image processing apparatus is provided, including a processor, a memory, and a computer program stored in the memory and executable by the processor.
  • When the processor executes the computer program, the following steps are performed:
  • in response to a user's trigger instruction, acquiring an image to be processed and a reference image of the image to be processed; and processing the image to be processed according to at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on a user interaction interface, wherein the target attribute of the target image is consistent with the target attribute of the reference image, and the weights are determined based on the image to be processed and the reference image.
  • According to a fourth aspect, a computer-readable storage medium is provided.
  • A computer program is stored on the computer-readable storage medium.
  • When the computer program is executed, the methods of the first and/or second aspects above are implemented.
  • When converting the target attribute of an image, at least one initial color mapping relationship can be set in advance.
  • Each initial color mapping relationship can be used to convert the target attribute of an image into a certain specific attribute; the weight corresponding to each initial color mapping relationship can then be determined based on the image to be processed and the reference image, and the image to be processed is processed according to the at least one initial color mapping relationship and the corresponding weights to obtain a target image whose target attribute is consistent with that of the reference image.
  • In this way, the target attribute of the image to be processed can be quickly converted into the target attribute of any reference image; and because the weights corresponding to the preset initial color mapping relationships are determined from the image to be processed and the reference image, after which the image to be processed is processed with the preset relationships and weights to obtain the target image, the target image does not need to be generated directly by a neural network, which greatly reduces the amount of computation and allows the method to be deployed on ordinary terminal devices such as mobile phones and computers.
  • Figure 1 is a flow chart of an image processing method according to an embodiment of the present application.
  • Figure 2 is a schematic diagram of a user importing images to be processed and reference images according to an embodiment of the present application.
  • Figure 3 is a schematic diagram of style conversion of images or videos displayed on a user interaction interface according to an embodiment of the present application.
  • Figure 4 is a schematic diagram of style conversion of collected images or videos according to an embodiment of the present application.
  • Figure 5 is a schematic diagram of a video in which target attributes change in a specific manner, collected according to an embodiment of the present application.
  • Figure 6 is a flow chart of a training method for a generative adversarial network according to an embodiment of the present application.
  • Figure 7 is a schematic diagram of a training method for a generative adversarial network according to an embodiment of the present application.
  • Figure 8 is a schematic diagram of using a generative adversarial network to convert the style of an image according to an embodiment of the present application.
  • Figure 9 is a schematic diagram of the logical structure of an image processing device according to an embodiment of the present application.
  • In some scenarios, users need to quickly convert a certain attribute of one image into the corresponding attribute of another image. For example, when users see excellent image works in photography communities or on public accounts, they may want to convert the style of the images they took into the style of such a work, or a user may want to convert the dynamic range of one image into the dynamic range of another image.
  • In the following, the image on which the user wishes to perform attribute conversion is called the image to be processed, and the image the user takes as a reference, into whose attribute the attribute of the image to be processed is to be converted, is called the reference image.
  • At present, some technologies can only convert the image to be processed into an image with specific attributes. Taking style conversion as an example, several style templates are usually set in advance; when performing style conversion on the image to be processed, only one or more of the preset style templates can be selected, and the style of the image to be processed is converted into the selected style. This approach can only convert the image to be processed into a specific preset style and cannot achieve arbitrary style conversion of the image to be processed.
  • To address this, embodiments of the present application provide an image processing method.
  • At least one initial color mapping relationship can be set in advance, where each initial color mapping relationship can be used to convert the target attribute of an image into a certain specific attribute (for example, each initial color mapping relationship is used to convert an image into an image of a specific style).
  • The weight corresponding to each initial color mapping relationship can then be determined based on the image to be processed and the reference image, and the image to be processed is processed according to the at least one initial color mapping relationship and the corresponding weights to obtain a target image, so that the target attribute of the processed target image is consistent with the target attribute of the reference image.
  • In this way, the target attribute of the image to be processed can be quickly converted into the target attribute of any reference image; and because the weights corresponding to the preset initial color mapping relationships are determined from the image to be processed and the reference image before the image to be processed is processed with the preset relationships and weights, the target image does not need to be generated directly by a neural network, which greatly reduces the amount of computation.
  • The image processing method provided by the embodiments of the present application can be executed by any electronic device that has the function of converting a certain attribute of one image into the corresponding attribute of another image.
  • The electronic device can be a mobile phone, tablet, computer, handheld gimbal, drone, server, and so on.
  • In some scenarios, this method can be executed by designated image processing software, and any device with that image processing software installed can implement the method.
  • Alternatively, a designated functional service may be integrated when the device leaves the factory, and that functional service executes the above image processing method.
  • The image processing method provided by the embodiments of the present disclosure may include the following steps. In step S102, when the user wants to convert the target attribute of one frame of image into the target attribute of another frame of image, the user can issue a trigger instruction, and the device executing the method can then acquire the image to be processed and the reference image of the image to be processed.
  • The triggering method of the trigger instruction can be set flexibly. For example, it can be triggered by the user clicking a designated control on the user interaction interface, or by prompt information such as a specific voice command, gesture, or action of the user.
  • The image to be processed and the reference image can be separate images or video frames in a video.
  • The image to be processed may be an image or a video frame in a video that the user has just captured with a camera, or an image or a video frame in a video imported by the user.
  • The reference image may be an image or a video frame in a video imported by the user, or a preset default image or a video frame in a preset video; the embodiments of this application do not limit this.
  • S104: Process the image to be processed according to at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on the user interaction interface, wherein the target attribute of the target image is consistent with the target attribute of the reference image, and the weights are determined based on the image to be processed and the reference image.
  • In step S104, after the image to be processed and the reference image are acquired, the weight corresponding to each of the at least one preset initial color mapping relationship can be determined based on the image to be processed and the reference image, and the image to be processed is then processed with the at least one initial color mapping relationship and the weights to obtain a target image whose target attribute is consistent with that of the reference image.
  • Each initial color mapping relationship can convert the target attribute of the image to be processed into a certain specific attribute.
  • The weights are used to adjust and correct each initial color mapping relationship, or to adjust and correct the image mapped with each initial color mapping relationship, so that a target image whose target attribute is consistent with that of the reference image can be obtained.
  • Various methods can be used when determining the weight corresponding to each initial color mapping relationship based on the image to be processed and the reference image. For example, certain algorithms can be used to analyze and compare the target attributes of the image to be processed and the reference image, and the weight corresponding to each initial color mapping relationship is determined from the characteristics of the two.
  • Alternatively, a neural network can be trained in advance, the image to be processed and the reference image are input into the pre-trained neural network, and the weight corresponding to each initial color mapping relationship is determined by the neural network.
  • Taking the target attribute being image style as an example, suppose there are three initial color mapping relationships (initial color mapping relationship 1, initial color mapping relationship 2, and initial color mapping relationship 3): mapping an image with relationship 1 yields an image whose style tends roughly toward style A, mapping with relationship 2 yields an image tending toward style B, and mapping with relationship 3 yields an image tending toward style C.
  • Therefore, based on the styles of the image to be processed and the reference image, the weight corresponding to each initial color mapping relationship can be determined in real time, and the image to be processed is then processed based on the weights and the initial color mapping relationships
  • to obtain an image whose style is consistent with that of the reference image.
  • the target image can be displayed on the user interaction interface for the user to view.
  • In the embodiments of the present application, the weights corresponding to the preset initial color mapping relationships are determined in real time based on the characteristics of the image to be processed and the reference image, and the image to be processed is then processed according to the weights and the preset initial color mapping relationships to obtain a target image whose target attribute is consistent with that of the reference image.
  • In this way, the target attribute of the image to be processed can be quickly converted to be consistent with the target attribute of any reference image.
  • Because only the weights of the preset initial color mapping relationships need to be determined, after which the image to be processed is processed based on the weights and the preset relationships, the amount of computation can be greatly reduced compared with generating the target image directly through a neural network, so the method can also be deployed on general terminal devices.
  • The target attribute in the embodiments of the present application may be the style of the image: for example, a style related to the colors of the image, such as its brightness, contrast, or color vividness, or the overall style of the image, such as a cartoon style, comic style, or sketch style.
  • The target attribute may also be the dynamic range of the image. For example, a low-dynamic-range image may be converted into a high-dynamic-range image.
  • The target attribute may also be the style of the characters in the image, such as the age attribute of a character.
  • The initial color mapping relationship can be characterized in any way used to represent the conversion relationship between the pixel values of two frames of images, for example by a mapping table or a mapping curve.
  • Each initial color mapping relationship may be represented by an N-dimensional lookup table, where N is a positive integer.
  • For example, the initial color mapping relationship can be characterized by a 1D-LUT (one-dimensional lookup table), a 2D-LUT (two-dimensional lookup table), a 3D-LUT (three-dimensional lookup table), or a 4D-LUT (four-dimensional lookup table).
  • There can be only one initial color mapping relationship or multiple ones, which can be set according to the actual situation. For example, taking the case where each initial color mapping relationship is represented by a 3D-LUT, a single 3D-LUT or multiple 3D-LUTs can be set.
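  • As an illustration of what such a lookup table does, the following is a minimal sketch of applying one 3D-LUT to an RGB image. The LUT size of 33 bins per axis and the nearest-neighbor sampling are assumptions made for brevity (they are not specified in the patent); production LUT pipelines typically use trilinear interpolation.

```python
import numpy as np

def apply_3d_lut(image: np.ndarray, lut: np.ndarray) -> np.ndarray:
    """image: HxWx3 floats in [0, 1]; lut: SxSxSx3 grid of output colors."""
    size = lut.shape[0]
    # Scale pixel values onto the LUT grid and snap to the nearest bin.
    idx = np.clip(np.rint(image * (size - 1)).astype(int), 0, size - 1)
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]

# Identity LUT: every input color maps to itself.
size = 33
grid = np.linspace(0.0, 1.0, size)
r, g, b = np.meshgrid(grid, grid, grid, indexing="ij")
identity_lut = np.stack([r, g, b], axis=-1)

image = np.random.rand(4, 4, 3)
out = apply_3d_lut(image, identity_lut)
assert np.allclose(out, image, atol=0.5 / (size - 1))  # within half a bin
```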
  • The image to be processed can be a separate image or a video frame in a video,
  • and the reference image can likewise be a separate image or a video frame in a video.
  • In some implementations, the image to be processed and the reference image are each a single frame of image, and the user can convert the target attribute of the image to be processed into the target attribute of the reference image.
  • The image to be processed may be video frames in a video,
  • and the reference image may be a single frame of image.
  • For example, the image to be processed can be multiple video frames in a video whose reference image is the same frame of image.
  • In this case, the user can convert the target attributes of all video frames in a video to be consistent with the target attribute of a certain reference image. For example, if the image to be processed is video A, the style of each video frame in video A can be converted into the style of reference image R.
  • The image to be processed may be video frames in a video,
  • and the reference image may be multiple frames of images.
  • In this case, the images to be processed are multiple video frames in the video,
  • the reference images of the multiple video frames are multiple frames of first images,
  • and each frame of first image serves as the reference image of one or more video frames. That is, the user can convert the target attributes of different video frames in a video into the target attributes of different reference images. For example, if the image to be processed is video A, the style of some video frames in video A can be converted into the style of reference image R1, and the style of other video frames into the style of reference image R2.
  • Both the image to be processed and the reference image may be video frames in a video.
  • The images to be processed may be first video frames in a first video,
  • and the reference images may be second video frames in a second video.
  • Each second video frame may serve as the reference image of one or more first video frames.
  • For example, if the images to be processed are the video frames in video A,
  • the reference images can be the video frames in video B,
  • and after style conversion the styles of the video frames in video A can correspond one-to-one to the styles of the video frames in video B.
  • Both the image to be processed and the reference image can be imported by the user.
  • For example, a control can be set on the user interaction interface.
  • After the user triggers the control (the "style conversion" control in the figure),
  • the user can be prompted to import the image to be processed and the reference image; the user can select a path
  • and choose images or videos from a designated storage location as the image to be processed and the reference image respectively. One of the images or videos imported by the user is then used as the image to be processed, and another imported image or video is used as the reference image.
  • The user can also directly edit an image or video displayed on the user interaction interface to convert its target attribute.
  • For example, the user can open an image or video so that it is displayed on the user interaction interface.
  • The user interaction interface can also include controls for editing the image or video.
  • After the user triggers a designated control (such as the "style conversion" control in the figure), the user can be prompted to import a reference image.
  • The user can select a path and choose an image or video from a designated storage location as the reference image; the image displayed on the user interaction interface, or the video frames of the displayed video, are then acquired as the image to be processed, and the image imported by the user, or the video frames of the imported video, are acquired as the reference image.
  • After the user issues the trigger instruction, the camera can also be called directly to capture an image or video.
  • After the capture is complete, the video frames in the image or video captured by the camera can be acquired as the images to be processed,
  • and the image imported by the user, or the video frames of the imported video, serve as the reference image.
  • For example, the user interaction interface can include a specific functional control (for example, the "style conversion" control in the figure) whose function is to convert the style of the image or video captured by the user into the style of the image or video imported by the user.
  • When the user clicks this control, the camera can be automatically called to capture the image or video.
  • After the capture is complete, the user can be prompted to import the reference image, so that the image or video finally presented to the user is the style-converted image or video.
  • The reference image can also be a preset second image with a specific attribute.
  • After the user issues the trigger instruction, the camera can be called to capture a video.
  • After the capture is complete, the video frames in the captured video are used as the images to be processed, and the pre-stored second image is acquired as the reference image, so that the target attribute of the image or video captured by the user can be automatically converted into the specific attribute.
  • The second image may be one frame or multiple frames.
  • The second image may include multiple frames of images in which the target attribute changes in a predetermined manner.
  • For example, the second image may include multiple frames whose style gradually changes in a certain manner, so that when these frames are used as reference images to process a captured video,
  • the target attribute of the video frames in the processed video also changes in the predetermined manner.
  • A dedicated functional control (such as the style conversion control in the figure) can be set on the user interaction interface.
  • This functional control can be used to capture videos that change in a specific manner.
  • After the user triggers it, the camera can be automatically called to capture a video, and pre-stored multiple frames of images that change in a certain manner are used as reference images (reference image 1, reference image 2, and reference image 3 in the figure) to convert the target attribute of the captured video. For example, the target attributes of video frames 1-10 can be converted into the target attribute of reference image 1, those of video frames 11-20 into that of reference image 2, and those of video frames 21-30 into that of reference image 3, so that the target attribute of the final video changes in the same manner.
  • The target attribute may be an image style,
  • and the target attribute of the video frames in the processed video changing in a predetermined manner may be the image style of those video frames changing with the alternation of seasons.
  • For example, the second image can be four pre-stored frames whose styles transition from spring to winter.
  • The captured video can be processed using the above four frames as reference images, automatically converting, by position in the video, the front portion of the video frames into a spring style, the middle portions into summer and autumn styles, and the last portion into a winter style, so that the style of the video frames in the processed video transitions from spring to winter.
  • The target attribute may be an image style,
  • and the image style of the video frames in the processed video may change with the alternation of day and night.
  • For example, the second image can be pre-stored multiple frames whose style transitions from morning to evening. After the video is captured, these frames can be used as reference images to process it, so that the style of the video frames in the processed video transitions from morning to evening.
  • The target attribute may be the style of the characters in the image, and the character style of the video frames in the processed video changes with age.
  • For example, the second image can be pre-stored multiple frames in which a character transitions from childhood to old age. After a video of a character is captured, these frames can be used as reference images to process it, so that the age of the character in the video frames of the processed video transitions from childhood to old age.
  • When the video frames of a captured video are used as the images to be processed and pre-stored second images are acquired as the reference images, the captured video can be divided into multiple sub-videos.
  • The number of sub-videos is consistent with the number of second images, and each sub-video corresponds to one frame of second image.
  • For each sub-video, the video frames in the sub-video are used as the images to be processed, and the second image corresponding to the sub-video is used as the reference image of each video frame in that sub-video.
  • For example, suppose the reference images include images of the four styles "spring", "summer", "autumn", and "winter". After video A is acquired, it can be divided into four sub-videos {sub-video A1, sub-video A2, sub-video A3, sub-video A4}, where the video frames in sub-video A1 use the "spring" image as the reference image, those in sub-video A2 the "summer" image, those in sub-video A3 the "autumn" image, and those in sub-video A4 the "winter" image. Style conversion is then performed on the video frames of the four sub-videos respectively, so that the effect presented by the video frames of the video finally shown to the user gradually transitions from spring to winter, showing the alternation of the four seasons. A minimal sketch of this segment-to-reference assignment follows.
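  • The sketch below assumes equal-length sub-videos; the frame and reference objects are placeholders, and only the index arithmetic reflects the scheme described above.

```python
def assign_references(frames, references):
    """Pair each video frame with the reference image of its sub-video."""
    n_segments = len(references)
    seg_len = -(-len(frames) // n_segments)  # ceiling division
    return [(frame, references[min(i // seg_len, n_segments - 1)])
            for i, frame in enumerate(frames)]

frames = list(range(40))  # stand-ins for the captured video frames
refs = ["spring", "summer", "autumn", "winter"]
pairs = assign_references(frames, refs)
assert pairs[0][1] == "spring" and pairs[-1][1] == "winter"
```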
  • When the image to be processed is processed based on at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain the target image, the image to be processed may first be mapped with each of the at least one preset initial color mapping relationship
  • to obtain the image mapped with each initial color mapping relationship; the image mapped with each initial color mapping relationship is then processed based on the corresponding weight, and the processed images are fused to obtain the target image.
  • For example, assuming there are three initial color mapping relationships (mapping relationship 1, mapping relationship 2, and mapping relationship 3), the image to be processed can first be mapped with the three relationships to obtain three frames of images with different target attributes, and the three frames are then weighted and fused using the weights corresponding to the respective relationships, so that the target attribute of the final image is consistent with that of the reference image.
  • The weight can be a single value or a weight matrix. That is, for an image obtained by mapping, the weight corresponding to each pixel can be the same or different: each pixel may correspond to one weight, each pixel block may correspond to one weight, or the whole mapped frame may correspond to one weight. A sketch of this map-then-fuse strategy follows.
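  • The following minimal sketch of the first strategy assumes the weights are already available (e.g., produced by the network described later); simple gamma curves stand in for real 3D-LUTs, and both the scalar and the per-pixel weight variants are shown.

```python
import numpy as np

def fuse_mapped_images(image, mappings, weights):
    """image: HxWx3 in [0, 1]; mappings: list of callables; weights: list of
    scalars or HxWx1 arrays (per-pixel weight maps) summing to 1."""
    mapped = [m(image) for m in mappings]  # one mapped image per relationship
    return sum(w * img for w, img in zip(weights, mapped))

# Simple gamma curves stand in for the preset color mapping relationships.
mappings = [lambda x: x ** 0.8, lambda x: x, lambda x: x ** 1.4]
image = np.random.rand(4, 4, 3)

# Variant 1: one scalar weight per mapping for the whole frame.
out = fuse_mapped_images(image, mappings, [0.2, 0.5, 0.3])

# Variant 2: per-pixel weight maps, so each pixel blends the mappings
# differently (the "weight matrix" case described above).
w = np.random.rand(3, 4, 4, 1)
w /= w.sum(axis=0)  # normalize across the three mappings at each pixel
out = fuse_mapped_images(image, mappings, list(w))
assert out.shape == image.shape
```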
  • Alternatively, when the image to be processed is processed based on at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain the target image, a target color mapping relationship may first be determined based on the at least one preset initial
  • color mapping relationship and the weight corresponding to each initial color mapping relationship.
  • The target color mapping relationship is used to convert the target attribute of the image to be processed into the target attribute of the reference image; the image to be processed is then mapped with the target color mapping relationship to obtain the target image.
  • For example, assuming there are three initial color mapping relationships (mapping relationship 1, mapping relationship 2, and mapping relationship 3), a target color mapping relationship can first be obtained from the three initial color mapping relationships and the weights corresponding to each of them. This target color mapping relationship can convert the target attribute of the image to be processed into that of the reference image, and the image to be processed is then mapped with it to obtain a target image whose target attribute is consistent with that of the reference image, as sketched below.
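  • A minimal sketch of this second strategy: because a 3D-LUT is simply a grid of output colors, a weighted sum of the preset LUT grids yields a single target LUT that can be applied to the image in one pass. Scalar weights are assumed here (per-pixel weight maps would instead require the image-space fusion shown above), and the toy brighten/darken presets are illustrative only.

```python
import numpy as np

def combine_luts(luts, weights):
    """luts: list of SxSxSx3 arrays; weights: scalars summing to 1."""
    return sum(w * lut for w, lut in zip(weights, luts))

# Identity LUT as a base: every input color maps to itself.
size = 17
grid = np.linspace(0.0, 1.0, size)
r, g, b = np.meshgrid(grid, grid, grid, indexing="ij")
identity = np.stack([r, g, b], axis=-1)

# Two toy presets: a brightened and a darkened variant of the identity LUT
# (real presets would stay clipped to [0, 1]).
luts = [identity + 0.1, identity - 0.1]
target_lut = combine_luts(luts, [0.5, 0.5])
assert np.allclose(target_lut, identity)  # equal weights cancel the shifts
```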
  • When determining the weight corresponding to each initial color mapping relationship based on the image to be processed and the reference image, feature extraction can be performed on the image to be processed and the reference image respectively, and the weights are then determined from the extracted features of the image to be processed
  • and the features of the reference image.
  • For example, the features of the image to be processed and the reference image can be extracted by feature extraction networks, and the weight corresponding to each initial color mapping relationship is determined based on the extracted features.
  • the step of determining the weight of each initial color mapping relationship based on the image to be processed and the reference image is performed by a pre-trained generative adversarial network.
  • the image to be processed and the reference image can be input into a pre-trained generative adversarial network, and the generative adversarial network can output the weight corresponding to each initial color mapping relationship.
  • The generative adversarial network can be trained as follows. A large number of sample image pairs can be acquired, each sample image pair including a third image and a fourth image whose target attributes differ, where the third image may be an image requiring target attribute conversion and the fourth image may be the reference image of the third image; that is, the target attribute of the third image needs to be converted by the generative adversarial network into the target attribute of the fourth image.
  • The third image and the fourth image can then be input into the generator of the generative adversarial network to obtain the preset weight corresponding to each initial color mapping relationship; the third image is processed based on the weight corresponding to each initial color mapping relationship and each preset color mapping relationship to obtain a fifth image whose target attribute matches the target attribute of the fourth image. The discriminator of the generative adversarial network then judges the fourth image and the fifth image, determining which of the two is the generated image and which is the original real image; a target loss is constructed based on the discrimination results, and the generative adversarial network can be trained based on the target loss.
  • Besides the discrimination results of the discriminator for the fourth and fifth images, the target loss can also include constraint terms derived from the conditions that must hold when mapping images with a mapping relationship. For example, during mapping, the monotonicity of the brightness of the mapped image must be preserved: pixels that are brighter in the image before mapping should also remain brighter in the mapped image. Based on this principle, a constraint term can be added to the target loss to ensure that the final output target image satisfies this condition. In addition, the colors of the mapped image need to remain smooth to avoid color banding, so another constraint term can be added to the target loss to ensure a smooth color transition in the final target image.
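  • The patent does not give the exact formulation of these two constraint terms, so the following PyTorch sketch is an assumption: it penalizes any decrease of the LUT output along each input-color axis (monotonicity) and large jumps between adjacent LUT entries (smoothness), written as regularizers on a fused 3D-LUT tensor.

```python
import torch

def monotonicity_loss(lut: torch.Tensor) -> torch.Tensor:
    """Penalize any decrease of the LUT output along each input-color axis."""
    loss = lut.new_zeros(())
    for axis in range(3):                       # R, G, B input axes
        diff = lut.diff(dim=axis)               # change between adjacent bins
        loss = loss + torch.relu(-diff).mean()  # only decreases are penalized
    return loss

def smoothness_loss(lut: torch.Tensor) -> torch.Tensor:
    """Total-variation-style penalty on jumps between adjacent LUT entries."""
    loss = lut.new_zeros(())
    for axis in range(3):
        loss = loss + lut.diff(dim=axis).pow(2).mean()
    return loss

lut = torch.rand(17, 17, 17, 3, requires_grad=True)
constraint = monotonicity_loss(lut) + 0.1 * smoothness_loss(lut)
constraint.backward()  # would be added to the adversarial target loss
```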
  • This embodiment of the present application also provides a training method for a generative adversarial network. As shown in Figure 6, the method may include the following steps: acquiring a sample image pair including a third image and a fourth image; inputting the third image and the fourth image into a generator of the generative adversarial network to obtain a weight corresponding to each preset initial color mapping relationship; processing the third image based on the weights and each preset color mapping relationship to obtain a fifth image whose target attribute matches that of the fourth image; and constructing a target loss based on the discrimination results of a discriminator for the fourth and fifth images, and training the network based on the target loss.
  • In a specific embodiment, a generative adversarial network is pre-trained so that the trained network can convert the style of one of two input frames into the style of the other frame. Specifically, this includes the following two stages: a training stage and an application stage.
  • In the training stage, the generative adversarial network consists of two parts, the generator and the discriminator.
  • The generator can be composed of a general-purpose feature extraction network.
  • For example, the feature extraction network can be LeNet, AlexNet, VGG, GoogleNet, ResNet, DenseNet, or a similar network.
  • The discriminator can be a binary classification network.
  • One or more 3D-LUTs can be preset, where each 3D-LUT is a lookup table used to map an image into an image of a specific style.
  • When training the generative adversarial network, a large number of sample image pairs can be acquired; each sample image pair includes two frames of images with different styles.
  • The two frames include a third image on which style conversion is to be performed, and a fourth image used as the reference image.
  • The sample image pairs can then be input into the generator, which performs feature extraction on the two frames and determines the weight corresponding to each 3D-LUT based on the extracted features.
  • Each preset 3D-LUT can be used to map the third image separately to obtain the mapped images, and the weights corresponding to the respective 3D-LUTs are then used to perform weighted fusion on the mapped images to obtain the fifth image.
  • The fourth image and the fifth image can then be input into the discriminator, which determines whether each of them is a real image or a generated image. A target loss is constructed based on the determination results, and the network parameters of the generator are adjusted based on the target loss to train the generative adversarial network, as sketched below.
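  • The following PyTorch sketch shows one training step under stated assumptions. The patent fixes only the roles of the components (a generator that sees the image pair and emits one weight per preset 3D-LUT, weighted fusion of the LUT-mapped third image into the fifth image, and a binary real/generated discriminator); the backbone shapes, LUT count, learning rates, nearest-neighbor LUT sampling, and all identifiers here (Generator, apply_luts, etc.) are illustrative choices. The constraint terms shown earlier would be added to the generator loss in practice.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 3  # number of preset 3D-LUTs (assumed)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        # Stand-in backbone; a VGG/ResNet-style feature extractor would be
        # used in practice, fed the image pair as a 6-channel input.
        self.features = nn.Sequential(
            nn.Conv2d(6, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(16, K)

    def forward(self, x3, x4):
        h = self.features(torch.cat([x3, x4], dim=1)).flatten(1)
        return F.softmax(self.head(h), dim=1)  # one weight per LUT

class Discriminator(nn.Module):  # binary real/generated classifier
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))

    def forward(self, x):
        return self.net(x)

def apply_luts(image, luts):
    """Nearest-neighbor mapping of a Bx3xHxW image through each SxSxSx3 LUT."""
    size = luts[0].shape[0]
    idx = (image.clamp(0, 1) * (size - 1)).round().long()
    return [lut[idx[:, 0], idx[:, 1], idx[:, 2]].permute(0, 3, 1, 2)
            for lut in luts]

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
luts = [torch.rand(17, 17, 17, 3) for _ in range(K)]  # toy preset LUTs
x3 = torch.rand(2, 3, 64, 64)  # third images (to be converted)
x4 = torch.rand(2, 3, 64, 64)  # fourth images (style references)

# Generator step: fuse the LUT-mapped third image into the fifth image and
# ask the discriminator to find it realistic.
w = G(x3, x4)                                   # (B, K)
mapped = apply_luts(x3, luts)
x5 = sum(w[:, k, None, None, None] * mapped[k] for k in range(K))
g_loss = F.binary_cross_entropy_with_logits(D(x5), torch.ones(2, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# Discriminator step: the fourth image is real, the fifth is generated.
d_loss = (F.binary_cross_entropy_with_logits(D(x4), torch.ones(2, 1)) +
          F.binary_cross_entropy_with_logits(D(x5.detach()), torch.zeros(2, 1)))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()
```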
  • After training, the generative adversarial network can be used to convert the style of one frame of image into the style of another frame, as shown in Figure 8.
  • The image to be processed and the reference image can be input into the generator of the generative adversarial network, and the generator determines the weight corresponding to each 3D-LUT based on the image to be processed and the reference image.
  • Each 3D-LUT is then used to map the image to be processed separately to obtain the mapped images, and the weights corresponding to the respective 3D-LUTs are used to perform weighted fusion on the mapped images to obtain the final target image.
  • The style of the target image is consistent with the style of the reference image.
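  • Continuing the training sketch above, inference reuses only the generator and the fixed preset LUTs; the discriminator is not needed at this stage, and the input tensors are placeholders.

```python
# Placeholders for the user's image and reference (Bx3xHxW, values in [0, 1]).
image_to_process = torch.rand(1, 3, 64, 64)
reference_image = torch.rand(1, 3, 64, 64)

with torch.no_grad():
    w = G(image_to_process, reference_image)   # (1, K): one weight per LUT
    mapped = apply_luts(image_to_process, luts)
    target_image = sum(w[:, k, None, None, None] * mapped[k]
                       for k in range(K))      # style matches the reference
```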
  • In the embodiments of the present application, the generative adversarial network determines the weights corresponding to the preset initial 3D-LUTs needed to convert the style of the image to be processed into the style of the reference image, and the images mapped with each initial 3D-LUT are then fused based on these weights to obtain a target image whose style is consistent with that of the reference image.
  • In this way, the style of one frame of image can be quickly converted into the style of any other frame of image.
  • In the related art, the generative adversarial network is used directly to generate the style-converted target image, which is relatively complex, computationally heavy, and demanding on device performance.
  • This application instead uses the generative adversarial network to output the weights corresponding to the preset initial 3D-LUTs and processes the image to be processed with the initial 3D-LUTs and the weights to obtain the target image, which can greatly reduce the amount of computation, making the method deployable on general terminal devices as well.
  • The embodiment of the present application also provides an image processing device.
  • The device 90 includes a processor 91, a memory 92, and a computer program stored in the memory 92 and executable by the processor 91. When the processor 91 executes the computer program, the following steps can be implemented:
  • in response to a user's trigger instruction, acquiring an image to be processed and a reference image of the image to be processed; and processing the image to be processed according to at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on the user interaction interface, wherein the target attribute of the target image is consistent with the target attribute of the reference image, and the weights are determined based on the image to be processed and the reference image.
  • In some embodiments, the image to be processed is multiple video frames in a video, and the reference image of the multiple video frames is the same frame of image; or
  • the image to be processed is multiple video frames in a video, the reference images of the multiple video frames are multiple frames of first images, and each frame of first image serves as the reference image of one or more video frames; or
  • the image to be processed is first video frames in a first video,
  • the reference images of the first video frames are second video frames in a second video,
  • and each second video frame serves as the reference image of one or more first video frames.
  • When the processor is used to acquire an image to be processed and a reference image of the image to be processed in response to a user's trigger instruction, it is specifically used to:
  • acquire an image or video imported by the user as the image to be processed, and acquire an image or video imported by the user as the reference image.
  • Alternatively, when the processor is used to acquire an image to be processed and a reference image of the image to be processed in response to a user's trigger instruction, it is specifically used to:
  • acquire the image displayed on the user interaction interface, or the video frames of the video displayed there, as the image to be processed, and acquire the image imported by the user, or
  • the video frames of the imported video, as the reference image.
  • Alternatively, when the processor is used to acquire an image to be processed and a reference image of the image to be processed in response to a user's trigger instruction, it is specifically used to:
  • call a camera to capture an image or video, acquire the video frames in the image or video captured by the camera as the images to be processed, and acquire the video frames in the image or video imported by the user as the reference images.
  • Alternatively, when the processor is used to acquire an image to be processed and a reference image of the image to be processed in response to a user's trigger instruction, it is specifically used to:
  • call a camera to capture a video, use the video frames in the captured video as the images to be processed, and acquire the pre-stored second image as the reference image.
  • In some embodiments, the second image includes multiple frames of images in which the target attribute changes in a predetermined manner, such that the target attribute of the video frames in the processed video changes in the predetermined manner.
  • The target attribute of the video frames in the processed video changing in a predetermined manner includes:
  • the target attribute is an image style, and the image style of the video frames in the processed video changes with the alternation of seasons; or
  • the target attribute is an image style, and the image style of the video frames in the processed video changes with the alternation of day and night; or
  • the target attribute is the style of the characters in the image, and the character style of the video frames in the processed video changes with age.
  • When the processor is used to take each video frame in the captured video as the image to be processed and to acquire a pre-stored second image as the reference image, it is specifically used to: divide the captured video into multiple sub-videos, where the number of sub-videos is consistent with the number of second images and each sub-video corresponds to one frame of second image;
  • and, for each sub-video, use the video frames in the sub-video as the images to be processed and the second image corresponding to the sub-video as the reference image.
  • When the processor is configured to process the image to be processed based on at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain the target image, it is specifically configured to:
  • map the image to be processed with each of the at least one preset initial color mapping relationship to obtain the image mapped with each initial color mapping relationship; process the image mapped with each initial color mapping relationship based on the corresponding weight, and fuse the processed images to obtain the target image; or
  • determine a target color mapping relationship based on the at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship, where the target color mapping relationship is used to convert the target attribute of the image to be processed into the target attribute of the reference image, and map the image to be processed with the target color mapping relationship to obtain the target image.
  • the target attributes include one or more of the following: image style, dynamic range of the image, and style of the characters in the image.
  • each initial color mapping relationship is represented by an N-dimensional lookup table, where N is a positive integer.
  • The weights being determined based on the image to be processed and the reference image includes: performing feature extraction on the image to be processed and the reference image respectively;
  • and determining the weights based on the extracted features.
  • The step of determining the weights based on the image to be processed and the reference image is performed by a pre-trained generative adversarial network.
  • The generative adversarial network is trained as follows:
  • acquiring a sample image pair, the sample image pair including a third image and a fourth image; inputting the third image and the fourth image into the generator of the generative adversarial network to obtain the weight corresponding to each preset initial color mapping relationship; and processing the third image based on the weights and each preset color mapping relationship to obtain a fifth image whose target attribute matches the target attribute of the fourth image;
  • constructing a target loss based on the discrimination results of the discriminator of the generative adversarial network for the fourth image and the fifth image, and training the generative adversarial network based on the target loss.
  • embodiments of this specification also provide a computer storage medium, the storage medium stores a program, and when the program is executed by a processor, the method in any of the above embodiments is implemented.
  • Embodiments of the present description may take the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having program code embodied therein.
  • Storage media available for computers include permanent and non-permanent, removable and non-removable media, and can be implemented by any method or technology to store information.
  • Information may be computer-readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to: phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, Magnetic tape cassettes, tape magnetic disk storage or other magnetic storage devices or any other non-transmission medium can be used to store information that can be accessed by a computing device.
  • As for the device embodiment, since it basically corresponds to the method embodiment, reference can be made to the partial description of the method embodiment for relevant details.
  • The device embodiments described above are only illustrative.
  • The units described as separate components may or may not be physically separated,
  • and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Persons of ordinary skill in the art can understand and implement this without creative effort.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An image processing method and apparatus, a neural network training method, and a storage medium. The image processing method includes: in response to a user's trigger instruction, acquiring an image to be processed and a reference image of the image to be processed; and processing the image to be processed according to at least one preset initial color mapping relationship and a weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on a user interaction interface, wherein a target attribute of the target image is consistent with the target attribute of the reference image, and the weights are determined based on the image to be processed and the reference image. In this way, the target attribute of the image to be processed can be quickly converted to be consistent with the target attribute of any reference image, and the amount of computation can be greatly reduced, so that the method can also be deployed on ordinary terminal devices.

Description

Image processing method and apparatus, neural network training method, and storage medium
Technical Field
The present application relates to the field of image processing technology, and specifically to an image processing method and apparatus, a neural network training method, and a storage medium.
Background
In some scenarios, users need to convert a certain attribute of one image into the corresponding attribute of another image. Taking the conversion of image style as an example, after seeing images of a certain style taken by others, a user may wish to convert his or her own images into images of that style. At present, when converting the style of an image, some technologies can only convert the image into an image of a specific style and cannot perform arbitrary style conversion quickly and in real time. Other technologies can take image pairs as input to train a neural network, so that the trained network converts the style of one frame of an input image pair into the style of the other frame, but such technologies place very high demands on device performance and cannot be used on ordinary terminal devices.
Summary
In view of this, the present application provides an image processing method, apparatus, and storage medium.
According to a first aspect of the present application, an image processing method is provided, the method including:
in response to a user's trigger instruction, acquiring an image to be processed and a reference image of the image to be processed; and
processing the image to be processed according to at least one preset initial color mapping relationship and a weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on a user interaction interface, wherein a target attribute of the target image is consistent with the target attribute of the reference image, and the weights are determined based on the image to be processed and the reference image.
According to a second aspect of the present application, a training method for a generative adversarial network is provided, the method including:
acquiring a sample image pair, the sample image pair including a third image and a fourth image; inputting the third image and the fourth image into a generator of the generative adversarial network to obtain a weight corresponding to each preset initial color mapping relationship;
processing the third image based on the weights and each preset color mapping relationship to obtain a fifth image whose target attribute matches the target attribute of the fourth image; and
constructing a target loss based on the discrimination results of a discriminator of the generative adversarial network for the fourth image and the fifth image, and training the generative adversarial network based on the target loss.
According to a third aspect of the present application, an image processing apparatus is provided, the apparatus including a processor, a memory, and a computer program stored in the memory and executable by the processor, where the processor, when executing the computer program, implements the following steps:
in response to a user's trigger instruction, acquiring an image to be processed and a reference image of the image to be processed; and
processing the image to be processed according to at least one preset initial color mapping relationship and a weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on a user interaction interface, wherein the target attribute of the target image is consistent with the target attribute of the reference image, and the weights are determined based on the image to be processed and the reference image.
According to a fourth aspect of the present application, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed, the methods mentioned in the first and/or second aspects above are implemented.
With the solution provided by the present application, when converting the target attribute of an image, at least one initial color mapping relationship can be set in advance, where each initial color mapping relationship can be used to convert the target attribute of an image into a certain specific attribute. The weight corresponding to each initial color mapping relationship can then be determined based on the image to be processed and the reference image, and the image to be processed is processed according to the at least one initial color mapping relationship and the corresponding weights to obtain a target image whose target attribute is consistent with that of the reference image. In this way, the target attribute of the image to be processed can be quickly converted into the target attribute of any reference image. Moreover, because the weights corresponding to the preset initial color mapping relationships are determined from the image to be processed and the reference image, and the target image is then obtained by processing the image to be processed with the preset initial color mapping relationships and weights, the target image does not need to be generated directly by a neural network, which greatly reduces the amount of computation and allows the method to be deployed on terminal devices of ordinary performance such as mobile phones and computers.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flowchart of an image processing method according to an embodiment of the present application.
Figure 2 is a schematic diagram of a user importing an image to be processed and a reference image according to an embodiment of the present application.
Figure 3 is a schematic diagram of style conversion of an image or video displayed on a user interaction interface according to an embodiment of the present application.
Figure 4 is a schematic diagram of style conversion of a captured image or video according to an embodiment of the present application.
Figure 5 is a schematic diagram of capturing a video whose target attribute changes in a specific manner according to an embodiment of the present application.
Figure 6 is a flowchart of a training method for a generative adversarial network according to an embodiment of the present application.
Figure 7 is a schematic diagram of a training method for a generative adversarial network according to an embodiment of the present application.
Figure 8 is a schematic diagram of using a generative adversarial network to convert the style of an image according to an embodiment of the present application.
Figure 9 is a schematic diagram of the logical structure of an image processing apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below in conjunction with the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
In some scenarios, users have the need to quickly convert a certain attribute of one image into the corresponding attribute of another image. For example, when seeing excellent image works in photography communities or on public accounts, a user may wish to convert the style of the images he or she has taken into the style of such a work, or a user may wish to convert the dynamic range of one image into the dynamic range of another image. In the following, the image on which the user wishes to perform attribute conversion is called the image to be processed, and the image that the user takes as a reference during attribute conversion, into whose attribute the attribute of the image to be processed is to be converted, is called the reference image.
At present, when converting the attributes of an image, some technologies can only convert the image to be processed into an image with specific attributes. Taking style conversion as an example, several style templates are usually set in advance; when performing style conversion on the image to be processed, only one or more of the preset style templates can be selected, and the style of the image to be processed is converted into the selected style. This approach can only convert the image to be processed into a specific preset style and cannot achieve arbitrary style conversion of the image to be processed. Other technologies can use a large number of sample image pairs to train a generative adversarial network, then input the image to be processed and the reference image into the network, which generates a frame whose style is consistent with that of the reference image. However, this approach requires the generative adversarial network to directly generate the style-converted image, which is relatively complex, places high demands on device performance, and cannot be deployed on ordinary terminal devices.
On this basis, embodiments of the present application provide an image processing method. When converting the target attribute of an image, at least one initial color mapping relationship can be set in advance, where each initial color mapping relationship can be used to convert the target attribute of an image into a certain specific attribute (for example, each initial color mapping relationship is used to convert an image into an image of a specific style). The weight corresponding to each initial color mapping relationship can then be determined based on the image to be processed and the reference image, and the image to be processed is processed according to the at least one initial color mapping relationship and the corresponding weights to obtain a target image whose target attribute is consistent with the target attribute of the reference image. In this way, the target attribute of the image to be processed can be quickly converted into the target attribute of any reference image; and because the weights corresponding to the preset initial color mapping relationships are determined from the image to be processed and the reference image, after which the image to be processed is processed with the preset initial color mapping relationships and weights to obtain the target image, the target image does not need to be generated directly by a neural network, which greatly reduces the amount of computation and allows the method to be deployed on terminal devices of ordinary performance such as mobile phones and computers.
The image processing method provided by the embodiments of the present application can be executed by any electronic device that has the function of converting a certain attribute of one image into the corresponding attribute of another image; the electronic device may be a mobile phone, tablet, computer, handheld gimbal, drone, server, and so on. For example, in some scenarios the method may be executed by designated image processing software, and any device with that software installed can implement the method. In some embodiments, a designated functional service may also be integrated when the device leaves the factory, and that service executes the above image processing method.
Specifically, as shown in Figure 1, the image processing method provided by the embodiments of the present disclosure may include the following steps:
S102: In response to a user's trigger instruction, acquire an image to be processed and a reference image of the image to be processed.
In step S102, when the user wants to convert the target attribute of one frame of image into the target attribute of another frame of image, the user can issue a trigger instruction, and the device executing the method can then acquire the image to be processed and its reference image. The triggering method of the trigger instruction can be set flexibly; for example, it can be triggered by the user clicking a designated control on the user interaction interface, or by prompt information such as a specific voice command, gesture, or action of the user. The image to be processed and the reference image can be separate images or video frames in a video. The image to be processed may be an image or a video frame in a video just captured by the user with a camera, or an image or a video frame in a video imported by the user. The reference image may be an image or a video frame in a video imported by the user, or a preset default image or a video frame in a preset video; the embodiments of this application do not limit this.
S104: Process the image to be processed according to at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on a user interaction interface, wherein the target attribute of the target image is consistent with the target attribute of the reference image, and the weights are determined based on the image to be processed and the reference image.
In step S104, after the image to be processed and the reference image are acquired, the weight corresponding to each of the at least one preset initial color mapping relationship can be determined based on the image to be processed and the reference image, and the image to be processed is then processed with the at least one initial color mapping relationship and the weights to obtain a target image whose target attribute is consistent with that of the reference image. Each initial color mapping relationship can convert the target attribute of the image to be processed into a certain specific attribute; by analyzing and comparing the characteristics of the image to be processed and the reference image, the weight corresponding to each initial color mapping relationship can be determined, so that the weights can be used to adjust and correct each initial color mapping relationship, or to adjust and correct the images mapped with each initial color mapping relationship, thereby obtaining a target image whose target attribute is consistent with that of the reference image.
Various methods can be used to determine the weight corresponding to each initial color mapping relationship based on the image to be processed and the reference image. For example, certain algorithms can be used to analyze and compare the target attributes of the two images and determine the weights from their characteristics. Alternatively, a neural network can be trained in advance, the image to be processed and the reference image are input into the pre-trained neural network, and the weight corresponding to each initial color mapping relationship is determined by the neural network.
Taking the target attribute being image style as an example, suppose there are three initial color mapping relationships (initial color mapping relationship 1, initial color mapping relationship 2, and initial color mapping relationship 3). Mapping an image with relationship 1 yields an image whose style tends roughly toward style A, mapping with relationship 2 yields an image tending toward style B, and mapping with relationship 3 yields an image tending toward style C. Therefore, based on the styles of the image to be processed and the reference image, the weight each initial color mapping relationship requires to convert the style of the former into the style of the latter can be determined in real time, and the image to be processed is then processed with the weights and the initial color mapping relationships to obtain an image whose style is consistent with that of the reference image. After the target image is obtained, it can be displayed on the user interaction interface for the user to view.
In the embodiments of the present application, the weights corresponding to the preset initial color mapping relationships are determined in real time based on the characteristics of the image to be processed and the reference image, and the image to be processed is then processed according to the weights and the preset initial color mapping relationships to obtain a target image whose target attribute is consistent with that of the reference image. In this way, the target attribute of the image to be processed can be quickly converted to be consistent with that of any reference image. Moreover, since only the weights of the preset initial color mapping relationships need to be determined before processing the image to be processed with the weights and the preset relationships, the amount of computation is greatly reduced compared with generating the target image directly through a neural network, so the method can also be deployed on ordinary terminal devices.
In some embodiments, the target attribute in the embodiments of the present application may be the style of the image: for example, a style related to the colors of the image, such as brightness, contrast, or color vividness, or the overall style of the image, such as a cartoon style, comic style, or sketch style. In some embodiments, the target attribute may also be the dynamic range of the image; for example, a low-dynamic-range image may be converted into a high-dynamic-range image. In some embodiments, the target attribute may also be the style of the characters in the image, such as the age attribute of a character.
The initial color mapping relationships can be characterized in any way used to represent the conversion relationship between the pixel values of two frames of images, for example by a mapping table or a mapping curve. In some embodiments, each initial color mapping relationship is represented by an N-dimensional lookup table, where N is a positive integer. For example, the initial color mapping relationship can be characterized by a 1D-LUT (one-dimensional lookup table), 2D-LUT, 3D-LUT, or 4D-LUT. There may be only one initial color mapping relationship or multiple ones, which can be set according to the actual situation; for example, taking the case where each initial color mapping relationship is represented by a 3D-LUT, a single 3D-LUT or multiple 3D-LUTs can be set.
The image to be processed can be an image or a video frame in a video, and the reference image can likewise be an image or a video frame in a video. In some implementations, the image to be processed and the reference image are each a single frame of image, and the user can convert the target attribute of the image to be processed into the target attribute of the reference image.
In some embodiments, the image to be processed may be video frames in a video and the reference image may be one frame of image. For example, the image to be processed may be multiple video frames in a video whose reference image is the same frame, so the user can convert the target attributes of all video frames in a video to be consistent with the target attribute of a certain reference image. For example, if the image to be processed is video A, the style of each video frame in video A can be converted into the style of reference image R.
In some embodiments, the image to be processed may be video frames in a video and the reference image may be multiple frames of images. The image to be processed is multiple video frames in a video, the reference images of these video frames are multiple first images, and each frame of first image serves as the reference image of one or more video frames; that is, the user can convert the target attributes of different video frames in a video into the target attributes of different reference images. For example, if the image to be processed is video A, the style of some video frames in video A can be converted into the style of reference image R1 and that of other video frames into the style of reference image R2.
In some embodiments, both the image to be processed and the reference image may be video frames: the image to be processed may be first video frames in a first video, the reference images are second video frames in a second video, and each second video frame can serve as the reference image of one or more first video frames. For example, if the images to be processed are the video frames in video A, the reference images can be the video frames in video B, and after style conversion the styles of the video frames in video A can correspond one-to-one to the styles of the video frames in video B.
In some implementations, both the image to be processed and the reference image can be imported by the user. For example, as shown in Figure 2, a control can be set on the user interaction interface; after the user triggers the control (the "style conversion" control in the figure), the user can be prompted to import the image to be processed and the reference image. The user can select a path and choose images or videos from a designated storage location as the image to be processed and the reference image respectively; one of the imported images or videos is then used as the image to be processed and another imported image or video as the reference image.
In some embodiments, the user can also directly edit an image or video displayed on the user interaction interface to convert its target attribute. For example, as shown in Figure 3, the user can open an image or video so that it is displayed on the user interaction interface, which can also include controls for editing it. After the user triggers a designated control (such as the "style conversion" control in the figure), the user can be prompted to import a reference image; the user can select a path and choose an image or video from a designated storage location as the reference image. The image displayed on the user interaction interface, or the video frames of the displayed video, are then acquired as the image to be processed, and the image imported by the user, or the video frames of the imported video, are acquired as the reference image.
In some embodiments, after the user issues the trigger instruction, the camera can also be called directly to capture an image or video; after the capture is complete, the video frames in the image or video captured by the camera can be acquired as the images to be processed, and the video frames in the image or video imported by the user as the reference image. For example, as shown in Figure 4, the user interaction interface can include a specific functional control (for example, the "style conversion" control in the figure) whose function is to convert the style of the image or video captured by the user into the style of the image or video imported by the user. When the user clicks the control, the camera is automatically called to capture an image or video; after the capture is complete, the user can be prompted to import the reference image, so that the image or video finally presented to the user is the style-converted one.
In some implementations, the reference image can also be a preset second image with a specific attribute. After the user issues the trigger instruction, the camera can be called to capture a video; after the capture is complete, the video frames in the captured video are used as the images to be processed and the pre-stored second image is acquired as the reference image, so that the target attribute of the image or video captured by the user can be automatically converted into the specific attribute. The second image may be one frame or multiple frames.
In some embodiments, the second image includes multiple frames of images whose target attribute changes in a predetermined manner; for example, the second image may include multiple frames whose style gradually changes in a certain manner, so that after the captured video is processed with these frames as reference images, the target attribute of the video frames in the processed video also changes in the predetermined manner. For example, as shown in Figure 5, a dedicated functional control (such as the style conversion control in the figure) can be set on the user interaction interface for capturing videos that change in a specific manner. After the user triggers the control, the camera is automatically called to capture a video, and pre-stored multiple frames of images that change in a certain manner (reference image 1, reference image 2, and reference image 3 in the figure) are used as reference images to convert the target attribute of the captured video. For example, the target attributes of video frames 1-10 can be converted into the target attribute of reference image 1, those of video frames 11-20 into that of reference image 2, and those of video frames 21-30 into that of reference image 3, so that the target attribute of the resulting video changes in the same manner.
In some embodiments, the target attribute may be the image style, and the target attribute of the video frames in the processed video changing in a predetermined manner may be the image style of those video frames changing with the alternation of seasons. For example, the second image may be four pre-stored frames whose styles transition from spring to winter. After the video is captured, it can be processed with these four frames as reference images, automatically converting the front portion of the video frames into a spring style, the middle portions into summer and autumn styles, and the last portion into a winter style, so that the style of the video frames in the processed video transitions from spring to winter.
In some embodiments, the target attribute may be the image style, and the image style of the video frames in the processed video may change with the alternation of day and night. For example, the second image may be pre-stored multiple frames whose style transitions from morning to evening; after the video is captured, these frames are used as reference images to process it, so that the style of the video frames in the processed video transitions from morning to evening.
In some embodiments, the target attribute may be the style of the characters in the image, and the character style of the video frames in the processed video changes with age. For example, the second image may be pre-stored multiple frames in which a character transitions from childhood to old age; after a video of a character is captured, these frames are used as reference images to process it, so that the age of the character in the video frames of the processed video transitions from childhood to old age.
In some embodiments, when the video frames of a captured video are used as the images to be processed and pre-stored second images are acquired as the reference images, the captured video can be divided into multiple sub-videos, where the number of sub-videos is consistent with the number of second images and each sub-video corresponds to one frame of second image; for each sub-video, the video frames in the sub-video are used as the images to be processed and the second image corresponding to the sub-video is used as the reference image of the video frames in that sub-video. For example, suppose the reference images include images of the four styles "spring, summer, autumn, winter". After video A is acquired, it can be divided into four sub-videos {sub-video A1, sub-video A2, sub-video A3, sub-video A4}, where the video frames in sub-video A1 use the "spring" image as the reference image, those in sub-video A2 the "summer" image, those in sub-video A3 the "autumn" image, and those in sub-video A4 the "winter" image; style conversion is then performed on the video frames of the four sub-videos respectively, so that the effect presented by the video frames of the video finally shown to the user gradually transitions from spring to winter, showing the alternation of the four seasons.
在一些实施例中,在基于预设的至少一种初始色彩映射关系,以及每种初始色彩映射关系对应的权重对待处理图像进行处理,得到目标图像时,可以先基于预设的至少一种初始色彩映射关系分别对待处理图像进行映射处理,得到利用每种初始色彩映射关系映射后的图像,然后基于每种初始色彩映射关系对应的权重对利用该种初始色彩映射关系映射后的图像进行处理,并对处理后的图像进行融合,得到目标图像。比如,假设有三种初始色彩映射关系(映射关系1、映射关系2、映射关系3),可以先利用上述三种初始色彩映射关系对待处理图像进行映射处理,得到目标属性各不相同的三帧图像,然后利用各初始色彩映射关系对应的权重对该三帧图像进行加权融合处理,使得最终得到的图像的目标属性和参考图像的目标属性一致。
其中,权重可以是一个数值,也可以是一个权重矩阵。即针对映射处理得到的图像,其每个像素点对应的权重可以一样,也可以不一样。比如,可以是每个像素点对应一个权重,或者是每个像素块对应一个权重,或者是整帧映射处理得到的图像对应一个权重。
在一些实施例中,在基于预设的至少一种初始色彩映射关系,以及每种初始色彩映射关系对应的权重对待处理图像进行处理,得到目标图像时,可以先基于预设的至少一种初始色彩映射关系,以及每种初始色彩映射关系对应的权重,确定目标色彩映射关系,该目标色彩映射关系用于将待处理图像的目标属性转换成参考图像的目标属性,然后利用目标色彩映射关系对待处理图像进行映射处理,得到目标图像。比如,假设有三种初始色彩映射关系(映射关系1、映射关系2、映射关系3),可以先利用上述三种初始色彩映射关系以及各初始色彩映射关系对应的权重得到一个目标色彩映射关系,该目标色彩映射关系可以用于将待处理图像的目标属性转换成参考图像的目标属性,然后再利用目标色彩映射关系对待处理图像进行映射处理,得到目标属性和参考图像的目标属性一致的目标图像。
在一些实施例中,在基于待处理图像和参考图像确定每种初始色彩映射关系对应的权重时,可以分别对待处理图像和参考图像进行特征提取,然后根据提取得到的待处理图像的特征和参考图像的特征确定每种初始色彩映射关系对应的权重。比如,可以通过一些特征提取网络提取待处理图像和参考图像的特征,基于提取到的特征确定每种初始色彩映射关系对应的权重。
在一些实施例中,基于待处理图像和参考图像确定每种初始色彩映射关系的权重的步骤由预先训练的生成对抗网络执行。比如,可以将待处理图像和参考图像输入到预先训练的生成对抗网络中,该生成对抗网络即可以输出每种初始色彩映射关系对应的权重。
在一些实施例中,该生成对抗网络可以基于以下方式训练得到,可以获取大量的样本图像对,每个样本图像对包括第三图像和第四图像,两种图像的目标属性不同,其中,第三图像可以是需要进行目标属性转换的图像,第四图像可以是该第三图像的参考图像,即需要通过生成对抗网络将第四图像的目标属性转换成第三图像的目标属性。然后可以将第三图像和第四图像输入到生成对抗网络的生成器中,得到预设的每种初始色彩映射 关系对应的权重,然后可以基于每种初始色彩映射关系对应的权重、预设的每种色彩映射关系对第三图像进行处理,得到目标属性和第四图像的目标属性相匹配的第五图像,然后可以基于生成对抗网络的判别器对第四图像和第五图像进行判断,判断第四图像和第五图像中哪个是生成的图像,哪个是原本真实的图像,并基于判别结果构建目标损失,然后可以基于目标损失对该生成对抗网络进行训练。
在一些实施例中,在构建目标损失时,除了可以根据生成对抗网络的判别器对第四图像和第五图像的判别结果构建,还可以根据利用映射关系对图像进行映射过程中需遵循的条件在目标损失中加入一些约束项,比如,在对图像进行映射的过程中,需保证映射后的图像的亮度的单调性,即映射前的图像中的亮度更大的像素点在映射后的图像中的亮度也要保持更大,基于这个原则可以在目标损失中加入约束项,保证最后输出得到目标图像可以满足上述条件。其次,映射后的图像的颜色也需保持平滑,避免色彩出现断层的问题,因而在目标损失中也可以加入约束项,通过该约束项保证最后得到的目标图像颜色平滑过渡。
Further, an embodiment of the present application also provides a training method for a generative adversarial network. As shown in FIG. 6, the method may include the following steps:

S602: acquiring a sample image pair, the sample image pair including a third image and a fourth image; inputting the third image and the fourth image into a generator of the generative adversarial network to obtain a weight corresponding to each preset initial color mapping relationship;

S604: processing the third image based on the weights and each preset color mapping relationship to obtain a fifth image whose target attribute matches the target attribute of the fourth image;

S606: constructing a target loss based on a discrimination result of a discriminator of the generative adversarial network on the fourth image and the fifth image, and training the generative adversarial network based on the target loss.

For the specific details of the training process of the generative adversarial network, reference may be made to the description in the above embodiments, which will not be repeated here.

To further explain the image processing method provided by the embodiments of the present application, a specific embodiment is described below.

In this embodiment, a generative adversarial network is pre-trained so that the trained network can convert the style of one of two input image frames into the style of the other. This specifically includes the following two stages:

1. Training stage of the generative adversarial network

FIG. 7 is a schematic diagram of the training process of the generative adversarial network. The generative adversarial network includes two parts, a generator and a discriminator. The generator may be composed of a general-purpose feature extraction network such as LeNet, AlexNet, VGG, GoogLeNet, ResNet or DenseNet, and the discriminator may be a binary classification network.

One or more 3D-luts may be preset, where each 3D-lut is a lookup table used to map an image into an image of a specific style.
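For illustration only, the following sketch applies a 3D-lut to an RGB image with a nearest-neighbor lookup; a practical implementation would more likely use trilinear interpolation, and the normalization of pixel values to [0, 1] is an assumption:

```python
import numpy as np

def apply_3d_lut(img: np.ndarray, lut: np.ndarray) -> np.ndarray:
    """Map an (H, W, 3) RGB image with values in [0, 1] through a 3D-lut
    of shape (S, S, S, 3) by rounding each pixel to its nearest grid
    point and reading the stored output color."""
    s = lut.shape[0]
    idx = np.clip(np.rint(img * (s - 1)).astype(int), 0, s - 1)
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]
```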
When training the generative adversarial network, a large number of sample image pairs may be acquired, each pair including two frames of different styles: a third image whose style is to be converted and a fourth image serving as the reference. The sample image pair may then be input into the generator, which performs feature extraction on the two frames and determines the weight corresponding to each 3D-lut based on the extracted features. The third image may then be mapped by each preset 3D-lut to obtain mapped images, and the mapped images are weighted and fused using the weight corresponding to each 3D-lut to obtain the fifth image. The fourth image and the fifth image may then be input into the discriminator, which judges whether the fourth image and the fifth image are real or generated; a target loss is constructed based on the judgment, and the network parameters of the generator may then be adjusted based on the target loss to train the generative adversarial network.
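A condensed PyTorch sketch of one such training step; the binary cross-entropy losses and the optimizer handling are conventional GAN choices assumed here for illustration and are not mandated by the disclosure:

```python
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, mapped_images, img3, img4,
               g_opt, d_opt):
    """`mapped_images` is a list of K tensors of shape (B, C, H, W): the
    third image already mapped by each preset 3D-lut. The generator
    predicts K weights, the weighted fusion yields the fifth image, and
    generator and discriminator are updated adversarially."""
    weights = generator(img3, img4)                       # (B, K)
    mapped = torch.stack(mapped_images, dim=1)            # (B, K, C, H, W)
    img5 = (weights[:, :, None, None, None] * mapped).sum(dim=1)

    # Discriminator: tell the real reference from the generated image.
    d_real = discriminator(img4)
    d_fake = discriminator(img5.detach())
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator: adjust its parameters so the fifth image looks real.
    g_loss = F.binary_cross_entropy_with_logits(
        discriminator(img5), torch.ones_like(d_real))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```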
2. Application stage of the generative adversarial network

After the generative adversarial network is trained, it can be used to convert the style of one image frame into the style of another, as shown in FIG. 8. For example, after the image to be processed and its reference image are acquired, they may be input into the generator of the generative adversarial network, which determines the weight corresponding to each 3D-lut based on the two images. Each 3D-lut then maps the image to be processed to obtain mapped images, and the mapped images are weighted and fused using the weight corresponding to each 3D-lut to obtain the final target image, whose style is consistent with that of the reference image.

In the embodiments of the present application, the generative adversarial network determines the weights corresponding to the preset initial 3D-luts for converting the style of the image to be processed into the style of the reference image, and the images mapped by the initial 3D-luts are then fused based on these weights to obtain a target image whose style is consistent with that of the reference image. The style of one image frame can thus be quickly converted into the style of any other image frame. In the related art, a generative adversarial network directly generates the style-converted target image, which is complex, computationally heavy and places high demands on device performance; the present application instead uses the generative adversarial network to output the weight corresponding to each preset initial 3D-lut and processes the image to be processed with the initial 3D-luts and the weights to obtain the target image, which greatly reduces the amount of computation and allows the method to be deployed on ordinary terminal devices.

In addition, an embodiment of the present application also provides an image processing device. As shown in FIG. 9, the device 90 includes a processor 91, a memory 92, and a computer program stored in the memory 92 and executable by the processor 91. When executing the computer program, the processor 91 can implement the following steps:

in response to a trigger instruction of a user, acquiring an image to be processed and a reference image of the image to be processed;

processing the image to be processed according to at least one preset initial color mapping relationship and a weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on a user interaction interface; wherein a target attribute of the target image is consistent with the target attribute of the reference image, and the weight is determined based on the image to be processed and the reference image.
In some embodiments, the image to be processed comprises multiple video frames of a video, and the reference image of the multiple video frames is one and the same image frame; or

the image to be processed comprises multiple video frames of a video, the reference images of the multiple video frames are multiple frames of first images, and each first image serves as the reference image of one or more of the video frames; or

the image to be processed is a first video frame of a first video, the reference image of the first video frame is a second video frame of a second video, and each second video frame serves as the reference image of one or more first video frames.

In some embodiments, when acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed, the processor is specifically configured to:

in response to the trigger instruction of the user, acquire an image or video imported by the user as the image to be processed, and acquire an image or video imported by the user as the reference image.

In some embodiments, when acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed, the processor is specifically configured to:

in a case where an image or video is displayed on the user interaction interface, in response to the trigger instruction of the user, acquire the image, or a video frame of the video, displayed on the user interaction interface as the image to be processed, and acquire an image, or a video frame of a video, imported by the user as the reference image.

In some embodiments, when acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed, the processor is specifically configured to:

in response to the trigger instruction of the user, call a camera to capture an image or video;

after the capture of the image or video is completed, acquire the image, or a video frame of the video, captured by the camera as the image to be processed, and acquire an image, or a video frame of a video, imported by the user as the reference image.

In some embodiments, when acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed, the processor is specifically configured to:

in response to the trigger instruction of the user, call a camera to capture a video;

after the capture of the video is completed, take the video frames of the captured video as the image to be processed, and acquire a pre-stored second image as the reference image.
In some embodiments, the second image comprises multiple frames whose target attribute changes in a predetermined manner, so that the target attribute of the video frames in the processed video changes in the predetermined manner.

In some embodiments, the target attribute of the video frames in the processed video changing in a predetermined manner comprises:

the target attribute being image style, and the image style of the video frames in the processed video changing with the alternation of the seasons; or

the target attribute being image style, and the image style of the video frames in the processed video changing with the alternation of day and night; or

the target attribute being the style of a person in the image, and the person style of the video frames in the processed video changing with increasing age.

In some embodiments, when taking the video frames of the captured video as the image to be processed and acquiring a pre-stored second image as the reference image, the processor is specifically configured to:

divide the captured video into multiple sub-videos, the number of sub-videos being equal to the number of second images, and each sub-video corresponding to one frame of the second image;

for each sub-video, take the video frames of the sub-video as the image to be processed, and take the second image corresponding to the sub-video as the reference image.
In some embodiments, when processing the image to be processed based on at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image, the processor is specifically configured to:

map the image to be processed based on each of the at least one preset initial color mapping relationship to obtain an image mapped by each initial color mapping relationship; process the image mapped by each initial color mapping relationship based on the weight corresponding to that initial color mapping relationship, and fuse the processed images to obtain the target image; or

determine a target color mapping relationship based on the at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship, the target color mapping relationship being used to convert the target attribute of the image to be processed into the target attribute of the reference image; and map the image to be processed by the target color mapping relationship to obtain the target image.

In some embodiments, the target attribute comprises one or more of the following: image style, dynamic range of the image, and style of a person in the image.

In some embodiments, each initial color mapping relationship is characterized by an N-dimensional lookup table, where N is a positive integer.

In some embodiments, the weight being determined based on the image to be processed and the reference image comprises:

performing feature extraction on the image to be processed and the reference image respectively;

determining the weight based on the extracted features.

In some embodiments, the step of determining the weight based on the image to be processed and the reference image is performed by a pre-trained generative adversarial network.
In some embodiments, the generative adversarial network is trained in the following manner:

acquiring a sample image pair, the sample image pair comprising a third image and a fourth image;

inputting the third image and the fourth image into a generator of the generative adversarial network to obtain a weight corresponding to each preset initial color mapping relationship;

processing the third image based on the weights and each preset color mapping relationship to obtain a fifth image whose target attribute matches the target attribute of the fourth image;

constructing a target loss based on a discrimination result of a discriminator of the generative adversarial network on the fourth image and the fifth image, and training the generative adversarial network based on the target loss.
Correspondingly, an embodiment of the present specification also provides a computer storage medium storing a program which, when executed by a processor, implements the method of any of the above embodiments.

Embodiments of the present specification may take the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing program code. Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
As for the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the relevant descriptions of the method embodiments. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment, which can be understood and implemented by those of ordinary skill in the art without creative effort.

It should be noted that relational terms herein such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. The terms "comprise", "include" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device comprising that element.

The method and device provided by the embodiments of the present invention have been described in detail above, and specific examples have been used herein to illustrate the principles and implementations of the present invention. The description of the above embodiments is only intended to help understand the method of the present invention and its core idea; meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementation and scope of application according to the idea of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (32)

  1. An image processing method, characterized in that the method comprises:
    in response to a trigger instruction of a user, acquiring an image to be processed and a reference image of the image to be processed;
    processing the image to be processed according to at least one preset initial color mapping relationship and a weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on a user interaction interface; wherein a target attribute of the target image is consistent with the target attribute of the reference image, and the weight is determined based on the image to be processed and the reference image.
  2. The method according to claim 1, characterized in that the image to be processed comprises multiple video frames of a video, and the reference image of the multiple video frames is one and the same image frame; or
    the image to be processed comprises multiple video frames of a video, the reference images of the multiple video frames are multiple frames of first images, and each first image serves as the reference image of one or more of the video frames; or
    the image to be processed is a first video frame of a first video, the reference image of the first video frame is a second video frame of a second video, and each second video frame serves as the reference image of one or more first video frames.
  3. The method according to claim 1 or 2, characterized in that acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed comprises:
    in response to the trigger instruction of the user, acquiring an image or video imported by the user as the image to be processed, and acquiring an image or video imported by the user as the reference image.
  4. The method according to claim 1 or 2, characterized in that acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed comprises:
    in a case where an image or video is displayed on the user interaction interface, in response to the trigger instruction of the user, acquiring the image, or a video frame of the video, displayed on the user interaction interface as the image to be processed, and acquiring an image, or a video frame of a video, imported by the user as the reference image.
  5. The method according to claim 1 or 2, characterized in that acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed comprises:
    in response to the trigger instruction of the user, calling a camera to capture an image or video;
    after the capture of the image or video is completed, acquiring the image, or a video frame of the video, captured by the camera as the image to be processed, and acquiring an image, or a video frame of a video, imported by the user as the reference image.
  6. The method according to claim 1 or 2, characterized in that acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed comprises:
    in response to the trigger instruction of the user, calling a camera to capture a video;
    after the capture of the video is completed, taking the video frames of the captured video as the image to be processed, and acquiring a pre-stored second image as the reference image.
  7. The method according to claim 6, characterized in that the second image comprises multiple frames whose target attribute changes in a predetermined manner, so that the target attribute of the video frames in the processed video changes in the predetermined manner.
  8. The method according to claim 7, characterized in that the target attribute of the video frames in the processed video changing in a predetermined manner comprises:
    the target attribute being image style, and the image style of the video frames in the processed video changing with the alternation of the seasons; or
    the target attribute being image style, and the image style of the video frames in the processed video changing with the alternation of day and night; or
    the target attribute being the style of a person in the image, and the person style of the video frames in the processed video changing with increasing age.
  9. The method according to claim 7, characterized in that taking the video frames of the captured video as the image to be processed and acquiring a pre-stored second image as the reference image comprises:
    dividing the captured video into multiple sub-videos, the number of sub-videos being equal to the number of second images, and each sub-video corresponding to one frame of the second image;
    for each sub-video, taking the video frames of the sub-video as the image to be processed, and taking the second image corresponding to the sub-video as the reference image.
  10. The method according to any one of claims 1-9, characterized in that processing the image to be processed based on at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image comprises:
    mapping the image to be processed based on each of the at least one preset initial color mapping relationship to obtain an image mapped by each initial color mapping relationship; processing the image mapped by each initial color mapping relationship based on the weight corresponding to that initial color mapping relationship, and fusing the processed images to obtain the target image; or
    determining a target color mapping relationship based on the at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship, the target color mapping relationship being used to convert the target attribute of the image to be processed into the target attribute of the reference image; and mapping the image to be processed by the target color mapping relationship to obtain the target image.
  11. The method according to any one of claims 1-10, characterized in that the target attribute comprises one or more of the following: image style, dynamic range of the image, and style of a person in the image.
  12. The method according to any one of claims 1-11, characterized in that each initial color mapping relationship is characterized by an N-dimensional lookup table, where N is a positive integer.
  13. The method according to any one of claims 1-12, characterized in that the weight being determined based on the image to be processed and the reference image comprises:
    performing feature extraction on the image to be processed and the reference image respectively;
    determining the weight based on the extracted features.
  14. The method according to any one of claims 1-13, characterized in that the step of determining the weight based on the image to be processed and the reference image is performed by a pre-trained generative adversarial network.
  15. The method according to claim 14, characterized in that the generative adversarial network is trained in the following manner:
    acquiring a sample image pair, the sample image pair comprising a third image and a fourth image;
    inputting the third image and the fourth image into a generator of the generative adversarial network to obtain a weight corresponding to each preset initial color mapping relationship;
    processing the third image based on the weights and each preset color mapping relationship to obtain a fifth image whose target attribute matches the target attribute of the fourth image;
    constructing a target loss based on a discrimination result of a discriminator of the generative adversarial network on the fourth image and the fifth image, and training the generative adversarial network based on the target loss.
  16. A training method for a generative adversarial network, characterized in that the method comprises:
    acquiring a sample image pair, the sample image pair comprising a third image and a fourth image;
    inputting the third image and the fourth image into a generator of the generative adversarial network to obtain a weight corresponding to each preset initial color mapping relationship;
    processing the third image based on the weights and each preset color mapping relationship to obtain a fifth image whose target attribute matches the target attribute of the fourth image;
    constructing a target loss based on a discrimination result of a discriminator of the generative adversarial network on the fourth image and the fifth image, and training the generative adversarial network based on the target loss.
  17. An image processing device, characterized in that the device comprises a processor, a memory, and a computer program stored in the memory and executable by the processor, wherein when executing the computer program, the processor implements the following steps:
    in response to a trigger instruction of a user, acquiring an image to be processed and a reference image of the image to be processed;
    processing the image to be processed according to at least one preset initial color mapping relationship and a weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on a user interaction interface; wherein a target attribute of the target image is consistent with the target attribute of the reference image, and the weight is determined based on the image to be processed and the reference image.
  18. The device according to claim 17, characterized in that the image to be processed comprises multiple video frames of a video, and the reference image of the multiple video frames is one and the same image frame; or
    the image to be processed comprises multiple video frames of a video, the reference images of the multiple video frames are multiple frames of first images, and each first image serves as the reference image of one or more of the video frames; or
    the image to be processed is a first video frame of a first video, the reference image of the first video frame is a second video frame of a second video, and each second video frame serves as the reference image of one or more first video frames.
  19. The device according to claim 17 or 18, characterized in that, when acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed, the processor is specifically configured to:
    in response to the trigger instruction of the user, acquire an image or video imported by the user as the image to be processed, and acquire an image or video imported by the user as the reference image.
  20. The device according to claim 17 or 18, characterized in that, when acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed, the processor is specifically configured to:
    in a case where an image or video is displayed on the user interaction interface, in response to the trigger instruction of the user, acquire the image, or a video frame of the video, displayed on the user interaction interface as the image to be processed, and acquire an image, or a video frame of a video, imported by the user as the reference image.
  21. The device according to claim 17 or 18, characterized in that, when acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed, the processor is specifically configured to:
    in response to the trigger instruction of the user, call a camera to capture an image or video;
    after the capture of the image or video is completed, acquire the image, or a video frame of the video, captured by the camera as the image to be processed, and acquire an image, or a video frame of a video, imported by the user as the reference image.
  22. The device according to claim 17 or 18, characterized in that, when acquiring, in response to a trigger instruction of a user, an image to be processed and a reference image of the image to be processed, the processor is specifically configured to:
    in response to the trigger instruction of the user, call a camera to capture a video;
    after the capture of the video is completed, take the video frames of the captured video as the image to be processed, and acquire a pre-stored second image as the reference image.
  23. The device according to claim 22, characterized in that the second image comprises multiple frames whose target attribute changes in a predetermined manner, so that the target attribute of the video frames in the processed video changes in the predetermined manner.
  24. The device according to claim 23, characterized in that the target attribute of the video frames in the processed video changing in a predetermined manner comprises:
    the target attribute being image style, and the image style of the video frames in the processed video changing with the alternation of the seasons; or
    the target attribute being image style, and the image style of the video frames in the processed video changing with the alternation of day and night; or
    the target attribute being the style of a person in the image, and the person style of the video frames in the processed video changing with increasing age.
  25. The device according to claim 23, characterized in that, when taking the video frames of the captured video as the image to be processed and acquiring a pre-stored second image as the reference image, the processor is specifically configured to:
    divide the captured video into multiple sub-videos, the number of sub-videos being equal to the number of second images, and each sub-video corresponding to one frame of the second image;
    for each sub-video, take the video frames of the sub-video as the image to be processed, and take the second image corresponding to the sub-video as the reference image.
  26. The device according to any one of claims 17-25, characterized in that, when processing the image to be processed based on at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image, the processor is specifically configured to:
    map the image to be processed based on each of the at least one preset initial color mapping relationship to obtain an image mapped by each initial color mapping relationship; process the image mapped by each initial color mapping relationship based on the weight corresponding to that initial color mapping relationship, and fuse the processed images to obtain the target image; or
    determine a target color mapping relationship based on the at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship, the target color mapping relationship being used to convert the target attribute of the image to be processed into the target attribute of the reference image; and map the image to be processed by the target color mapping relationship to obtain the target image.
  27. The device according to any one of claims 17-26, characterized in that the target attribute comprises one or more of the following: image style, dynamic range of the image, and style of a person in the image.
  28. The device according to any one of claims 17-27, characterized in that each initial color mapping relationship is characterized by an N-dimensional lookup table, where N is a positive integer.
  29. The device according to any one of claims 17-28, characterized in that the weight being determined based on the image to be processed and the reference image comprises:
    performing feature extraction on the image to be processed and the reference image respectively;
    determining the weight based on the extracted features.
  30. The device according to any one of claims 17-29, characterized in that the step of determining the weight based on the image to be processed and the reference image is performed by a pre-trained generative adversarial network.
  31. The device according to claim 30, characterized in that the generative adversarial network is trained in the following manner:
    acquiring a sample image pair, the sample image pair comprising a third image and a fourth image;
    inputting the third image and the fourth image into a generator of the generative adversarial network to obtain a weight corresponding to each preset initial color mapping relationship;
    processing the third image based on the weights and each preset color mapping relationship to obtain a fifth image whose target attribute matches the target attribute of the fourth image;
    constructing a target loss based on a discrimination result of a discriminator of the generative adversarial network on the fourth image and the fifth image, and training the generative adversarial network based on the target loss.
  32. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when executed, the computer program implements the method of any one of claims 1-15 and/or claim 16.