WO2023168667A1 - Image processing apparatus and method, neural network training method, and storage medium - Google Patents

Image processing apparatus and method, neural network training method, and storage medium

Info

Publication number
WO2023168667A1
WO2023168667A1 · PCT/CN2022/080213
Authority
WO
WIPO (PCT)
Prior art keywords
image
processed
video
target
mapping relationship
Prior art date
Application number
PCT/CN2022/080213
Other languages
English (en)
Chinese (zh)
Inventor
应礼剑
李志强
徐斌
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2022/080213
Publication of WO2023168667A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/40 - Analysis of texture

Definitions

  • The present application relates to the field of image processing technology, and specifically to an image processing method, an image processing device, a neural network training method, and a storage medium.
  • Users sometimes need to convert a certain attribute of one image into the same attribute of another image.
  • Taking style conversion as an example, after seeing images of a certain style taken by others, a user may wish to convert his or her own images into images of that style.
  • Some technologies can only convert an image into an image of a specific preset style, and cannot quickly convert to an arbitrary style in real time.
  • Other technologies use image pairs as input to train a neural network so that the trained network can convert the style of one frame of the input pair into the style of the other frame; however, such techniques place very high demands on device performance and cannot be used on ordinary terminal devices.
  • this application provides an image processing method, device and storage medium.
  • an image processing method including:
  • in response to a trigger instruction from a user, an image to be processed and a reference image of the image to be processed are obtained;
  • the image to be processed is processed according to at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on the user interaction interface; wherein the target attribute of the target image is consistent with the target attribute of the reference image, and the weight is determined based on the image to be processed and the reference image.
  • a training method for a generative adversarial network includes:
  • obtain a sample image pair, wherein the sample image pair includes a third image and a fourth image; input the third image and the fourth image into the generator of the generative adversarial network to obtain the weight corresponding to each preset initial color mapping relationship;
  • process the third image based on each preset initial color mapping relationship and the corresponding weight to obtain a fifth image;
  • a target loss is constructed based on the discrimination results of the fourth image and the fifth image by a discriminator of the generative adversarial network, and the generative adversarial network is trained based on the target loss.
  • an image processing device includes a processor, a memory, and a computer program stored in the memory and executable by the processor.
  • when the processor executes the computer program, the following steps are performed:
  • the image to be processed is processed according to at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on the user interaction interface; wherein the target attribute of the target image is consistent with the target attribute of the reference image, and the weight is determined based on the image to be processed and the reference image.
  • a computer-readable storage medium is provided.
  • a computer program is stored on the computer-readable storage medium.
  • when the computer program is executed, the method of the above-mentioned first aspect and/or the second aspect is implemented.
  • At least one initial color mapping relationship can be set in advance.
  • Each initial color mapping relationship can be used to convert the target attribute of an image into a specific attribute. The weight corresponding to each initial color mapping relationship can then be determined according to the image to be processed and the reference image, and the image to be processed is processed according to the at least one initial color mapping relationship and the corresponding weights to obtain a target image whose target attributes are consistent with those of the reference image.
  • In this way, the target attribute of the image to be processed can be quickly converted into the target attribute of any reference image; and since only the weights corresponding to the preset initial color mapping relationships need to be determined, after which the image to be processed is processed according to the preset initial color mapping relationships and weights to obtain the target image, the amount of computation is kept small.
  • Figure 1 is a flow chart of an image processing method according to an embodiment of the present application.
  • Figure 2 is a schematic diagram of a user importing images to be processed and reference images according to an embodiment of the present application.
  • Figure 3 is a schematic diagram of style conversion of images or videos displayed on a user interaction interface according to an embodiment of the present application.
  • Figure 4 is a schematic diagram of style conversion of collected images or videos according to an embodiment of the present application.
  • Figure 5 is a schematic diagram of capturing a video whose target attributes change in a specific manner, according to an embodiment of the present application.
  • Figure 6 is a flow chart of a training method for a generative adversarial network according to an embodiment of the present application.
  • Figure 7 is a schematic diagram of a training method for a generative adversarial network according to an embodiment of the present application.
  • Figure 8 is a schematic diagram of using a generative adversarial network to convert the style of an image according to an embodiment of the present application.
  • Figure 9 is a schematic diagram of the logical structure of an image processing device according to an embodiment of the present application.
  • Users sometimes need to quickly convert a certain attribute of one image into the same attribute of another image. For example, when a user sees excellent image works in a photography community or on a public account, the user may want to convert the style of an image he or she took into the style of that image, or the user may want to convert the dynamic range of one image into the dynamic range of another image.
  • In the following, the image on which the user wishes to perform attribute conversion is called the image to be processed,
  • and the image used as a reference, whose attributes the image to be processed is converted into, is called the reference image.
  • Some technologies can only convert the image to be processed into an image with specific preset attributes. Taking style conversion as an example, several style templates are usually set in advance; when performing style conversion on the image to be processed, the user can only select one or more of the preset style templates, converting the style of the image to be processed into the selected style. This approach can only convert the image to be processed into a specific preset style and cannot achieve conversion to an arbitrary style.
  • embodiments of the present application provide an image processing method.
  • at least one initial color mapping relationship can be set in advance.
  • Each initial color mapping relationship can be used to convert the target attribute of an image into a specific attribute (for example, each initial color mapping relationship converts the image into an image of a specific style). The weight corresponding to each initial color mapping relationship can then be determined based on the image to be processed and the reference image,
  • and the image to be processed is processed according to the at least one initial color mapping relationship and the corresponding weights to obtain a target image, so that the target attribute of the processed target image is consistent with the target attribute of the reference image.
  • In this way, the target attribute of the image to be processed can be quickly converted into the target attribute of any reference image; and since only the weights corresponding to the preset initial color mapping relationships are determined, after which the image to be processed is processed according to the preset initial color mapping relationships and weights to obtain the target image, the amount of computation stays low.
  • the image processing method provided by the embodiments of the present application can be executed by any electronic device that has the function of converting a certain attribute of an image into the same attribute of another image.
  • the electronic device can be a mobile phone, a tablet, a computer, a handheld gimbal, a drone, a server, etc.
  • this method can be executed by designated image processing software, and any device with the image processing software installed can implement this image processing method.
  • a specified functional service may be integrated when the device leaves the factory, and the specified functional service may execute the above image processing method.
  • the image processing method provided by the embodiment of the present disclosure may include the following steps:
  • In step S102, when the user wants to convert the target attribute of one frame of image into the target attribute of another frame of image, the user can issue a trigger instruction, and the device executing the method can then obtain the image to be processed and the reference image of the image to be processed.
  • the triggering method of the triggering instruction can be set flexibly. For example, it can be triggered by the user clicking a specified control on the user interaction interface, or it can also be triggered by the user's specific voice, gesture, action and other prompt information.
  • the image to be processed and the reference image can be separate images or video frames in a certain video.
  • the image to be processed may be an image or a video frame in a video that has just been captured by the user using a camera, or it may be an image or a video frame in the video imported by the user.
  • the reference image may be an image or a video frame in a video imported by the user, or it may be a preset default image or a video frame in the video, which is not limited in the embodiment of this application.
  • S104: Process the image to be processed according to at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on the user interaction interface; wherein the target attribute of the target image is consistent with the target attribute of the reference image, and the weight is determined based on the image to be processed and the reference image.
  • In step S104, after the image to be processed and the reference image are obtained, the weight corresponding to each of the at least one preset initial color mapping relationship can be determined based on the image to be processed and the reference image, and the image to be processed is then processed using the at least one initial color mapping relationship and the corresponding weights to obtain a target image whose target attributes are consistent with those of the reference image.
  • each initial color mapping relationship can convert the target attribute of the image to be processed into a specific attribute.
  • the weight is used to adjust and correct each initial color mapping relationship, or to adjust and correct the image mapped using each initial color mapping relationship, so that a target image whose target attributes are consistent with those of the reference image can be obtained.
  • When determining the weight corresponding to each initial color mapping relationship based on the image to be processed and the reference image, various methods can be used. For example, certain algorithms can analyze and compare the target attributes of the image to be processed and the reference image, and determine the weight corresponding to each initial color mapping relationship based on the characteristics of both.
  • Alternatively, a neural network can be pre-trained; the image to be processed and the reference image are input into the pre-trained neural network, which determines the weight corresponding to each initial color mapping relationship, as illustrated by the sketch below.
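  • The following is a hedged Python sketch of such a weight-predicting network; the class name WeightPredictor, the layer sizes, and num_luts are illustrative assumptions, not details taken from this application. A shared feature extractor embeds both frames, and a small head maps the concatenated features to one weight per preset mapping relationship:

```python
# Hedged sketch of a weight-predicting network; names and layer sizes are
# illustrative assumptions, not specified by this application.
import torch
import torch.nn as nn

class WeightPredictor(nn.Module):
    def __init__(self, num_luts: int = 3):
        super().__init__()
        # Shared feature extractor applied to both input frames.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Head maps the concatenated features to one weight per preset
        # initial color mapping relationship.
        self.head = nn.Linear(32 * 2, num_luts)

    def forward(self, to_process: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([self.backbone(to_process), self.backbone(reference)], dim=1)
        return torch.softmax(self.head(feats), dim=1)  # weights sum to 1

weights = WeightPredictor()(torch.rand(1, 3, 256, 256), torch.rand(1, 3, 256, 256))
```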
  • In this way, the weight corresponding to each initial color mapping relationship can be determined in real time based on the styles of the image to be processed and the reference image, and the image to be processed is then processed based on the weights and the initial color mapping relationships
  • to obtain an image whose style is consistent with that of the reference image.
  • the target image can be displayed on the user interaction interface for the user to view.
  • In this way, the weight corresponding to the preset initial color mapping relationships is determined in real time based on the characteristics of the image to be processed and the reference image, and the image to be processed is then processed according to the weights and the preset initial color mapping relationships to obtain a target image whose target attributes are consistent with those of the reference image.
  • Thus, the target attributes of the image to be processed can be quickly converted to be consistent with the target attributes of any reference image; and because only the weights of the preset initial color mapping relationships are determined, after which the image to be processed is processed based on the weights and the preset initial color mapping relationships,
  • the amount of calculation can be greatly reduced, so that this method can also be deployed on general terminal devices.
  • The target attribute in the embodiments of the present application may be the style of the image: for example, a style related to the color of the image, such as brightness, contrast, or color vividness, or the overall style of the image, such as cartoon style, comic style, or sketch style.
  • the target attribute may also be the dynamic range of the image. For example, a low dynamic range image may be converted into a high dynamic range image.
  • the target attribute may also be the style of the character in the image, such as the age attribute of the character, etc.
  • the initial color mapping relationship can be characterized by any method used to represent the conversion relationship between the pixel values of the two frames of images, for example, it can be represented by a mapping table, a mapping curve, etc.
  • each initial color mapping relationship is represented by an N-dimensional lookup table, where N is a positive integer.
  • the initial color mapping relationship can be characterized by 1D-lut (one-dimensional lookup table), 2D-lut (two-dimensional lookup table), 3D-lut (three-dimensional lookup table), and 4D-lut (four-dimensional lookup table).
  • There may be one or more initial color mapping relationships, set according to the actual situation. For example, in the case where each initial color mapping relationship is represented by a 3D-lut, one 3D-lut or multiple 3D-luts may be set; a sketch of applying a single 3D-lut follows.
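  • For concreteness, the following hedged sketch shows what applying one 3D-lut to an RGB image can look like in practice; the 17-point grid and nearest-grid-point lookup are illustrative simplifications (production implementations would typically interpolate trilinearly between grid points):

```python
# Hedged sketch of applying one 3D-lut to an RGB image; grid size and lookup
# method are illustrative choices, not specified by this application.
import numpy as np

def apply_3d_lut(image: np.ndarray, lut: np.ndarray) -> np.ndarray:
    """image: H x W x 3 floats in [0, 1]; lut: S x S x S x 3 output colors."""
    size = lut.shape[0]
    idx = np.clip(np.rint(image * (size - 1)).astype(int), 0, size - 1)
    # Each pixel's (r, g, b) value indexes one entry of the lookup table.
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]

# An identity LUT maps every color to itself; real presets would encode styles.
grid = np.linspace(0.0, 1.0, 17)
r, g, b = np.meshgrid(grid, grid, grid, indexing="ij")
identity_lut = np.stack([r, g, b], axis=-1)
out = apply_3d_lut(np.random.rand(64, 64, 3), identity_lut)
```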
  • the image to be processed can be a single image or a video frame in a video,
  • and the reference image can likewise be a single image or a video frame in a video.
  • In one scenario, the image to be processed and the reference image are each a single frame of image, and the user can convert the target attribute of the image to be processed into the target attribute of the reference image.
  • the image to be processed may be a video frame in a video
  • the reference image may be a frame of image
  • the image to be processed can be multiple video frames in a video, and the reference images of these multiple video frames are the same frame image.
  • That is, the user can convert the target attributes of all video frames in a video to be consistent with the target attribute of a single reference image. For example, if the image to be processed is video A, the style of each video frame in video A can be converted into the style of reference image R.
  • the image to be processed may be a video frame in a video
  • the reference image may be a multi-frame image.
  • the images to be processed are multiple video frames in the video.
  • the reference images of the multiple video frames are the first images of multiple frames.
  • Each frame of the first images serves as the reference image for one or more video frames; that is, the user can convert the target attributes of different video frames into the target attributes of different reference images. For example, if the image to be processed is video A, the style of some video frames in video A can be converted into the style of reference image R1, and the style of other video frames into the style of reference image R2.
  • both the image to be processed and the reference image may be video frames in the video.
  • the image to be processed may be the first video frame in the first video
  • the reference image may be the second video frame in the second video.
  • Each second video frame may serve as the reference image for one or more first video frames.
  • the image to be processed is a video frame in video A
  • the reference image can be a video frame in video B
  • the style of the video frame in video A after style conversion can correspond one-to-one to the style of the video frame in video B.
  • both the image to be processed and the reference image can be imported by the user.
  • a control can be set on the user interaction interface.
  • after the user triggers the control (the "style conversion" control in the figure),
  • the user can be prompted to import the image to be processed and the reference image;
  • the user can then select a path and choose images or videos from a specified storage location, one serving as the image to be processed and another serving as the reference image.
  • the user can also directly edit the image or video displayed on the user interaction interface to convert its target attribute.
  • the user can open an image or video so that the image or video is displayed on the user interaction interface.
  • the user interaction interface can also include controls for editing the image or video.
  • after the user triggers a specified control (such as the "style conversion" control in the figure), the user can be prompted to import a reference image.
  • The user can select a path and choose an image or video from a specified storage location as the reference image; the image displayed on the user interaction interface, or the video frames of the displayed video, is then taken as the image to be processed, and the imported image, or the video frames of the imported video, is taken as the reference image.
  • the camera can also be directly called to collect images or videos.
  • the video frames in the images or videos collected by the camera can be obtained as images to be processed.
  • User-imported images or video frames from videos serve as reference images.
  • the user interaction interface can include a specific functional control (for example, the "style conversion" control in the figure), whose function is to convert the style of the image or video captured by the user into the style of an image or video imported by the user.
  • After the user triggers the control, the camera can be automatically called to capture the image or video,
  • and the user can be prompted to import the reference image, so that the image or video finally presented to the user is the style-converted image or video.
  • the reference image can also be a preset second image with specific attributes.
  • For example, the camera can be called to capture a video;
  • the video frames in the captured video are used as the images to be processed, and the pre-stored second image is then obtained as the reference image, so that the target attributes of the image or video captured by the user can be automatically converted into specific attributes.
  • the second image may be one frame or multiple frames.
  • the second image includes multiple frames of images in which the target attributes change in a predetermined manner.
  • For example, the second image may include multiple frames of images whose style gradually changes in a certain manner; by using these multiple frames as reference images,
  • the target attributes of the video frames in the processed video can likewise be made to change in a predetermined manner.
  • The user interaction interface may provide special functional controls (such as the style conversion control in the figure)
  • for capturing videos whose target attributes change in a specific way.
  • After the user triggers the control, the camera can be automatically called to capture video, and pre-stored multiple frames of images that change in a certain way are used as reference images (reference image 1, reference image 2, and reference image 3 in the figure) to convert the target attributes of the captured video. For example, the target attributes of video frames 1-10 can be converted into the target attributes of reference image 1, those of video frames 11-20 into the target attributes of reference image 2, and those of video frames 21-30 into the target attributes of reference image 3, so that the target attributes of the final video change in the same way.
  • For example, the target attribute may be the image style,
  • and the predetermined change may be that the image style of the video frames in the processed video changes according to the seasons.
  • Specifically, the second image can be a pre-stored set of four frames whose styles transition from spring to winter.
  • The captured video can be processed using these four frames as reference images: according to position in the video,
  • the front part of the video frames is automatically converted to the spring style,
  • the middle parts of the video frames are converted to the summer and autumn styles,
  • and the last part of the video frames is converted to the winter style, so that the style of the video frames in the processed video transitions from spring to winter.
  • Alternatively, the target attribute may be the image style,
  • and the image style of the video frames in the processed video may change according to the time of day.
  • Specifically, the second image can be a pre-stored multi-frame sequence whose style transitions from morning to evening. After the video is captured, these frames can be used as reference images to process the captured video, so that the style of the video frames in the processed video transitions from morning to evening.
  • Alternatively, the target attribute may be the style of the characters in the image, and the character style of the video frames in the processed video may change according to age.
  • Specifically, the second image can be a pre-stored multi-frame sequence of a character transitioning from childhood to old age. After capturing a video of the character, these frames can be used as reference images to process the captured video, so that the age of the character in the video frames of the processed video transitions from childhood to old age.
  • When each video frame in the captured video is used as an image to be processed and pre-stored second images are obtained as reference images, the captured video can be divided into multiple sub-videos.
  • The number of sub-videos is consistent with the number of second-image frames, and each sub-video corresponds to one frame of the second image.
  • The video frames in each sub-video are used as the images to be processed, and the second image corresponding to that sub-video is used as the reference image for each of its video frames.
  • For example, video A can be divided into four sub-videos {sub-video A1, sub-video A2, sub-video A3, sub-video A4}, where the video frames in sub-video A1 use the image with the style "spring" as the reference image, the video frames in sub-video A2 use the image with the style "summer" as the reference image, the video frames in sub-video A3 use the image with the style "autumn" as the reference image, and the video frames in sub-video A4 use the image with the style "winter" as the reference image. Style conversion is then performed on the video frames of the four sub-videos respectively, so that the video finally displayed to the user gradually transitions from spring to winter, showing the alternation of the four seasons; a sketch of this split follows.
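  • A minimal hedged sketch of this sub-video split might look as follows; convert() stands in for the LUT-based processing of this application and is an assumed helper, as is the frame list:

```python
# Hedged sketch of splitting a captured video into as many contiguous
# sub-videos as there are reference images, converting each against its own
# reference frame (e.g. spring ... winter).
import numpy as np

def split_and_convert(frames, reference_images, convert):
    # One contiguous chunk of frame indices per reference image.
    chunks = np.array_split(np.arange(len(frames)), len(reference_images))
    out = []
    for chunk, ref in zip(chunks, reference_images):
        out.extend(convert(frames[i], ref) for i in chunk)
    return out
```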
  • In some embodiments, when the image to be processed is processed based on the at least one preset initial color mapping relationship and the corresponding weights to obtain the target image,
  • each preset initial color mapping relationship may first be applied to the image to be processed to obtain the image mapped by each relationship; the mapped images are then processed based on the weight corresponding to each initial color mapping relationship and fused to obtain the target image.
  • For example, assuming there are three initial color mapping relationships (mapping relationship 1, mapping relationship 2, and mapping relationship 3), the three initial color mapping relationships can first be used to map the image to be processed, obtaining three frames of images with different target attributes; the weights corresponding to each initial color mapping relationship are then used to perform weighted fusion on the three frames, so that the target attributes of the final image are consistent with the target attributes of the reference image.
  • the weight can be a numerical value or a weight matrix; that is, for the image obtained by mapping, the weight corresponding to each pixel can be the same or different. For example, each pixel may correspond to a weight, each pixel block may correspond to a weight, or the entire mapped frame may correspond to a single weight, as the sketch below illustrates.
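  • The map-then-fuse strategy could be sketched as follows, reusing the apply_3d_lut helper from the earlier sketch; the weights argument accepts either one scalar per LUT or one H x W map per LUT, matching the options just described, and is assumed to sum to 1 at each pixel:

```python
# Hedged sketch of the map-then-fuse strategy; apply_3d_lut is the helper
# sketched earlier, and weights are assumed to sum to 1 per pixel.
import numpy as np

def fuse_mapped_images(image: np.ndarray, luts, weights) -> np.ndarray:
    mapped = np.stack([apply_3d_lut(image, lut) for lut in luts])  # K x H x W x 3
    w = np.asarray(weights, dtype=float)
    # Scalars broadcast over the whole frame; per-pixel maps weight each pixel.
    w = w.reshape(-1, 1, 1, 1) if w.ndim == 1 else w[..., None]
    return (w * mapped).sum(axis=0)  # fused target image, H x W x 3
```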
  • In other embodiments, when the image to be processed is processed based on the at least one preset initial color mapping relationship and the corresponding weights to obtain the target image,
  • a target color mapping relationship may first be determined from the at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship;
  • the target color mapping relationship is used to convert the target attributes of the image to be processed into the target attributes of the reference image, and the image to be processed is then mapped using the target color mapping relationship to obtain the target image.
  • For example, assuming there are three initial color mapping relationships (mapping relationship 1, mapping relationship 2, and mapping relationship 3), the three initial color mapping relationships and their corresponding weights can first be combined into a target color mapping relationship that converts the target attributes of the image to be processed into those of the reference image; the target color mapping relationship is then used to map the image to be processed, obtaining a target image whose target attributes are consistent with those of the reference image, as sketched below.
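  • Because lookup-table application is linear in the table entries, scalar weights can be folded into a single target LUT first, so the image is mapped only once; the following hedged sketch illustrates the idea (this shortcut does not apply to per-pixel weight maps, which need the map-then-fuse strategy above):

```python
# Hedged sketch of the fold-then-map strategy: combine the preset LUTs into
# one target LUT with scalar weights, then map the image a single time.
import numpy as np

def combine_luts(luts, weights) -> np.ndarray:
    return sum(w * lut for w, lut in zip(weights, luts))  # S x S x S x 3

# identity_lut and apply_3d_lut come from the earlier sketch; weights sum to 1.
target_lut = combine_luts([identity_lut, identity_lut, identity_lut], [0.5, 0.3, 0.2])
target_image = apply_3d_lut(np.random.rand(64, 64, 3), target_lut)
```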
  • When determining the weights, feature extraction can be performed on the image to be processed and the reference image respectively, and the weight corresponding to each initial color mapping relationship is then determined based on the extracted features of both images.
  • the features of the image to be processed and the reference image can be extracted through some feature extraction networks, and the weight corresponding to each initial color mapping relationship is determined based on the extracted features.
  • the step of determining the weight of each initial color mapping relationship based on the image to be processed and the reference image is performed by a pre-trained generative adversarial network.
  • the image to be processed and the reference image can be input into a pre-trained generative adversarial network, and the generative adversarial network can output the weight corresponding to each initial color mapping relationship.
  • The generative adversarial network can be trained based on the following method: a large number of sample image pairs are obtained, each sample image pair including a third image and a fourth image whose target attributes differ; the third image is an image that requires target attribute conversion, and the fourth image is the reference image of the third image, that is, the target attribute of the third image needs to be converted into the target attribute of the fourth image through the generative adversarial network.
  • The third image and the fourth image can be input into the generator of the generative adversarial network to obtain the weight corresponding to each preset initial color mapping relationship; the third image is then processed based on each preset color mapping relationship and its corresponding weight to obtain a fifth image whose target attributes match those of the fourth image. The discriminator of the generative adversarial network then judges which of the fourth and fifth images is the generated image and which is the original real image; a target loss is constructed based on the discrimination results, and the generative adversarial network is trained based on the target loss.
  • In addition to the discrimination results of the fourth and fifth images by the discriminator of the generative adversarial network, the target loss can also include constraint terms based on conditions that the mapping process must follow. For example, the mapping must preserve the monotonicity of brightness: pixels that are brighter in the image before mapping should remain brighter in the image after mapping, and a constraint term can be added to the target loss so that the final output target image meets this condition. In addition, the colors of the mapped image need to remain smooth to avoid color discontinuities, so another constraint term can be added to ensure a smooth color transition in the final target image; a sketch of such constraint terms follows.
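  • One plausible form of these constraint terms is shown below for a one-dimensional brightness curve for clarity; the application does not give exact formulas, so these penalty forms are common illustrative choices rather than the patented construction:

```python
# Hedged sketch of the two constraint terms: a monotonicity penalty that is
# nonzero only where the mapped curve decreases, and a smoothness penalty
# that discourages abrupt jumps between adjacent entries.
import torch

def lut_constraints(curve: torch.Tensor):
    """curve: length-S tensor of mapped values for monotonically increasing inputs."""
    diffs = curve[1:] - curve[:-1]
    monotonicity = torch.relu(-diffs).sum()  # penalizes any decrease
    smoothness = (diffs ** 2).sum()          # penalizes color discontinuities
    return monotonicity, smoothness

mono, smooth = lut_constraints(torch.linspace(0.0, 1.0, 33))
# These terms would be added to the adversarial loss with tuned coefficients.
```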
  • this embodiment of the present application also provides a training method for a generative adversarial network. As shown in Figure 6, the method may include the following steps:
  • a generative adversarial network is pre-trained so that the trained generative adversarial network can convert the style of one of the two input frames into the style of the other frame. Specifically, it includes the following two stages:
  • the generative adversarial network consists of two parts, the generator and the discriminator.
  • the generator can be composed of a general feature extraction network.
  • the feature extraction network can be LeNet, AlexNet, VGG, GoogLeNet, ResNet, DenseNet, or another network.
  • the discriminator network can be a binary classification network.
  • One or more 3D-luts can be preset, where each 3D-lut is a lookup table used to map images into images of a specific style.
  • the sample image pairs each include two frames of images with different styles:
  • a third image on which style conversion is to be performed, and a fourth image used as the reference image.
  • the sample image pairs can then be input into the generator, which can perform feature extraction on the two frames of images and determine the weight corresponding to each 3D-lut based on the extracted features.
  • each preset 3D-lut can be used to map the third image separately to obtain the mapped image, and then the weight corresponding to each 3D-lut can be used to perform weighted fusion processing on the mapped image to obtain the fifth image.
  • the fourth image and the fifth image can be input into the discriminator, which determines whether each of them is a real image or a generated image. Based on the discrimination results, the target loss is constructed, and the network parameters of the generator are adjusted based on the target loss to train the generative adversarial network, as condensed in the sketch below.
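  • A condensed, hedged sketch of one such training step follows; it assumes a generator G that outputs one weight per LUT (such as the WeightPredictor sketched earlier), differentiable LUT-application modules in luts, and a binary-classification discriminator D returning one logit per image. The loss forms and optimizer handling are conventional GAN choices, not details specified by this application:

```python
# Hedged sketch of one GAN training step for the weight-predicting generator.
import torch
import torch.nn.functional as F

def train_step(G, D, luts, third_img, fourth_img, opt_g, opt_d):
    # Generator stage: predict weights, map with each LUT, fuse the fifth image.
    w = G(third_img, fourth_img)                               # B x K
    mapped = torch.stack([lut(third_img) for lut in luts], 1)  # B x K x 3 x H x W
    fifth_img = (w[:, :, None, None, None] * mapped).sum(dim=1)

    # Discriminator stage: the fourth image is real, the fifth is generated.
    real_logit, fake_logit = D(fourth_img), D(fifth_img.detach())
    d_loss = F.binary_cross_entropy_with_logits(real_logit, torch.ones_like(real_logit)) \
           + F.binary_cross_entropy_with_logits(fake_logit, torch.zeros_like(fake_logit))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: make the fifth image look real; the monotonicity and
    # smoothness constraint terms described above would be added to g_loss here.
    gen_logit = D(fifth_img)
    g_loss = F.binary_cross_entropy_with_logits(gen_logit, torch.ones_like(gen_logit))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```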
  • the generative adversarial network can be used to convert the style of one frame of image into the style of another frame of image, as shown in Figure 8.
  • In the inference stage, the image to be processed and the reference image can be input into the generator of the generative adversarial network, which determines the weight corresponding to each 3D-lut based on the two images;
  • each preset 3D-lut is then used to map the image to be processed separately, and the weight corresponding to each 3D-lut is used to perform weighted fusion on the mapped images to obtain the final target image.
  • the style of the target image is consistent with the style of the reference image.
  • In other words, the initial 3D-luts are preset, the weight corresponding to each initial 3D-lut is predicted, and the images mapped by each initial 3D-lut are then fused based on the weights to obtain a target image with the same style as the reference image.
  • the style of one frame of image can be quickly converted into the style of any other frame of image.
  • In some related technologies, a generative adversarial network is used to directly generate the style-converted target image; such networks are relatively complex, require a large amount of calculation, and place high demands on device performance.
  • In this application, the generative adversarial network only outputs the weight corresponding to each preset initial 3D-lut, and the initial 3D-luts and weights are used to process the image to be processed and obtain the target image, which greatly reduces the amount of calculation and makes this method deployable on general terminal devices.
  • the embodiment of the present application also provides an image processing device.
  • the device 90 includes a processor 91, a memory 92, and a computer program stored in the memory 92 and executable by the processor 91; when the processor 91 executes the computer program, the following steps can be implemented:
  • the image to be processed is processed according to at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain a target image, so as to display the target image on the user interaction interface; wherein the target attribute of the target image is consistent with the target attribute of the reference image, and the weight is determined based on the image to be processed and the reference image.
  • the image to be processed is multiple video frames in the video, and the reference images of the multiple video frames are the same frame image; or
  • the image to be processed is a plurality of video frames in a video, the reference images of the multiple video frames are multiple first images, and each frame of the first images serves as a reference image for one or more video frames; or
  • the image to be processed is the first video frame in the first video
  • the reference image of the first video frame is the second video frame in the second video
  • each second video frame serves as the reference image for one or more first video frames.
  • when the processor is configured to obtain an image to be processed and a reference image of the image to be processed in response to a user's trigger instruction, it is specifically configured to:
  • the image or video imported by the user is obtained as the image to be processed, and the image or video imported by the user is obtained as the reference image.
  • when the processor is configured to obtain an image to be processed and a reference image of the image to be processed in response to a user's trigger instruction, it is specifically configured to:
  • obtain the image displayed on the user interaction interface, or the video frames of the displayed video, as the image to be processed,
  • and obtain the image imported by the user, or the video frames of the imported video, as the reference image.
  • when the processor is configured to obtain an image to be processed and a reference image of the image to be processed in response to a user's trigger instruction, it is specifically configured to:
  • the video frames in the images or videos collected by the camera are obtained as the images to be processed, and the video frames in the images or videos imported by the user are obtained as the reference images.
  • when the processor is configured to obtain an image to be processed and a reference image of the image to be processed in response to a user's trigger instruction, it is specifically configured to:
  • the video frames in the collected video are used as the images to be processed, and the pre-stored second image is obtained as the reference image.
  • the second image includes a multi-frame image in which the target attribute changes in a predetermined manner, such that the target attribute of the video frames in the processed video changes in a predetermined manner.
  • the target attributes of the video frames in the processed video change in a predetermined manner, including:
  • the target attribute is an image style, and the image style of the video frames in the processed video changes according to seasonal changes; or
  • the target attribute is an image style, and the image style of the video frames in the processed video changes according to day and night; or
  • the target attribute is the character style in the image, and the character style of the video frames in the processed video changes according to age.
  • when the processor is configured to use each video frame in the captured video as the image to be processed and to obtain pre-stored second images as the reference images, it is specifically configured to:
  • divide the captured video into multiple sub-videos, use the video frames in each sub-video as the images to be processed, and use the second image corresponding to each sub-video as the reference image.
  • when the processor is configured to process the image to be processed based on the at least one preset initial color mapping relationship and the weight corresponding to each initial color mapping relationship to obtain the target image, it is specifically configured to:
  • map the image to be processed based on each of the at least one preset initial color mapping relationship to obtain the image mapped by each initial color mapping relationship; process the mapped images based on the weight corresponding to each initial color mapping relationship, and fuse the processed images to obtain the target image; or
  • determine a target color mapping relationship based on the at least one preset initial color mapping relationship and the corresponding weights, the target color mapping relationship being used to convert the target attribute of the image to be processed into the target attribute of the reference image; and map the image to be processed using the target color mapping relationship to obtain the target image.
  • the target attributes include one or more of the following: image style, dynamic range of the image, and style of the characters in the image.
  • each initial color mapping relationship is represented by an N-dimensional lookup table, where N is a positive integer.
  • the weight is determined based on the image to be processed and the reference image, including:
  • feature extraction is performed on the image to be processed and the reference image respectively, and the weights are determined based on the extracted features.
  • the step of determining the weight based on the image to be processed and the reference image is performed by a pre-trained generative adversarial network.
  • the generative adversarial network is trained based on the following method:
  • obtain a sample image pair, the sample image pair including a third image and a fourth image; input the third image and the fourth image into the generator of the generative adversarial network to obtain the weight corresponding to each preset initial color mapping relationship, and process the third image accordingly to obtain a fifth image;
  • a target loss is constructed based on the discrimination results of the fourth image and the fifth image by a discriminator of the generative adversarial network, and the generative adversarial network is trained based on the target loss.
  • embodiments of this specification also provide a computer storage medium, the storage medium stores a program, and when the program is executed by a processor, the method in any of the above embodiments is implemented.
  • Embodiments of the present description may take the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having program code embodied therein.
  • Storage media available for computers include permanent and non-permanent, removable and non-removable media, and can be implemented by any method or technology to store information.
  • Information may be computer-readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to: phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • As for the device embodiment, since it basically corresponds to the method embodiment, reference may be made to the description of the method embodiment for relevant details.
  • the device embodiments described above are only illustrative.
  • The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Persons of ordinary skill in the art can understand and implement it without creative effort.


Abstract

The invention relates to an image processing apparatus and method, a neural network training method, and a storage medium. The image processing method comprises: acquiring, in response to a trigger instruction from a user, an image to be processed and a reference image of said image; and processing said image according to at least one preset initial color mapping relationship and a weight corresponding to each initial color mapping relationship to obtain a target image, and displaying the target image on a user interaction interface, a target attribute of the target image being consistent with a target attribute of the reference image, and the weight being determined on the basis of said image and the reference image. In this way, a target attribute of an image to be processed can be quickly converted to be consistent with a target attribute of any reference image, and the amount of computation can be considerably reduced, so that the method can also be deployed on common terminal equipment.
PCT/CN2022/080213 2022-03-10 2022-03-10 Image processing apparatus and method, neural network training method, and storage medium WO2023168667A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/080213 WO2023168667A1 (fr) 2022-03-10 2022-03-10 Image processing apparatus and method, neural network training method, and storage medium


Publications (1)

Publication Number Publication Date
WO2023168667A1 (fr)

Family

Family ID: 87936911

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/080213 WO2023168667A1 (fr) 2022-03-10 2022-03-10 Image processing apparatus and method, neural network training method, and storage medium

Country Status (1)

Country Link
WO (1) WO2023168667A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190026870A1 (en) * 2017-07-19 2019-01-24 Petuum Inc. Real-time Intelligent Image Manipulation System
CN113780326A (zh) * 2021-03-02 2021-12-10 北京沃东天骏信息技术有限公司 Image processing method and apparatus, storage medium, and electronic device
CN113869429A (zh) * 2021-09-29 2021-12-31 北京百度网讯科技有限公司 Model training method and image processing method

