WO2023071810A1 - Image processing - Google Patents

Image processing

Info

Publication number
WO2023071810A1
WO2023071810A1 (PCT/CN2022/125012, CN2022125012W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
pixel
probability
original image
map
Prior art date
Application number
PCT/CN2022/125012
Other languages
English (en)
Chinese (zh)
Inventor
程俊奇
四建楼
钱晨
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司 filed Critical 上海商汤智能科技有限公司
Publication of WO2023071810A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Definitions

  • the present disclosure relates to computer vision techniques, and more particularly to image processing.
  • region replacement is widely used in various image editing software, camera back-end algorithms and other scenarios.
  • Segmentation models are usually used to semantically segment the original image, producing a rough mask of the region to be replaced. The original image is then fused with an image containing the target area according to this mask result, so that the replaced area is substituted with the target area.
  • the embodiments of the present disclosure at least provide an image processing method, device, electronic device, and storage medium.
  • the present disclosure provides an image processing method, the method comprising: performing matting processing on an original image to obtain a matting result, the matting result including a first image and a transparency map corresponding to the original image; the first image includes a reserved area in the original image, and the reserved area is the foreground or background in the original image; the value of a pixel in the transparency map indicates the transparency of that pixel; according to the difference between pixels in the first image and pixels in a material image containing a target area, determining the color difference between the reserved area and the target area, the target area being used to replace the non-reserved area in the original image; adjusting the hue of the pixels in the first image according to the color difference to obtain a second image whose tone matches that of the target area; and, based on the transparency map, performing image fusion on the second image and the material image to obtain a target image.
  • the present disclosure proposes an image processing device, which includes: a matting module, configured to perform matting processing on an original image to obtain a matting result, the matting result including a first image and a transparency map corresponding to the original image; the first image includes a reserved area in the original image, and the reserved area is the foreground or background in the original image; the value of a pixel in the transparency map indicates the transparency of that pixel;
  • the determination module is used to determine the color difference between the reserved area and the target area according to the difference between the pixels in the first image and the pixels in the material image containing the target area; the target area is used to replace the non-reserved area in the original image;
  • the adjustment module is used to adjust the color tone of the pixels in the first image according to the color difference to obtain a second image that matches the color tone of the target area; the fusion module is used to perform image fusion on the second image and the material image based on the transparency map to obtain the target image.
  • the present disclosure proposes an electronic device, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor executes the executable instructions to implement the image processing method shown in any of the foregoing embodiments.
  • the present disclosure provides a computer-readable storage medium, the storage medium stores a computer program, and the computer program is used to cause a processor to execute the image processing method as shown in any one of the foregoing embodiments.
  • according to the pixel values of the pixels in the reserved area obtained by matting the original image and the pixel values of the pixels in the target area, the color difference between the reserved area and the target area can be determined, and the hue of the pixels in the reserved area can then be adjusted according to the color difference, unifying the hue of the pixels in the reserved area with that of the pixels in the target area, so that in the process of area replacement the tone of the target area matches the tone of the reserved area in the original image, thereby improving the effect of area replacement.
  • the three-part map can be used for matting, so that the detailed information of the junction position between the reserved area and the non-reserved area can be well preserved.
  • the matting network can also be compressed (for example, by channel compression), and the original image can be scaled, so that the time and memory consumption of the matting process stay within the processing capability of the mobile terminal; region replacement therefore does not need to go through a server, which ensures data security and privacy.
  • FIG. 1 is a schematic flowchart of an image processing method shown in an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of a method for determining color differences shown in an embodiment of the present disclosure
  • FIG. 3 is a schematic flowchart of a method for determining color differences shown in an embodiment of the present disclosure
  • FIG. 4 is a schematic flow chart of a tone adjustment method shown in an embodiment of the present disclosure
  • FIG. 5 is a schematic flow diagram of a matting method shown in an embodiment of the present disclosure.
  • FIG. 6 is a schematic flowchart of a method for obtaining a tripartite graph shown in an embodiment of the present disclosure
  • Fig. 7a is a schematic diagram of a character image shown in an embodiment of the present disclosure.
  • Fig. 7b is a schematic diagram of a semantic probability map shown in an embodiment of the present disclosure.
  • Fig. 7c is a schematic diagram of a tripartite graph shown in an embodiment of the present disclosure.
  • Fig. 7d is a schematic diagram of a transparency map shown in an embodiment of the present disclosure.
  • Fig. 7e is a schematic diagram of a foreground image shown in an embodiment of the present disclosure.
  • Fig. 7f is a schematic diagram of a target image shown in an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram of a region replacement process shown in an embodiment of the present disclosure.
  • FIG. 9 is a schematic diagram of an area replacement process based on FIG. 8;
  • FIG. 10 is a schematic flowchart of a network training method shown in an embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of an image processing device shown in an embodiment of the present disclosure.
  • Fig. 12 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present disclosure.
  • This disclosure relates to the field of augmented reality.
  • AR (Augmented Reality)
  • the target object may involve faces, limbs, gestures, actions, etc. related to the human body, or markers related to objects, or sand tables, display areas or display items related to venues or places.
  • Vision-related algorithms may involve visual positioning, SLAM (Simultaneous Localization and Mapping), 3D reconstruction, image registration, background segmentation, object key point extraction and tracking, object pose or depth detection, etc.
  • Specific applications can not only involve interactive scenes such as guided tours, navigation, explanations, reconstructions, virtual effect overlays and display related to real scenes or objects, but also special effects processing related to people, such as makeup beautification, body beautification, special effect display, and interactive scenarios such as virtual model display.
  • the relevant features, states and attributes of the target object can be detected or identified through the convolutional neural network.
  • the aforementioned convolutional neural network is a neural network model obtained through model training based on a deep learning framework.
  • region replacement is widely used in various image editing software, camera back-end algorithms and other scenarios. Segmentation models are usually used to semantically segment the original image, producing a rough mask of the region to be replaced; the original image is then fused with the image containing the target area according to this mask result, so that the replaced area is substituted with the target area.
  • However, when there is an obvious color tone difference between the target area and the original image, direct replacement often results in an obvious inconsistency in picture tone, leading to a poor area replacement effect.
  • the present disclosure proposes an image processing method. This method can ensure that the tone of the target area matches the reserved area in the original image during area replacement, thereby improving the effect of area replacement.
  • FIG. 1 is a schematic flowchart of an image processing method shown in an embodiment of the present disclosure.
  • the processing method shown in FIG. 1 can be applied to electronic equipment.
  • the electronic device may execute the method by carrying software logic corresponding to the processing method.
  • the type of the electronic device may be a notebook computer, a computer, a mobile phone, a PDA (Personal Digital Assistant), and the like.
  • the type of the electronic device is not particularly limited in the present disclosure.
  • the electronic device may also be a client device and/or a server device, which is not specifically limited here.
  • the image processing method may include S102-S108. Unless otherwise specified, the present disclosure does not specifically limit the execution order of these steps.
  • S102. Perform matting processing on the original image to be processed to obtain a matting result, the matting result including a first image and a transparency map corresponding to the original image; the first image includes a reserved area in the original image, the reserved area being the foreground or background in the original image; the value of a pixel in the transparency map indicates the transparency of that pixel.
  • the original image is the image in which a region needs to be replaced.
  • the original image may include reserved areas and non-reserved areas.
  • the non-reserved area is generally used as a replaced area, which is replaced by other materials.
  • the reserved area and the non-reserved area can be distinguished by image processing techniques.
  • the reserved area refers to an area that is reserved and not replaced during the process of performing area replacement on the image. For example, in a scene where background areas are replaced, foreground areas are preserved. For example, the background area (such as the sky area) in the person image needs to be replaced, and the foreground area containing the person can be used as the reserved area. In scenes where the foreground area is replaced, the background area is the preserved area.
  • the first image may include a reserved area cut out from the original image.
  • the first image has the same size as the original image. Areas in the first image other than the reserved area may be filled with pixels of preset pixel values.
  • the preset pixel value may be 0, 1 and so on.
  • the transparency map is used to distinguish reserved areas and non-reserved areas by different values of transparency.
  • the value of a pixel within the transparency map indicates the transparency of the corresponding pixel.
  • the transparency values of the pixels belonging to the reserved area in the transparency map are the first value
  • the transparency values of the pixels belonging to the non-reserved area are the second value.
  • the first value and the second value may differ in different scenarios.
  • For example, the first value of the transparency map may be 1, indicating that the pixels in the reserved area are non-transparent, and the second value may be 0, indicating that the pixels in the non-reserved area are transparent.
  • With this transparency setting, the non-reserved area is completely replaced, and the original non-reserved area is not preserved at all.
  • As another example, the first value of the transparency map may be 1, indicating that the pixels in the reserved area are non-transparent, and the second value may be 0.3, indicating that the pixels in the non-reserved area are semi-transparent.
  • the three-part map corresponding to the original image to be processed can be obtained; for each pixel in the three-part map, the value corresponding to the pixel indicates that the pixel belongs to the reserved area, non-reserved area, or area to be determined; then, according to the tripartite map, the original image can be matted to obtain the first image and the transparency map.
  • the trimap has the characteristic of distinguishing the foreground, the background, and the transition area between the foreground and the background in an image. That is, regardless of whether the reserved area is the foreground or the background in the original image, the tripartite map is used to distinguish the reserved area, the non-reserved area, and the area to be determined between them in the original image, so as to preserve the details of the boundary between the reserved area and the non-reserved area.
  • a pre-trained matting network may also be used for matting processing.
  • the matting network is trained in a supervised manner through training samples marked with transparency information and reserved area information in advance.
  • the first image and the transparency map can be obtained by inputting the original image into the matting network.
  • S104 Determine the color difference between the reserved area and the target area according to the difference between the pixels in the first image and the pixels in the material image containing the target area; the target area in the material image Used to replace non-preserved regions in the original image.
  • the material images are generally some pre-acquired images, and these images contain replacement materials used to replace non-reserved areas.
  • the areas occupied by these replacement materials in the material image may be referred to as target areas.
  • the material image may contain some sky materials, and these sky materials may be used to replace the sky in the original image (that is, the non-preserved area in the original image).
  • the color difference refers to the pixel value difference between the pixels in the reserved area and the pixels in the target area.
  • a pixel value of a pixel may indicate a color value of the pixel.
  • the color difference between the reserved area and the target area may be obtained by calculating an average difference between pixel values of pixels in the first image and pixel values of pixels in the material image.
  • the reserved area and the target area can be sampled by sampling, and the color difference can be determined by the pixel value of the sampling point, thereby reducing the amount of computation for determining the color difference, thereby improving the efficiency of area replacement.
  • FIG. 2 is a schematic flowchart of a method for determining color differences shown in an embodiment of the present disclosure.
  • the steps shown in FIG. 2 are descriptions of S104.
  • the method for determining the color difference may include S202-S204. Unless otherwise specified, the present disclosure does not specifically limit the execution order of these steps.
  • the step size can be preset. For example, the result obtained by dividing the short side of the original image by a preset value (for example, 10, 20, 30, etc.) may be determined as the step.
  • sampling may be performed in a preset order (for example, from left to right, from top to bottom) and a set step size to obtain some first sampling points; and for the material image, Sampling is performed according to the preset order (for example, from left to right, from top to bottom) and a set step size to obtain some second sampling points.
  • the pixel mean value or pixel median value of the first sampling points can be determined according to the pixel values of the first sampling points, and the pixel mean value or pixel median value of the second sampling points can be determined according to the pixel values of the second sampling points.
  • the color difference is then determined based on the difference between two pixel means or two pixel median values.
  • the pixel value of the sampling point is represented according to the pixel mean value or the pixel median value of the sampling point, which can simplify the operation.
  • the transparency of the first sampling points can also be taken into account, so that the determined pixel mean value is more accurate, which helps to accurately determine the color difference between the reserved area and the target area, thereby enhancing the tone adjustment effect.
  • the pixels in the transparency map may be first sampled to obtain the third sampling point.
  • the steps disclosed in the foregoing S202 may be used for sampling to obtain some third sampling points.
  • FIG. 3 is a schematic flowchart of a method for determining color differences shown in an embodiment of the present disclosure.
  • the steps shown in FIG. 3 are supplementary descriptions of S204.
  • the method for determining the color difference may include S302-S306. Unless otherwise specified, the present disclosure does not specifically limit the execution order of these steps.
  • S302. Determine a first pixel average value of the first sampling point based on the pixel value of the first sampling point and the transparency value of the third sampling point.
  • the disclosure refers to the pixel mean value of the first sampling point determined based on the pixel value of the first sampling point and the transparency value of the third sampling point as the first pixel mean value.
  • the embodiment of the present disclosure does not limit how to determine the specific formula of the first pixel mean value, and the following is only an example:
  • fg_mean indicates the first pixel mean value.
  • FG1 refers to the pixel value of the first sampling point.
  • Alpha1 refers to the transparency value of the third sampling point. An accurate first pixel mean value can be obtained through formula (1), which incorporates the transparency of the sampling points.
  • the pixel mean value of the second sampling point is referred to as the second pixel mean value.
  • the second pixel mean value bg_mean can be obtained by averaging BG1 through the mean number calculation formula.
  • BG1 is the pixel value of the second sampling point.
  • S306. Determine the color difference between the reserved area and the target area according to the difference between the first pixel average value and the second pixel average value.
  • a pixel value of a pixel may indicate color information of the pixel.
  • the difference in color of a pixel can be determined by the difference in pixel value.
  • the embodiment of the present disclosure does not limit the specific formula of how to determine the color difference, and the following is only an example:
  • the pixel mean value or pixel median value of the sampling point can be used to represent the pixel value of the sampling point, which can simplify the operation.
  • the transparency of the sampling points is taken into account, so that the determined pixel mean value is more accurate, which helps to accurately determine the color difference between the reserved area and the target area, thereby improving the tone adjustment effect.
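  • As an illustration, the sampled color-difference computation described above can be sketched as follows. Since formulas (1) and (2) are not reproduced in this text, the alpha-weighted first pixel mean and the per-channel difference of the two means are assumptions consistent with the description, and all names are illustrative.

```python
import numpy as np

def color_difference(first_image, alpha_map, material_image, n_steps=20):
    """Sketch of the sampled color-difference computation (cf. formulas (1)-(2)).

    first_image:    H x W x 3 first image containing the reserved area (FG)
    alpha_map:      H x W transparency map (Alpha), 1 = reserved, 0 = non-reserved
    material_image: H x W x 3 material image containing the target area (BG)
    The sampling step is the short side divided by a preset value (here 20).
    """
    step = max(1, min(first_image.shape[:2]) // n_steps)

    # First/third sampling points from the first image and the transparency map,
    # second sampling points from the material image (same order and step).
    fg1 = first_image[::step, ::step].reshape(-1, 3).astype(np.float64)
    alpha1 = alpha_map[::step, ::step].reshape(-1, 1).astype(np.float64)
    bg1 = material_image[::step, ::step].reshape(-1, 3).astype(np.float64)

    # Formula (1): first pixel mean combining the transparency of the sampling
    # points -- an alpha-weighted mean is assumed here.
    fg_mean = (fg1 * alpha1).sum(axis=0) / (alpha1.sum() + 1e-8)

    # Second pixel mean: plain average of the second sampling points.
    bg_mean = bg1.mean(axis=0)

    # Formula (2): color difference between the target area and the reserved
    # area, assumed to be the per-channel difference of the two means.
    diff = bg_mean - fg_mean
    return fg_mean, bg_mean, diff
```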
  • Hue refers to the overall tendency of the color of the image. Although the image includes a variety of colors, it generally has a color tendency. For example, the image may be bluish or reddish, warm or cold, and so on. This tendency in color is the hue of the image. That is, the hue of the image can be indicated by the pixel value (color value) of the pixel in the image, and the hue adjustment can be completed by adjusting the pixel value of the image pixel.
  • the tone adjustment may be to adjust the color values of the pixels in the first image to be closer to the color values of the pixels in the target area.
  • the tone matching of the two images means that the difference between the color values of the pixels in the two images is smaller than the preset color threshold (empirical threshold), that is, the color values of the pixels in the two images are relatively close, showing roughly the same hue Effect.
  • the color difference may be fused with pixel values of pixels in the first image, so as to achieve the effect that the hue of the first image matches the hue of the target area, and complete the hue adjustment.
  • FG refers to the pixel value of the pixel in the first image.
  • new_FG refers to the pixel value of a pixel within the second image.
  • q is the preset adjustment factor.
  • the q is a preset value according to business requirements.
  • diff indicates the color difference between the target area and the reserved area. According to the formula (3), the color tone of the pixels in the first image can be adjusted based on the color difference to obtain a second image that matches the color tone of the target area, thereby facilitating the improvement of the area replacement effect.
  • the color difference between the adjusted image and the first image before adjustment can also be fused in, and the pixels can be adjusted again, to avoid the pixel values in the second image becoming too large or too small and thereby enhance the tone adjustment effect.
  • FIG. 4 is a schematic flowchart of a method for adjusting hue according to an embodiment of the present disclosure.
  • the steps shown in FIG. 4 are supplementary descriptions of S106.
  • the tone adjustment method may include S402-S404. Unless otherwise specified, the present disclosure does not specifically limit the execution order of these steps.
  • the third image may be obtained by using the foregoing formula (3).
  • the difference between the pixel mean value of the pixels in the third image and the pixel mean value of the pixels in the first image may indicate a color difference between the third image and the first image.
  • the difference between the pixel mean value of the pixels in the third image and the pixel mean value of the pixels in the first image may be determined first, and then the difference is fused into the pixel values of the pixels in the third image to obtain the second image.
  • new_FG' = new_FG + (mean(FG) - mean(new_FG))...(4)
  • new_FG' on the left side of the equal sign is the pixel value of the pixel in the second image.
  • the new_FG on the right side of the equal sign is the pixel value of the pixel in the third image obtained by the aforementioned formula (3).
  • mean() is the average function.
  • the color difference between the third image and the first image can be obtained by mean(FG)-mean(new_FG).
  • formula (3) can be used to initially adjust the hue of the first image, and formula (4) can then be used to correct the hue of the third image to obtain the second image, so that the hue of the second image is closer to the tone of the target area without deviating too much from the tone of the first image. This reduces the possibility of colors that are too bright or too dark caused by pixel values in the second image being too large or too small, improves the tone adjustment effect, and in turn improves the area replacement effect.
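  • A minimal sketch of this two-step tone adjustment is given below. Formula (3) is not reproduced in the text, so the preliminary adjustment new_FG = FG + q * diff is an assumption; formula (4) follows the line above. The names and the per-channel means are illustrative.

```python
import numpy as np

def adjust_tone(first_image, diff, q=0.5):
    """Sketch of the tone adjustment corresponding to formulas (3) and (4)."""
    fg = first_image.astype(np.float64)

    # Formula (3) (assumed form): shift the first image toward the target tone
    # by the preset adjustment coefficient q, producing the third image.
    new_fg = fg + q * diff

    # Formula (4): pull the mean of the third image back toward the mean of
    # the first image so the result does not drift too bright or too dark.
    new_fg_prime = new_fg + (fg.mean(axis=(0, 1)) - new_fg.mean(axis=(0, 1)))

    # Second image; clipping assumes 8-bit pixel values.
    return np.clip(new_fg_prime, 0.0, 255.0)
```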
  • the image fusion may include, but is not limited to, splicing, addition, and multiplication of pixel values of pixels in two images.
  • the first result can be obtained based on the fusion of the transparency map and the second image, and the second result can be obtained based on the fusion of the material image and the reverse transparency map corresponding to the transparency map; and then the obtained The first result is fused with the second result to obtain the target image.
  • new = new_FG'*Alpha + BG*(1-Alpha)...(5), where new indicates the pixel value of a pixel in the target image.
  • new_FG' indicates the pixel value of the pixel in the second image obtained in S106.
  • BG indicates the pixel value of the pixel in the material image corresponding to the target area.
  • Alpha indicates the transparency value of the pixel within the transparency map.
  • 1-Alpha can be expressed as the reverse transparency map corresponding to the transparency map.
  • the first result obtained by the fusion of Alpha and new_FG' can be expressed as new_FG'*Alpha
  • the second result obtained by the fusion of BG and reverse transparency can be expressed as BG*(1-Alpha)
  • the result of fusing the first result and the second result is new, as obtained by formula (5).
  • the transparency values of the pixels belonging to the reserved area in the transparency map are the first value
  • the transparency values of the pixels belonging to the non-reserved area are the second value.
  • the first value may be 1, indicating that the pixel is non-transparent
  • the value may be 0, indicating that the pixel is transparent.
  • through new_FG'*Alpha, the pixels belonging to the reserved area in the second image are kept non-transparent, and the pixels belonging to the non-reserved area are made transparent.
  • through BG*(1-Alpha), the pixels belonging to the target area in the material image are kept non-transparent, and the pixels belonging to the non-target area are made transparent.
  • the image fusion can be realized by formula 5.
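  • The fusion of formula (5) amounts to alpha compositing; a brief sketch with illustrative names follows.

```python
import numpy as np

def fuse(second_image, material_image, alpha_map):
    """Image fusion per formula (5): new = new_FG' * Alpha + BG * (1 - Alpha)."""
    alpha = alpha_map.astype(np.float64)[..., None]   # H x W x 1
    new_fg_prime = second_image.astype(np.float64)    # H x W x 3
    bg = material_image.astype(np.float64)            # H x W x 3

    first_result = new_fg_prime * alpha     # second image weighted by Alpha
    second_result = bg * (1.0 - alpha)      # material weighted by reverse Alpha
    return first_result + second_result     # target image
```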
  • the color difference between the reserved area and the target area can be determined according to the pixel values of the pixels in the reserved area obtained by matting the original image and the pixel values of the pixels in the target area; the hue of the pixels in the reserved area can then be adjusted according to the color difference, unifying the hue of the pixels in the reserved area with that of the pixels in the target area, so that when image fusion is performed on the second image and the material image in the process of region replacement, it can be ensured that the tone of the target region matches the reserved region in the original image, thereby improving the effect of region replacement.
  • segmentation models are commonly used to perform semantic segmentation on reserved regions to obtain rough mask results for non-reserved regions. Then, the replacement of the original non-reserved area is realized according to the mask result and the material image. Since the mask result output by the segmentation model is often rough in the boundary area between the reserved area and the non-reserved area, directly using the mask result for area replacement will cause obvious artifacts in the boundary area. For example, in a sky replacement scene, some local details between the sky and the horizon in the original image may be missing.
  • a tripartite matting method may be used to solve the foregoing problems.
  • the original image can be matted using the trimap corresponding to the original image to obtain the first image and the transparency map. Because the trimap distinguishes the reserved area, the non-reserved area, and the area to be determined between them, the resulting transparency map preserves the detailed information of the junction between the reserved area and the non-reserved area. Compared with directly using a mask result, performing region replacement based on the transparency map obtained by matting the original image with the trimap helps improve the connection effect between the reserved region and the target region.
  • FIG. 5 is a schematic flowchart of a matting method according to an embodiment of the present disclosure.
  • the steps shown in FIG. 5 are supplementary descriptions of S102.
  • the matting method may include S502-S504. Unless otherwise specified, the present disclosure does not specifically limit the execution order of these steps.
  • the trimap has the characteristic of distinguishing the foreground, the background, and the transition area between the foreground and the background in an image. That is, regardless of whether the reserved area is the foreground or the background in the original image, the tripartite map is used to distinguish the reserved area, the non-reserved area, and the area to be determined between them in the original image, so as to preserve the details of the boundary between the reserved area and the non-reserved area.
  • the tripartite map in the present disclosure may be denoted by trimap.
  • editing software can be used to assist in obtaining the trimap of the original image.
  • Taking the reserved area as the foreground area as an example, the non-reserved area (background area), the reserved area (foreground area), and the undetermined area can be marked on the original image through image editing software to obtain a tripartite map.
  • the trimap may be obtained by using a trimap extraction network generated based on a neural network.
  • the trimap extraction network can be trained in advance based on training samples marked with trimap information.
  • the user does not need to manually label the trimap, nor is it necessary to pre-train a prediction network for predicting the trimap; instead, the trimap can be obtained based on the result of semantic segmentation combined with probability conversion.
  • FIG. 6 is a schematic flowchart of a method for obtaining a tripartite graph according to an embodiment of the present disclosure.
  • the steps shown in FIG. 6 are supplementary descriptions of the method for obtaining the trimap in S502.
  • the method for obtaining a trimap may include S602-S604. Unless otherwise specified, the present disclosure does not specifically limit the execution order of these steps.
  • S602. Perform semantic segmentation processing on the original image to be processed to obtain a semantic probability map corresponding to the original image.
  • the image to be matted may be referred to as an original image.
  • the person image may be called an original image.
  • the non-sky area is the target to be extracted in the matting process, which may be called a reserved area, and the reserved area may be the foreground or background in the original image.
  • the semantic segmentation processing may be performed on the original image, for example, the semantic segmentation processing may be performed through a semantic segmentation network.
  • the semantic segmentation network includes but is not limited to commonly used semantic segmentation networks such as SegNet, U-Net, DeepLab, and FCN.
  • a semantic probability map of the original image can be obtained, and the semantic probability map can include: for each pixel in the original image, the first probability that the pixel belongs to the reserved area. Taking the reserved area as the foreground as an example, in the semantic probability map, the probability of a certain pixel in the original image belonging to the foreground may be 0.85, and the probability of another pixel belonging to the foreground may be 0.24.
  • probability conversion processing may be performed based on the semantic segmentation processing result to obtain a tripartite graph.
  • the trimap obtained through probability conversion processing in this embodiment can be represented by soft-trimap.
  • the probability conversion process may be to map the probability corresponding to the pixel obtained in the semantic probability map to the value corresponding to the pixel in the soft-trimap through a mathematical conversion method.
  • the probability conversion based on the semantic probability map can include the following two parts:
  • the first probability is converted to obtain the second probability.
  • the trimap soft-trimap may include three kinds of regions: "reserved region (foreground)", “non-reserved region (background)” and "to-be-determined region".
  • the probability that the pixel belongs to the region to be determined in the tripartite map may be referred to as the second probability.
  • the first probability characterizes the probability that the pixel belongs to the reserved area (foreground) or the non-reserved area (background)
  • the more confident the first probability (the closer it is to 0 or 1), the lower the second probability that the pixel belongs to the region to be determined in the tripartite map. For example, the closer the first probability is to 1 or 0, the closer the second probability is to 0; the closer the first probability is to 0.5, the closer the second probability is to 1.
  • the above conversion principle is that if a pixel in the image has a higher probability of belonging to the reserved area (foreground), or a higher probability of belonging to the non-reserved area (background), the lower the probability of the pixel belonging to the area to be determined; and
  • if the probability that the pixel belongs to the reserved area (foreground) or the non-reserved area (background) is around 0.5, it means that the probability that the pixel belongs to the area to be determined is higher.
  • the first probability can be converted to obtain the second probability.
  • the embodiment of the present disclosure does not limit the specific formula of probability conversion, and the following is only an example:
  • polynomial fitting is used to convert the first probability into the second probability, which can make the polynomial conversion calculation more efficient, and also more accurately reflect the above conversion principle.
  • a semantic probability map can be obtained, and the reserved area (foreground) and non-reserved area (background) in the original image can be roughly distinguished through the semantic probability map. For example, if the first probability of a pixel belonging to the foreground is 0.96, then the probability of belonging to the foreground is very high; if the first probability of a pixel belonging to the foreground is 0.14, it means that the probability of the pixel belonging to the background is very high.
  • the second probability that each pixel belongs to the region to be determined can be obtained.
  • the first probability corresponding to the pixel in the semantic probability map and the second probability that the pixel belongs to the region to be determined can be fused to obtain the value corresponding to the pixel in the tripartite map soft-trimap, which represents the probability that the pixel belongs to the reserved area (foreground), the non-reserved area (background), or the undetermined area in the original image.
  • the closer the value corresponding to a pixel is to 1, the more likely the pixel belongs to the reserved area (foreground) in the original image; the closer the value of the pixel in the soft-trimap is to 0, the more likely the pixel belongs to the non-reserved area (background); the closer the value of the pixel in the soft-trimap is to 0.5, the more likely the pixel belongs to the area to be determined. That is, the probability that a pixel belongs to any one of the reserved area, the non-reserved area, or the area to be determined can be expressed by the value corresponding to the pixel in the soft-trimap.
  • soft_trimap = -k5*un/k6*sign(score-k7)+(sign(score-k7)+k8)/k9....(7)
  • soft_trimap represents the value corresponding to the pixel in soft-trimap
  • un represents the second probability
  • score represents the first probability
  • sign() represents the sign function.
  • this embodiment does not limit the specific values of the aforementioned coefficients k5, k6, k7, k8 and k9.
  • In this way, the probability conversion process based on the semantic probability map is realized to obtain the tripartite map soft_trimap.
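  • A compact sketch of this probability conversion is given below. Formula (6) is not reproduced in the text, so the degree-2 polynomial un = 4*score*(1-score) is an assumed fit (un approaches 1 near score = 0.5 and 0 near score = 0 or 1), and the coefficient values chosen for formula (7) are only illustrative.

```python
import numpy as np

def soft_trimap_from_score(score, k5=1.0, k6=2.0, k7=0.5, k8=1.0, k9=2.0):
    """Probability conversion from the semantic probability map to soft-trimap."""
    # Formula (6) (assumed polynomial): second probability that the pixel
    # belongs to the region to be determined.
    un = 4.0 * score * (1.0 - score)

    # Formula (7): fuse the first and second probabilities into soft-trimap.
    s = np.sign(score - k7)
    soft_trimap = -k5 * un / k6 * s + (s + k8) / k9
    return soft_trimap

# With these coefficients: score = 0.9 -> 0.82, score = 0.1 -> 0.18,
# score = 0.5 -> 0.5, consistent with the interpretation described above.
```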
  • pooling processing may be performed on the semantic probability map, and the above-mentioned probability conversion processing is performed on the pooled semantic probability map. See equation (8) below:
  • the average pooling process can be performed on the semantic probability map, and the pooling is performed according to the convolution stride and the convolution kernel size (kernel_size, ks).
  • score_ represents the semantic probability map after pooling, which contains the pooled probability for each position.
  • the scores in the above formulas (6) and (7) are replaced with the pooled probability, that is, the pooled semantic probability map is used to perform probability conversion.
  • the size of the kernel used in the above pooling process can be adjusted; performing pooling before the probability conversion of the semantic probability map makes it possible to adjust the width of the area to be determined in the generated soft_trimap by adjusting the size of the convolution kernel.
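  • Formula (8) is not reproduced in the text; the sketch below shows a plain average pooling of the semantic probability map with a configurable kernel size and stride, which is the behavior described above (a larger kernel widens the region to be determined).

```python
import numpy as np

def avg_pool_score(score, kernel_size=15, stride=1):
    """Average-pool the semantic probability map before probability conversion."""
    pad = kernel_size // 2
    padded = np.pad(score, pad, mode="edge")
    rows = range(0, score.shape[0], stride)
    cols = range(0, score.shape[1], stride)
    pooled = np.array([[padded[i:i + kernel_size, j:j + kernel_size].mean()
                        for j in cols] for i in rows])
    return pooled
```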
  • the image size of the original image can also be preprocessed. The preprocessing can be to process the image size of the original image to an integer multiple of the downsampling factor, so that the processed image size is divisible by the downsampling factor scale_factor, which is the factor by which the semantic segmentation network downsamples the original image.
  • the specific value of the downsampling factor is determined by the network structure of the semantic segmentation network.
  • the semantic probability map obtained by semantic segmentation of the original image can be converted through probability conversion to obtain the tripartite map, which makes acquiring the tripartite map faster and more convenient: no manual labeling is required, and it is no longer necessary to train a prediction network on trimap annotations, making the matting process easier to implement. Moreover, because this probability-conversion approach is based on the semantic probability map produced by semantic segmentation, the generated tripartite map is more accurate.
  • the matting process may include: using the tripartite map and the original image as the input of the matting network, and obtaining the reserved area residual and the initial transparency map of the original image output by the matting network.
  • the residual of the reserved area may be a residual result obtained by the residual processing unit in the matting network.
  • the reserved area residual may indicate a difference between a pixel value of a pixel in the reserved area extracted by the residual processing unit and a pixel value of a corresponding pixel in the original image.
  • the value of the pixel in the initial transparency map indicates the transparency of the pixel.
  • the first image can be obtained based on the original image and the reserved area residual (for example, the foreground image can be obtained based on the addition of the foreground residual and the original image, or the background image can be obtained based on the addition of the background residual and the original image), and can According to the trimap soft_trimap, the values of the pixels in the initial transparency map are adjusted to obtain a transparency map corresponding to the original image.
  • the transparency values of pixels in the reserved region of the trimap in the initial transparency map may be adjusted to a first value.
  • the transparency values of pixels in the non-reserved region of the trimap in the initial transparency map may be adjusted to a second value.
  • For example, the transparency value of pixels of the initial transparency map located in the reserved area of the tripartite map can be adjusted to 1, and the transparency value of pixels located in the non-reserved area of the tripartite map can be adjusted to 0; for pixels of the initial transparency map located in the region to be determined, a transparency value greater than 0.5 can be adjusted to 1 and a transparency value less than 0.5 can be adjusted to 0.
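  • A sketch of this trimap-based adjustment of the initial transparency map is shown below. The thresholds deciding which trimap values count as reserved or non-reserved, and the handling of the region to be determined (here thresholded at 0.5 as literally described, though keeping the raw values there is another plausible reading), are assumptions.

```python
import numpy as np

def adjust_alpha_with_trimap(raw_alpha, soft_trimap, fg_thr=0.9, bg_thr=0.1):
    """Adjust the initial transparency map according to the tripartite map."""
    alpha = raw_alpha.astype(np.float64).copy()
    reserved = soft_trimap >= fg_thr        # reserved area of the trimap
    non_reserved = soft_trimap <= bg_thr    # non-reserved area of the trimap

    alpha[reserved] = 1.0
    alpha[non_reserved] = 0.0

    # Region to be determined: binarize around 0.5 per the description above.
    undetermined = ~(reserved | non_reserved)
    alpha[undetermined & (raw_alpha > 0.5)] = 1.0
    alpha[undetermined & (raw_alpha < 0.5)] = 0.0
    return alpha
```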
  • the tripartite map can be used for matting, so that the detailed information of the junction between the reserved area and the non-reserved area is well preserved, which is beneficial to improving the connection effect between the reserved area and the target area when performing area replacement.
  • the area replacement software on the mobile phone mainly uploads data to the server for processing, and then transmits the area replacement result back to the mobile phone for local reading.
  • the security and privacy of data in this scheme are difficult to guarantee.
  • the network deployed to the mobile terminal can be miniaturized, and the size of the original image can be scaled, so that the time and memory consumption stay within the processing capability of the mobile terminal; there is then no need to perform region replacement through a server, which ensures data security and privacy.
  • An example of image keying on the mobile terminal is described as follows.
  • a semantic segmentation network and an image matting network may be used.
  • the semantic segmentation network may be a network such as SegNet, U-Net, etc.
  • the matting network may include an encoder (encoder) and a decoder (decoder).
  • the encoder of the matting network can adopt the structure of mobv2 (MobileNetV2). Before the matting network is deployed to the mobile terminal, channel compression can be performed on the matting network, i.e., the number of channels of the intermediate features of the network (the features of the middle layers) is compressed.
  • For example, the number of output channels of the convolution kernels in the matting network can be compressed. Assuming the number of output channels of a convolution kernel is originally a, it can be compressed to 0.35 times that number; after compression, the number of output channels of the convolution kernel is 0.35*a.
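  • The sketch below illustrates such a width-multiplier style channel compression; rounding the compressed counts to a multiple of 8 is a common convenience and an assumption, as are the example channel values.

```python
def compress_channels(channel_counts, width_mult=0.35, divisor=8):
    """Scale every intermediate feature's channel count by width_mult."""
    compressed = []
    for a in channel_counts:
        c = int(a * width_mult + divisor / 2) // divisor * divisor
        compressed.append(max(divisor, c))
    return compressed

# Illustrative encoder channel widths (not taken from the patent):
print(compress_channels([32, 16, 24, 32, 64, 96, 160, 320]))
```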
  • FIG. 7a is a schematic diagram of a character image shown in an embodiment of the present disclosure.
  • the sky area in the person image shown in FIG. 7a is used as a background area, which is also a non-reserved area, and needs to be replaced with another sky area (ie, the target area in this disclosure) in the pre-acquired material image.
  • the non-sky area in Figure 7a is the reserved area output by the matting network in this example, that is, the foreground area.
  • FIG. 8 is a schematic diagram of a region replacement process shown in an embodiment of the present disclosure.
  • FIG. 9 is a schematic flowchart of the region replacement method based on FIG. 8 .
  • the region replacement method may include S901-S909. Unless otherwise specified, the present disclosure does not specifically limit the execution order of these steps.
  • the original image in this embodiment may be the person image shown in Fig. 7a.
  • the person image may be captured by the user through the camera of the mobile terminal, or may be an image stored in the mobile terminal or received from other devices.
  • the purpose of the matting process in this embodiment may be to extract the non-sky area in the person image.
  • Non-sky regions in the original image can be considered as foreground.
  • the original image can be scaled in order to reduce the processing load on the mobile terminal and save computation. Assuming that the size of the original image in Figure 7a is 1080*1920, the image can be scaled to a size of 480*288. For example, scaling can be done by bilinear interpolation. Scaling can be performed with reference to the following formula (9) and formula (10):
  • h and w are the length and width of the original image
  • basesize is the base size, which is 480 in this example
  • int(x) means rounding x
  • new_h and new_w are the scaled dimensions of the original image respectively, where the specific values of the coefficients in formula (10) are not limited in this embodiment.
  • the image size of the original image can be processed to an integer multiple of the downsampling factor, to ensure that the scaled image size is divisible by the semantic segmentation network's downsampling factor scale_factor. It can be understood that other formulas may also be used for the integer-multiple processing, and it is not limited to the following two formulas.
  • This embodiment does not limit the specific values of the respective coefficients in the above formula (11) and formula (12).
  • the above values of k12 to k15 may all be set to 1. If the original image before scaling is denoted A, then the original image obtained by scaling it to a 480*288 image and normalizing it can be denoted B. Referring to FIG. 8, the original image B is the original image after scaling.
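  • Since formulas (9)-(12) are not reproduced in the text, the sketch below only illustrates the spirit of the scaling step: the longer side is scaled to the base size (480), the shorter side proportionally, and both are rounded up to a multiple of the segmentation network's downsampling factor. The rounding rule and the factor of 32 are assumptions; with a 1080*1920 input the result is 480*288, matching the example above.

```python
import math

def scaled_size(h, w, basesize=480, scale_factor=32):
    """Illustrative computation of the scaled input size (cf. formulas (9)-(12))."""
    def round_up(x, m):
        return math.ceil(x / m) * m

    long_side, short_side = max(h, w), min(h, w)
    new_long = round_up(basesize, scale_factor)
    new_short = round_up(round(short_side * basesize / long_side), scale_factor)

    return (new_long, new_short) if h >= w else (new_short, new_long)  # (new_h, new_w)

print(scaled_size(1920, 1080))  # -> (480, 288)
```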
  • the semantic segmentation process can be performed on the original image B through the semantic segmentation network 81, and the semantic probability map 82 output by the semantic segmentation network can be obtained.
  • the semantic probability map can be identified by score, and Fig. 7b shows a semantic probability map. It can be seen that the score of the semantic probability map indicates the probability that a pixel belongs to the non-sky area (foreground), which roughly distinguishes the foreground and background in the image, that is, roughly distinguishes the sky area from the non-sky area.
  • the trimap soft-trimap may be generated according to the probability conversion process described in the aforementioned S604.
  • the semantic probability map can be pooled according to formula (8), and then the probability conversion process can be performed on the pooled semantic probability map according to formula (6) and formula (7) to generate a tripartite map. See this tripartite diagram 83 in FIG. 8 .
  • FIG. 7c illustrates a tripartite map soft-trimap. It can be seen that the probability value of a pixel in the soft-trimap can represent the probability that the pixel belongs to one of three types of regions. According to the probability values, a "sky area", a "non-sky area", and a "region to be determined between the sky area and the non-sky area" can be distinguished in the image.
  • the tripartite map 83 and the original image B can be used as the input of the matting network 84, and the matting network can output a 4-channel result, wherein the result of one channel is the initial transparency map raw_alpha and the result of the other three channels is the foreground residual fg_res.
  • the first result 85 output by the keying network in FIG. 8 may include "raw_alpha+fg_res".
  • the foreground is the non-sky area in the person image.
  • the foreground residual fg_res can be enlarged through bilinear interpolation, so that it is restored to the scale of the original image before scaling, and formula (13) is then executed:
  • a matting result 86, that is, the foreground image FG of the original image, can be obtained.
  • clip(x, s1, s2) is to limit the value of x to [s1, s2].
  • This embodiment does not limit specific values of s1 and s2 in the above formula (13), for example, s1 may be 0, and s2 may be 1.
  • Alpha represents the transparency corresponding to the non-sky area. After obtaining the Alpha, it can be enlarged back through bilinear interpolation to the original size of the original image before scaling.
  • this embodiment does not limit the specific values of the respective coefficients s3 to s8 in the above formula (14) and formula (15).
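  • A sketch of recovering the full-resolution matting result is given below. It covers formula (13) as FG = clip(original + fg_res, s1, s2) with the illustrative bounds s1 = 0, s2 = 1 mentioned above (assuming the image is normalized to [0, 1]) and the bilinear enlargement of fg_res and Alpha; formulas (14) and (15) are not reproduced in the text and are therefore omitted here.

```python
import cv2
import numpy as np

def recover_foreground(original_image, fg_res_small, raw_alpha_small):
    """Upscale the matting network outputs and apply formula (13)."""
    h, w = original_image.shape[:2]
    fg_res = cv2.resize(fg_res_small, (w, h), interpolation=cv2.INTER_LINEAR)
    alpha = cv2.resize(raw_alpha_small, (w, h), interpolation=cv2.INTER_LINEAR)

    # Formula (13): foreground image FG of the original image, clipped to [s1, s2].
    fg = np.clip(original_image.astype(np.float64) + fg_res, 0.0, 1.0)
    return fg, alpha
```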
  • Fig. 7d shows a transparency map Alpha, where the non-sky area and the sky area can be clearly distinguished in Alpha.
  • the first value, for pixels in the non-sky area (the reserved area), is 1, indicating non-transparent.
  • the second value, for pixels in the sky area, is 0, which means completely transparent.
  • Fig. 7e illustrates the extracted non-sky area, ie the foreground image FG.
  • the value obtained after the short side of the person image is divided by 20 can be used as the step, and the matting result 86 (foreground image FG and Alpha) and the material image 87 are sampled.
  • the color difference can be obtained based on the method shown in FIG. 3 .
  • the first pixel mean value fg_mean of the foreground sampling point can be obtained according to the foregoing formula (1).
  • the color difference diff can be obtained according to the aforementioned formula (2).
  • tone adjustment may be performed according to the tone adjustment method shown in FIG. 4 .
  • the non-sky area (the foreground image FG in the matting result) can be adjusted according to the preset adjustment coefficient q first, according to the aforementioned formula (3), to obtain the preliminary adjusted foreground image (the third image). Then, based on the aforementioned formula (4), the tone correction is performed on the preliminarily adjusted foreground image (third image) to obtain the final adjustment result 88, that is, the final adjusted foreground image new_FG' (second image).
  • image fusion can be performed according to the foregoing formula (5), and the target image new is obtained after replacing the sky area.
  • FIG. 7f shows the target image obtained after replacing the sky area through S901-S909.
  • With this method, on the one hand, it can be ensured in the process of region replacement that the tone of the target region matches the non-sky region in the original image, thereby improving the effect of region replacement.
  • On the other hand, the tripartite map can be used for matting, so that the detailed information of the junction between the sky area and the non-sky area is well preserved, which is beneficial to improving the connection effect between the sky area and the non-sky area during area replacement.
  • the region replacement method directly obtains the matting result using a single original image as input; that is, given one original image, the corresponding region replacement result can be obtained based on the region replacement method provided by the embodiments of the present disclosure.
  • the prediction of the foreground in the original image requires less input information, which makes the image processing more convenient.
  • FIG. 10 is a schematic flowchart of a network training method shown in an embodiment of the present disclosure. This method can be used for joint training of semantic segmentation network and matting network. As shown in Figure 10, the method may include the following processing:
  • each sample data in the training sample set may include a sample image, a first feature label corresponding to the sample image, and a second feature label corresponding to the sample image.
  • the first feature label may be a segmentation label for the sample image
  • the second feature label may be a matting label for the sample image.
  • S1004. For each sample data in the training sample set, process the sample data to obtain a global image including the global image information of the sample image, a segmentation label corresponding to the global image, a partial image including the local image information of the sample image, and a matting label corresponding to the partial image.
  • the first processing can be performed on the sample image of the sample data to obtain a global image including most of the image information of the sample image. It can be considered that the global image includes the global image information of the sample image.
  • Meanwhile, the same first processing is performed on the first feature label of the sample image to obtain the segmentation label corresponding to the global image.
  • the sample image can be scaled according to the size requirements of the semantic segmentation network for the input image, while still retaining most of the image information of the sample image, to obtain the global image, and the same scaling is performed on the first feature label to obtain the segmentation label.
  • the second processing is performed on the sample image of the sample data to obtain a partial image including partial image information of the sample image, and at the same time, the same second processing is performed on the second feature label corresponding to the sample image to obtain the keying corresponding to the partial image Label.
  • the sample image may be partially cropped to obtain a partial image including partial image information of the sample image, and the same partial cropping may be performed on the second feature label to obtain the matting label.
  • Matting processing is performed through the matting network based on the trimap and the partial image, to obtain a matting result.
  • The matting result may indicate the matting result for the reserved region in the sample image.
  • In this way, the obtained global image containing global image information and the first feature label are used to train the first sub-network, and the partial image containing local image information and the second feature label are used to train the second sub-network, which improves the joint training effect and reduces the risk of degraded network performance.
  • In addition, the soft-trimap is generated by means of probability conversion processing, which can assist network training to a certain extent and yields a better effect.
  • Moreover, the soft-trimap can be adaptively adjusted during network training. For example, in the process of adjusting the network parameters of the semantic segmentation network according to the difference between the semantic probability map and the segmentation label, and adjusting the network parameters of the matting network according to the difference between the matting result and the matting label, the parameters of the semantic segmentation network are updated, and thus the semantic probability map output by the semantic segmentation network is also updated.
  • The soft-trimap is generated based on the semantic probability map, so an update of the semantic probability map brings an update of the soft-trimap, and the matting result is then updated as well. That is, network training usually iterates many times, and after each iteration, if the parameters of the semantic segmentation network are updated, then even for the same input image the semantic probability map, the soft-trimap and the matting result are adaptively updated, and the network parameters continue to be adjusted according to the updated results. This adaptive adjustment of the soft-trimap helps to dynamically optimize the generated soft-trimap and the matting result along with the adjustment of the semantic segmentation network, so that the finally trained model performs better and can more accurately extract the reserved region in the target image (a hedged sketch of one such joint-training iteration is given below).
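  • The following is a hedged sketch of a single joint-training iteration, illustrating why the soft-trimap adapts during training: it is re-generated from the current semantic probability map at every step, so updating the segmentation parameters also updates the trimap fed to the matting network. PyTorch-style modules, a binary foreground/background setting, the specific losses (binary cross-entropy and L1), and the two helper callables prob_to_soft_trimap and align_to_partial are assumptions for illustration only.

import torch
import torch.nn.functional as F

def joint_training_step(seg_net, matting_net, optimizer,
                        global_image, seg_label, partial_image, matting_label,
                        prob_to_soft_trimap, align_to_partial):
    """One joint-training iteration for the segmentation and matting networks."""
    optimizer.zero_grad()

    # Semantic probability map from the segmentation branch (global image).
    semantic_prob = torch.sigmoid(seg_net(global_image))
    seg_loss = F.binary_cross_entropy(semantic_prob, seg_label)

    # Soft-trimap via probability conversion; because it is recomputed from the
    # current segmentation output, it adapts as seg_net's parameters are updated.
    soft_trimap = prob_to_soft_trimap(semantic_prob)

    # The matting branch consumes the partial image together with the soft-trimap
    # aligned (cropped/resized) to the same region; channel concatenation is assumed.
    trimap_local = align_to_partial(soft_trimap)
    matting_pred = matting_net(torch.cat([partial_image, trimap_local], dim=1))
    matting_loss = F.l1_loss(matting_pred, matting_label)

    # Adjust both networks' parameters from their respective differences.
    loss = seg_loss + matting_loss
    loss.backward()
    optimizer.step()
    return loss.item()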
  • FIG. 11 illustrates an image processing device, which can be applied to implement the image processing method of any embodiment of the present disclosure.
  • The apparatus may include: a matting module 1110, a determination module 1120, an adjustment module 1130 and a fusion module 1140.
  • the device 1100 includes:
  • The matting module 1110 is configured to perform matting processing on an original image to be processed to obtain a matting result, the matting result including a first image and a transparency map corresponding to the original image; the first image includes a reserved area in the original image, the reserved area being the foreground or the background in the original image; and the value of each pixel in the transparency map indicates the transparency of that pixel;
  • the determination module 1120 is configured to determine the color difference between the reserved area and a target area according to the difference between pixels in the first image and pixels in a material image containing the target area; the target area is used to replace the non-reserved area in the original image;
  • An adjustment module 1130 configured to adjust the hue of the pixels in the first image according to the color difference to obtain a second image that matches the hue of the target area;
  • the fusion module 1140 is configured to perform image fusion on the second image and the material image based on the transparency map to obtain a target image.
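  • The four modules above can be read as a simple pipeline. The following class is an illustrative composition only (the apparatus in this disclosure is not required to be a Python object); each module is assumed to be a callable with the inputs and outputs listed in the comments.

class ImageProcessingDevice:
    """Illustrative composition of modules 1110-1140."""

    def __init__(self, matting_module, determination_module, adjustment_module, fusion_module):
        self.matting = matting_module          # 1110: original image -> (first image, transparency map)
        self.determine = determination_module  # 1120: (first image, material image) -> color difference
        self.adjust = adjustment_module        # 1130: (first image, color difference) -> second image
        self.fuse = fusion_module              # 1140: (second image, material image, transparency map) -> target image

    def process(self, original_image, material_image):
        first_image, transparency = self.matting(original_image)
        color_diff = self.determine(first_image, material_image)
        second_image = self.adjust(first_image, color_diff)
        return self.fuse(second_image, material_image, transparency)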
  • the determining module 1120 is specifically configured to:
  • determine the color difference between the reserved area and the target area based on the difference between the pixel values of the first sampling points (sampled from the first image) and the pixel values of the second sampling points (sampled from the material image).
  • the device 1100 also includes:
  • a module configured to sample pixels in the transparency map to obtain third sampling points;
  • the determining module 1120 is specifically used for:
  • determine the color difference between the reserved area and the target area according to the difference between the first pixel mean value and the second pixel mean value (a hedged sketch of this sampling-and-averaging step is given below).
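  • The following is a hedged sketch of this sampling-and-averaging step. It assumes that the first sampling points are taken from the first image at positions the transparency map marks as the reserved area, that the second sampling points are taken from the material image containing the target area (approximated here by uniform sampling over the material image), and that the color difference is the difference between the two per-channel pixel means; the thresholds, sample counts and sign convention are illustrative only.

import numpy as np

def color_difference(first_image, material_image, alpha,
                     num_samples=1024, alpha_threshold=0.9, seed=0):
    """Color difference between the reserved area and the target area,
    computed from the means of sampled pixels."""
    rng = np.random.default_rng(seed)

    # First sampling points: reserved-area pixels of the first image.
    ys, xs = np.nonzero(alpha >= alpha_threshold)
    idx = rng.choice(len(ys), size=min(num_samples, len(ys)), replace=False)
    first_samples = first_image[ys[idx], xs[idx]]            # (N, 3)

    # Second sampling points: pixels of the material image containing the target area.
    h, w = material_image.shape[:2]
    sy = rng.integers(0, h, size=num_samples)
    sx = rng.integers(0, w, size=num_samples)
    second_samples = material_image[sy, sx]                   # (M, 3)

    # First and second pixel mean values and their difference.
    first_mean = first_samples.mean(axis=0)
    second_mean = second_samples.mean(axis=0)
    return second_mean - first_mean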
  • the adjustment module 1130 is specifically used to:
  • the fusion module 1140 is specifically used for:
  • the matting module 1110 is specifically configured to:
  • the value corresponding to the pixel indicates the probability that the pixel belongs to any one of the reserved area, the non-reserved area, or the undetermined area in the original image;
  • the matting module 1110 is specifically configured to:
  • Probability conversion processing is performed based on the semantic probability map to obtain a tripartite map corresponding to the original image.
  • the matting module 1110 is specifically configured to:
  • Performing image matting processing according to the trimap and the original image includes: performing matting processing according to the trimap and the original image through a matting network.
  • the matting module 1110 is specifically configured to:
  • a probability conversion is performed based on the first probability of the pixel to obtain a second probability that the pixel belongs to a region to be determined in the tripartite map;
  • the trimap is generated according to the first probability and the second probability of each pixel in the semantic probability map.
  • The higher the first probability of a pixel in the semantic probability map, that is, the probability that the pixel belongs to the foreground or the background, the lower the second probability, obtained through probability conversion, that the pixel belongs to the undetermined area in the trimap;
  • generating the trimap according to the first probability and the second probability of each pixel in the semantic probability map includes: for each pixel in the original image, performing probability fusion on the corresponding first probability and second probability to determine the value of that pixel in the trimap (a hedged sketch of this conversion and fusion is given below).
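  • The following is a hedged sketch of the probability conversion and probability fusion steps. It assumes a binary setting in which the semantic probability map gives, for each pixel, a single first probability p of belonging to the foreground (so values near 0 or 1 are confident and values near 0.5 are ambiguous), models the second probability with a Gaussian bump centred at 0.5, and fuses the two into a soft trimap whose values are close to 0 for background, 0.5 for the undetermined area, and 1 for foreground; the disclosure's own conversion and fusion formulas take precedence over these choices.

import numpy as np

def soft_trimap_from_prob(semantic_prob, sigma=0.1):
    """Soft trimap from a per-pixel foreground probability map (HxW in [0, 1])."""
    p = np.clip(semantic_prob, 0.0, 1.0)

    # Second probability: high where the first probability is ambiguous (near 0.5),
    # low where the pixel is confidently foreground or background.
    undetermined = np.exp(-((p - 0.5) ** 2) / (2.0 * sigma ** 2))

    # Probability fusion: blend the hard foreground/background decision with the
    # undetermined value 0.5, weighted by the second probability.
    hard = (p > 0.5).astype(np.float32)
    return (1.0 - undetermined) * hard + undetermined * 0.5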
  • the matting module 1110 is specifically configured to:
  • the values of the pixels in the initial transparency map are adjusted to obtain the transparency map corresponding to the original image.
  • the device 1100 also includes:
  • a scaling module configured to scale the original image
  • the matting module 1110 is specifically configured to:
  • the first image is obtained according to the enlarged reserved-area residual and the original image (a hedged sketch of this step is given below).
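  • The following is a hedged sketch of this scaled-matting path. It assumes that matting is performed on a downscaled copy of the original image, that the network outputs a reserved-area residual at that reduced resolution, and that the residual is enlarged back to the original resolution and simply added to the original image to obtain the first image; the actual way the enlarged residual and the original image are combined is the one defined in this disclosure.

import numpy as np
import cv2  # assumed available for resizing

def first_image_from_residual(original_image, residual_small):
    """First image from the original image and an enlarged reserved-area residual."""
    h, w = original_image.shape[:2]
    # Enlarge the low-resolution residual back to the original resolution.
    residual_full = cv2.resize(residual_small, (w, h), interpolation=cv2.INTER_LINEAR)
    # Combine with the original image (simple addition is an assumption).
    return np.clip(original_image + residual_full, 0.0, 1.0)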
  • the non-reserved area includes a sky area in the original image; the target area includes a sky area in the material image.
  • Embodiments of the image processing apparatus shown in the present disclosure can be applied to electronic equipment. Accordingly, the present disclosure discloses an electronic device, which may include: a processor.
  • a memory configured to store processor-executable instructions.
  • the processor is configured to call executable instructions stored in the memory to implement the image processing method shown in any one of the foregoing embodiments.
  • FIG. 12 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present disclosure.
  • the electronic device may include a processor for executing instructions, a network interface for connecting to a network, a memory for storing operation data for the processor, and a non-volatile memory for storing instructions corresponding to the image processing apparatus.
  • the embodiment of the device may be implemented by software, or by hardware or a combination of software and hardware.
  • Taking software implementation as an example, the device in a logical sense is formed by the processor of the electronic device in which it is located reading the corresponding computer program instructions from the non-volatile memory into memory and running them.
  • The electronic device in which the device of the embodiment is located may, depending on the actual function of the electronic device, also include other hardware, which will not be detailed here.
  • the corresponding instructions of the image processing device may also be directly stored in the memory, which is not limited herein.
  • the present disclosure provides a computer-readable storage medium, the storage medium stores a computer program, and the computer program can be used to make a processor execute the image processing method shown in any one of the foregoing embodiments.
  • one or more embodiments of the present disclosure may be provided as a method, a system or a computer program product. Accordingly, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (which may include, but are not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
  • Embodiments of the subject matter and the functional operations described in this disclosure can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware that may include the structures disclosed in this disclosure and their structural equivalents, or in a combination of one or more of them.
  • Embodiments of the subject matter described in this disclosure can be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus.
  • The program instructions may also be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical or electromagnetic signal, which is generated to encode information for transmission to a suitable receiver apparatus for execution by the data processing apparatus.
  • a computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
  • the processes and logic flows described in this disclosure can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).
  • Computers suitable for the execution of a computer program may include, for example, general and/or special purpose microprocessors, or any other type of central processing unit.
  • a central processing unit will receive instructions and data from a read only memory and/or a random access memory.
  • the basic components of a computer may include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
  • Generally, a computer will also include, or be operatively coupled to, one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, to receive data from them, transfer data to them, or both.
  • a computer is not required to have such a device.
  • a computer may be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.
  • Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
  • the processor and memory can be supplemented by, or incorporated in, special purpose logic circuitry.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

Embodiments of the present disclosure relate to an image processing method and apparatus, an electronic device, and a storage medium. The method comprises: performing matting processing on an original image to obtain a matting result, the matting result comprising a first image and a transparency map corresponding to the original image; determining a color difference between a reserved area and a target area according to a difference between pixels in the first image and pixels in a material image containing the target area; performing tone adjustment on the pixels in the first image according to the color difference to obtain a second image matching the tone of the target area; and performing, on the basis of the transparency map, image fusion on the second image and the material image to obtain a target image.
PCT/CN2022/125012 2021-10-29 2022-10-13 Traitement d'image WO2023071810A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111273984.7A CN113920032A (zh) 2021-10-29 2021-10-29 图像处理方法、装置、电子设备及存储介质
CN202111273984.7 2021-10-29

Publications (1)

Publication Number Publication Date
WO2023071810A1 true WO2023071810A1 (fr) 2023-05-04

Family

ID=79243957

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125012 WO2023071810A1 (fr) 2021-10-29 2022-10-13 Traitement d'image

Country Status (2)

Country Link
CN (1) CN113920032A (fr)
WO (1) WO2023071810A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117522717A (zh) * 2024-01-03 2024-02-06 支付宝(杭州)信息技术有限公司 一种图像的合成方法、装置及设备
CN117928565A (zh) * 2024-03-19 2024-04-26 中北大学 一种复杂遮挡环境下的偏振导航定向方法

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113920032A (zh) * 2021-10-29 2022-01-11 上海商汤智能科技有限公司 图像处理方法、装置、电子设备及存储介质
CN114615443A (zh) * 2022-03-15 2022-06-10 维沃移动通信有限公司 图像处理方法及其装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170294000A1 (en) * 2016-04-08 2017-10-12 Adobe Systems Incorporated Sky editing based on image composition
CN110335277A (zh) * 2019-05-07 2019-10-15 腾讯科技(深圳)有限公司 图像处理方法、装置、计算机可读存储介质和计算机设备
CN111179282A (zh) * 2019-12-27 2020-05-19 Oppo广东移动通信有限公司 图像处理方法、图像处理装置、存储介质与电子设备
CN111275729A (zh) * 2020-01-17 2020-06-12 新华智云科技有限公司 精分割天空区域的方法及***、图像换天的方法及***
CN113920032A (zh) * 2021-10-29 2022-01-11 上海商汤智能科技有限公司 图像处理方法、装置、电子设备及存储介质

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117522717A (zh) * 2024-01-03 2024-02-06 支付宝(杭州)信息技术有限公司 一种图像的合成方法、装置及设备
CN117522717B (zh) * 2024-01-03 2024-04-19 支付宝(杭州)信息技术有限公司 一种图像的合成方法、装置及设备
CN117928565A (zh) * 2024-03-19 2024-04-26 中北大学 一种复杂遮挡环境下的偏振导航定向方法
CN117928565B (zh) * 2024-03-19 2024-05-31 中北大学 一种复杂遮挡环境下的偏振导航定向方法

Also Published As

Publication number Publication date
CN113920032A (zh) 2022-01-11

Similar Documents

Publication Publication Date Title
WO2023071810A1 (fr) Traitement d'image
CN108388882B (zh) 基于全局-局部rgb-d多模态的手势识别方法
KR102135478B1 (ko) 딥러닝 기반 가상 헤어 염색방법 및 시스템
US10839585B2 (en) 4D hologram: real-time remote avatar creation and animation control
CN107771336B (zh) 基于颜色分布的图像中的特征检测和掩模
Yang et al. Semantic portrait color transfer with internet images
EP1298933A2 (fr) Système de télécommunication vidéo
WO2023066099A1 (fr) Traitement de matage
EP2556660A1 (fr) Une methode de detourage en temps reel d'une entite reelle enregistree dans une sequence video
CN112308977B (zh) 视频处理方法、视频处理装置和存储介质
CN114445562A (zh) 三维重建方法及装置、电子设备和存储介质
CN107766803B (zh) 基于场景分割的视频人物装扮方法、装置及计算设备
CN113689372A (zh) 图像处理方法、设备、存储介质及程序产品
CN116917938A (zh) 整个身体视觉效果
CN111402118B (zh) 图像替换方法、装置、计算机设备和存储介质
KR102181144B1 (ko) 이미지 딥러닝 기반 성별 인식 방법
CN110827341A (zh) 一种图片深度估计方法、装置和存储介质
CN117218246A (zh) 图像生成模型的训练方法、装置、电子设备及存储介质
CN113822798B (zh) 生成对抗网络训练方法及装置、电子设备和存储介质
CN117136381A (zh) 整个身体分割
CN108171716B (zh) 基于自适应跟踪框分割的视频人物装扮方法及装置
CN115039137A (zh) 基于亮度估计渲染虚拟对象的方法、用于训练神经网络的方法以及相关产品
CN112016548A (zh) 一种封面图展示方法及相关装置
CN108010038B (zh) 基于自适应阈值分割的直播服饰装扮方法及装置
CN113657403B (zh) 图像处理方法及图像处理网络的训练方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22885689

Country of ref document: EP

Kind code of ref document: A1