WO2022042470A1 - Image decomposition method and related apparatus and device

Info

Publication number
WO2022042470A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
normal vector
feature map
decomposed
scene
Prior art date
Application number
PCT/CN2021/114023
Other languages
English (en)
Chinese (zh)
Inventor
章国锋
鲍虎军
罗俊丹
黄昭阳
李易瑾
周晓巍
Original Assignee
浙江商汤科技开发有限公司
Priority date
Filing date
Publication date
Application filed by 浙江商汤科技开发有限公司 filed Critical 浙江商汤科技开发有限公司
Publication of WO2022042470A1

Classifications

    • G06T 7/00 — Image analysis
    • G06T 7/0002 — Inspection of images, e.g. flaw detection
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/25 — Fusion techniques
    • G06F 18/253 — Fusion techniques of extracted features
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/045 — Combinations of networks
    • G06T 5/00 — Image enhancement or restoration
    • G06T 5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 — Special algorithmic details
    • G06T 2207/20212 — Image combination
    • G06T 2207/20221 — Image fusion; Image merging

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular, to an image decomposition method and related devices and equipment.
  • Intrinsic image decomposition is one of the important problems in computer vision and computer graphics. It refers to decomposing an original image into a shading image (also called an illumination rate image herein) and a reflectance/albedo image. Intrinsic images are widely used in 3D reconstruction, photorealistic image editing, augmented reality, semantic segmentation and other fields.
  • the embodiments of the present disclosure provide at least an image decomposition method and related apparatus and equipment.
  • a first aspect of the embodiments of the present disclosure provides an image decomposition method, the method including: acquiring an image to be decomposed; using a normal vector estimation model to acquire normal vector information of the image to be decomposed; and, based on the normal vector information, using an image decomposition model to decompose the image to be decomposed to obtain an intrinsic image of the image to be decomposed.
  • In this way, the image decomposition model can use the normal vector information to better understand the environmental conditions of the scene in the image to be decomposed, so that the intrinsic image obtained by the image decomposition model better matches the scene of the image to be decomposed, which improves the decomposition effect of the intrinsic image;
  • in addition, the normal vector information of the image to be decomposed is obtained by a normal vector estimation model that is independent of the image decomposition model, so a dedicated model can be used to obtain accurate normal vector information, which further improves the degree to which the intrinsic image obtained by subsequent decomposition matches the scene of the image to be decomposed.
  • the above-mentioned intrinsic image includes an illumination rate image; the above-mentioned decomposing of the to-be-decomposed image by using the image decomposition model based on the normal vector information to obtain the intrinsic image of the to-be-decomposed image includes: using the image decomposition model to process the to-be-decomposed image to obtain scene illumination condition information of the to-be-decomposed image; and obtaining the illumination rate image of the to-be-decomposed image based on the scene illumination condition information and the normal vector information.
  • In this way, the performance of the image decomposition model when performing intrinsic image decomposition in scenes with complex illumination environments can be improved.
  • the above-mentioned scene lighting condition information is a normal vector adaptation map that includes normal vector adaptation vectors of different pixels of the image to be decomposed, and the above-mentioned normal vector information is a normal vector map that includes the normal vectors of different pixels of the to-be-decomposed image.
  • the above-mentioned obtaining of the illumination rate image of the image to be decomposed based on the scene illumination condition information and the normal vector information includes: performing a pixel-wise dot product of the normal vector adaptation map and the normal vector map to obtain the illumination rate image of the to-be-decomposed image.
  • In this way, illumination conditions that change across space can be modeled, and the performance of the image decomposition model when performing intrinsic image decomposition in scenes with complex illumination environments can be improved.
  • the above-mentioned image decomposition model includes a shared encoder and an illumination rate decoder.
  • the above-mentioned use of the image decomposition model to process the image to be decomposed to obtain scene illumination condition information of the image to be decomposed includes: using the shared encoder to perform feature extraction on the image to be decomposed to obtain an image feature map, and fusing the image feature map with the first scene structure feature map output by the normal vector encoder of the normal vector estimation model to obtain a first fused feature map; and decoding the first fused feature map with the illumination rate decoder to obtain the scene illumination condition information of the image to be decomposed.
  • In this way, the image decomposition model can use the structural feature information of the first scene structure feature map to improve the decomposition effect of the intrinsic image.
  • the above-mentioned shared encoder comprises at least one coding unit connected in sequence, each coding unit comprising a normal vector adaptor.
  • the above-mentioned fusing of the image feature map with the first scene structure feature map output by the normal vector encoder of the normal vector estimation model to obtain the first fused feature map includes: outputting the image feature map to the first coding unit; for each coding unit, using the normal vector adaptor to fuse the feature map output by the previous coding unit with the first scene structure feature map to obtain the second fused feature map corresponding to the coding unit, wherein the feature richness of the scene structure feature map corresponding to each coding unit differs; and obtaining the first fused feature map based on the second fused feature map of the last coding unit.
  • In this way, the image decomposition model can subsequently use the scene structure information about the scene in the image to be decomposed that is carried in the scene structure feature map, which realizes passing the feature information obtained by the normal vector estimation model to the image decomposition model for use.
  • before using the normal vector adaptor to fuse the feature map output by the previous coding unit with the scene structure feature map to obtain the second fused feature map corresponding to the coding unit, the method further includes: performing down-sampling on the feature map output by the previous coding unit; and/or, the fusing includes: using the normal vector adaptor to adjust the scene structure feature map to a scene structure feature map of a preset scale, and to concatenate and convolve the adjusted scene structure feature map with the feature map output by the previous coding unit to obtain the second fused feature map corresponding to the coding unit.
  • In this way, the normal vector adaptor concatenates and convolves the scene structure feature map with the feature map output by the previous coding unit, thereby realizing the fusion of the two.
  • using the illumination rate decoder to decode the first fused feature map to obtain scene illumination condition information of the image to be decomposed includes: using the illumination rate decoder to decode the first fused feature map and the second fused feature map of at least one normal vector adaptor to obtain the scene illumination condition information of the image to be decomposed.
  • the illumination rate decoder can obtain the scene illumination condition information of the image to be decomposed by using the first fusion feature map and the second fusion feature map output by the normal vector adaptor.
  • the above-mentioned image decomposition model further includes a reflectivity decoder; based on the normal vector information, using the image decomposition model to decompose the to-be-decomposed image to obtain an intrinsic image of the to-be-decomposed image further includes: using the reflectivity decoder to decode the first fused feature map to obtain a reflectivity image of the image to be decomposed.
  • the reflectivity decoder can obtain the reflectivity image of the image to be decomposed by using the first fusion feature map.
  • using the reflectivity decoder to decode the first fused feature map to obtain the reflectivity image of the image to be decomposed includes: using the reflectivity decoder to decode the first fused feature map and the second fused feature map of at least one normal vector adaptor to obtain the reflectivity image of the image to be decomposed.
  • the reflectivity decoder can obtain the reflectivity image of the image to be decomposed by using the first fused feature map and the second fused feature map of the at least one normal vector adaptor.
  • the above-mentioned normal vector estimation model includes a normal vector encoder, a normal vector decoder and a refinement sub-network.
  • the above-mentioned use of the normal vector estimation model to obtain the normal vector information of the image to be decomposed includes: using the normal vector encoder to encode the image to be decomposed to obtain a first scene structure feature map; using the normal vector decoder to decode the first scene structure feature map to obtain a decoded feature map; and using the refinement sub-network to fuse the first scene structure feature map and the decoded feature map to obtain the normal vector information of the image to be decomposed.
  • In this way, the normal vector information of the to-be-decomposed image can be obtained by processing the to-be-decomposed image with the normal vector encoder, the normal vector decoder and the refinement sub-network of the normal vector estimation model.
  • using the normal vector encoder to encode the to-be-decomposed image to obtain the first scene structure feature map includes: using the normal vector encoder to perform multi-layer encoding on the to-be-decomposed image to obtain the first scene structure feature map corresponding to each layer.
  • the above-mentioned use of the refinement sub-network to fuse the first scene structure feature maps and the decoded feature map to obtain the normal vector information of the image to be decomposed includes: using the refinement sub-network to concatenate the first scene structure feature maps corresponding to each layer to obtain a second scene structure feature map, concatenate the second scene structure feature map with the decoded feature map to obtain a third scene structure feature map, and obtain the normal vector information of the image to be decomposed based on the third scene structure feature map.
  • the above-mentioned normal vector estimation model and image decomposition model are obtained by training separately.
  • before using the normal vector estimation model to obtain the normal vector information of the image to be decomposed, the method further includes: training the normal vector estimation model by using a first sample set, wherein the images in the first sample set are marked with normal vector information; obtaining sample normal vector information of the images in a second sample set by using the trained normal vector estimation model; and training the image decomposition model by using the second sample set and the sample normal vector information.
  • the above-mentioned second sample set includes a first sub-sample set and a second sub-sample set, and training the image decomposition model by using the second sample set and the sample normal vector information includes: training the image decomposition model by using the first sub-sample set and the sample normal vector information corresponding to the first sub-sample set, so as to adjust the parameters of the shared encoder and the illumination rate decoder in the image decomposition model; and training the image decomposition model by using the second sub-sample set and the sample normal vector information corresponding to the second sub-sample set, so as to adjust the parameters of the shared encoder and the reflectivity decoder in the image decomposition model.
  • In this way, the image decomposition model can obtain better illumination rate images and reflectivity images when decomposing the image to be decomposed.
  • a second aspect of the embodiments of the present disclosure provides an image decomposition device, the device including an acquisition module, a normal vector estimation module, and a decomposition module; the acquisition module is configured to acquire an image to be decomposed; the normal vector estimation module is configured to obtain normal vector information of the image to be decomposed by using a normal vector estimation model; and the decomposition module is configured to decompose the image to be decomposed by using an image decomposition model based on the normal vector information to obtain an intrinsic image of the image to be decomposed.
  • a third aspect of the embodiments of the present disclosure provides an electronic device, including a mutually coupled memory and a processor, where the processor is configured to execute program instructions stored in the memory to implement the image decomposition method in the first aspect.
  • a fourth aspect of the embodiments of the present disclosure provides a computer-readable storage medium on which program instructions are stored, and when the program instructions are executed by a processor, the image decomposition method in the first aspect is implemented.
  • a fifth aspect of the embodiments of the present disclosure provides a computer program, including computer-readable codes; when the computer-readable codes are executed in an electronic device, a processor in the electronic device implements the image decomposition method in the first aspect.
  • In this way, the image decomposition model can use the normal vector information to better understand the environmental conditions of the scene in the image to be decomposed, so that the intrinsic image obtained by the image decomposition model better matches the scene of the image to be decomposed, which improves the decomposition effect of the intrinsic image;
  • in addition, the normal vector information of the image to be decomposed is obtained by a normal vector estimation model that is independent of the image decomposition model, so a dedicated model can be used to obtain accurate normal vector information, which further improves the degree to which the intrinsic image obtained by subsequent decomposition matches the scene of the image to be decomposed.
  • FIG. 1 is a schematic flowchart 1 of an image decomposition method according to an embodiment of the present disclosure
  • FIG. 2 is a second schematic flowchart of an image decomposition method according to an embodiment of the present disclosure
  • FIG. 3 is a schematic flowchart of obtaining normal vector information of an image to be decomposed by using a normal vector estimation model in an image decomposition method according to an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of a framework of a normal vector estimation model in an image decomposition method according to an embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of obtaining a first fusion feature map in an image decomposition method according to an embodiment of the present disclosure
  • FIG. 6 is a schematic diagram of a framework of an image decomposition model in an image decomposition method according to an embodiment of the present disclosure
  • FIG. 7 is a schematic frame diagram of an image decomposition apparatus according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram of a framework of an electronic device according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic diagram of a framework of a computer-readable storage medium according to an embodiment of the present disclosure.
  • Intrinsic image decomposition aims to estimate the illumination rate of the scene and the reflectivity of the material from a single input image, that is, to obtain the illumination rate image and the reflectivity image.
  • a device for implementing the image decomposition method may be a computer or a server or other device.
  • the image decomposition method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • FIG. 1 is a schematic flow chart 1 of an image decomposition method according to an embodiment of the present disclosure. As shown in Figure 1, the following steps may be included:
  • Step S11 Acquire an image to be decomposed.
  • the image to be decomposed is used as the original input image to decompose the corresponding intrinsic image.
  • the image to be decomposed may be a color image, or a depth image or the like.
  • Step S12 Obtain normal vector information of the image to be decomposed by using the normal vector estimation model.
  • the normal vector estimation model is a neural network built based on deep learning, which is used to extract feature information from the image to be decomposed to obtain the normal vector information of the image to be decomposed.
  • the normal vector estimation model can obtain several feature maps by extracting feature information from the image to be decomposed.
  • the normal vector information is, for example, the normal vector of each pixel in the image to be decomposed. Through the normal vector information, environmental information about the image to be decomposed can be obtained, such as the structure information of the scene in the image to be decomposed.
  • the normal vector estimation model is a fully convolutional neural network, which can be composed of a coarse-grained to fine-grained two-level network structure.
  • the two-level network can fuse feature maps of multiple scales (different feature dimensions, different image resolutions) to obtain results with higher resolution, richer details, and more accurate object boundaries.
  • Step S13 Based on the normal vector information, use an image decomposition model to decompose the image to be decomposed to obtain an intrinsic image of the image to be decomposed.
  • the image decomposition model can use the normal vector information to decompose the input image.
  • the image decomposition model can decompose the to-be-decomposed image based on the normal vector information of each pixel in the normal vector information and the structure information of the scene contained in the normal vector information to obtain an intrinsic image, that is, to obtain an illumination rate image. and reflectance images.
  • the image decomposition model is a fully convolutional neural network.
  • In this way, the image decomposition model can use the normal vector information to better understand the environmental conditions of the scene in the image to be decomposed, so that the intrinsic image obtained by the image decomposition model better matches the scene of the image to be decomposed, which improves the decomposition effect of the intrinsic image;
  • in addition, the normal vector information of the image to be decomposed is obtained by a normal vector estimation model that is independent of the image decomposition model, so a dedicated model can be used to obtain accurate normal vector information, which further improves the degree to which the intrinsic image obtained by subsequent decomposition matches the scene of the image to be decomposed.
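  • As an illustrative, non-limiting sketch of steps S11 to S13, the two models can be composed as below in PyTorch-style code. The names (IntrinsicPipeline, normal_net, decomp_net) and the two-tensor output are placeholders chosen here for illustration, not terminology from this disclosure.

```python
import torch
import torch.nn as nn

class IntrinsicPipeline(nn.Module):
    """Hypothetical composition of the two separately trained models."""
    def __init__(self, normal_net: nn.Module, decomp_net: nn.Module):
        super().__init__()
        self.normal_net = normal_net  # normal vector estimation model (step S12)
        self.decomp_net = decomp_net  # image decomposition model (step S13)

    def forward(self, image: torch.Tensor):
        # Per-pixel normal vector map, e.g. of shape (B, 3, H, W)
        normal_map = self.normal_net(image)
        # Decomposition conditioned on the normal vector information
        illumination_rate, reflectivity = self.decomp_net(image, normal_map)
        return illumination_rate, reflectivity
```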
  • FIG. 2 is a second schematic flowchart of an image decomposition method according to an embodiment of the present disclosure. As shown in Figure 2, the following steps may be included:
  • Step S21 Acquire the image to be decomposed.
  • Step S22 Obtain normal vector information of the image to be decomposed by using the normal vector estimation model.
  • the normal vector information is a normal vector map including normal vectors of different pixels of the image to be decomposed, that is, each pixel in the image to be decomposed has a corresponding normal vector.
  • the normal vector estimation model includes a normal vector encoder, a normal vector decoder and a refinement sub-network.
  • the normal vector encoder can perform feature extraction on the image to be decomposed, the normal vector decoder can decode the extracted features and output a feature map, and the refinement sub-network can refine the output of the decoder.
  • FIG. 3 is a schematic flowchart of obtaining normal vector information of an image to be decomposed by using a normal vector estimation model in an image decomposition method according to an embodiment of the present disclosure.
  • using the normal vector estimation model to obtain the normal vector information of the image to be decomposed may include the following steps S221 to S223 .
  • Step S221 use a normal vector encoder to encode the image to be decomposed to obtain a first scene structure feature map.
  • the normal vector encoder of the normal vector estimation model can be used to encode the to-be-decomposed image, and to extract feature information in the to-be-decomposed image.
  • the feature information obtained by encoding the image to be decomposed by the normal vector encoder is, for example, structural feature information of the scene in the image to be decomposed, and the structural feature information includes, for example, plane information and object boundary information.
  • the normal vector encoder can output the first scene structure feature map, that is, the structure feature map about the scene in the image to be decomposed.
  • the normal vector encoder can be used to perform multi-layer encoding (ie, feature extraction) on the image to be decomposed, and the feature map obtained by each layer of the encoder is the first scene structure feature map.
  • the first layer of encoding blocks encodes the to-be-decomposed image and outputs the first scene structure feature map.
  • the second-layer coding block takes the first scene structure feature map output by the first-layer coding block as input, performs coding again, and outputs the corresponding first scene structure feature map.
  • the feature richness in the first scene structure feature map output by the coding block of each layer of the normal vector encoder can also be set to be different.
  • the feature richness may include the resolution of the first scene structure feature map, the dimension of feature information, and the like.
  • the first scene structure feature map corresponding to the coding block of the last layer is output to the normal vector decoder.
  • Step S222 Use the normal vector decoder to decode the first scene structure feature map to obtain the decoded feature map.
  • the normal vector decoder may be used to decode the first scene structure feature map to obtain the decoded feature map.
  • when the normal vector decoder decodes the first scene structure feature map, it can decode the feature information extracted by the normal vector encoder and reconstruct a decoded feature map of a preset dimension and a preset resolution.
  • the dimension of the feature information in the decoded feature map may be 64 dimensions, and the resolution is 1/2 of the image to be decomposed.
  • when the normal vector decoder has a multi-layer structure, it performs multi-layer decoding on the first scene structure feature map: the first-layer decoder decodes the first scene structure feature map and outputs a corresponding pre-decoded feature map; the second-layer decoder decodes the pre-decoded feature map output by the first-layer decoder and outputs its own pre-decoded feature map; and the pre-decoded feature map output by the last layer is the decoded feature map.
  • Step S223 Fuse the first scene structure feature map and the decoded feature map by using the refinement sub-network to obtain normal vector information of the image to be decomposed.
  • the refinement sub-network can be used to fuse the first scene structure feature map and the decoded feature map to obtain the normal vector information of the image to be decomposed.
  • the feature information in the first scene structure feature map and the feature information in the decoded feature map may be fused to obtain normal vector information of the image to be decomposed.
  • the feature information in the first scene structure feature map and the feature information in the decoded feature map are both 64-dimensional, and the normal vector information obtained after fusion may be 128-dimensional.
  • the normal vector information is a normal vector map including normal vectors of different pixels of the image to be decomposed, that is, each pixel in the image to be decomposed has a corresponding normal vector.
  • in the case where the normal vector encoder has a multi-layer structure, the refinement sub-network can be used to concatenate the first scene structure feature maps corresponding to each layer to obtain the second scene structure feature map, and to concatenate the second scene structure feature map with the decoded feature map to obtain a third scene structure feature map.
  • alternatively, the refinement sub-network may be configured to use only the first scene structure feature maps output by some of the encoding layers of the normal vector encoder for the concatenation.
  • a refinement sub-network may be used to process the first scene structure feature map output by each layer of the encoder, so that each second scene structure feature map has the same feature dimension and resolution.
  • the refinement sub-network may further decode based on the feature information of the third scene structure feature map to obtain normal vector information of the image to be decomposed, such as a normal vector map.
  • FIG. 4 is a schematic diagram of a framework of a normal vector estimation model in an image decomposition method according to an embodiment of the present disclosure.
  • the normal vector estimation model 400 includes: a normal vector encoder 401 , a normal vector decoder 402 and a refinement sub-network 403 .
  • the normal vector encoder 401 includes at least one initial convolution block 4011 (referred to as conv1, which may include three convolution layers and a max-pooling layer) and four coding blocks 4012 that each include a squeeze-and-excitation (SE) block.
  • the initial convolution block 4011 can perform preliminary coding on the to-be-decomposed image, and output the feature map to the coding block 4012 .
  • the encoding block 4012 compresses the resolution of the feature map to 1/4, 1/8, 1/16, and 1/32 of the original input image while gradually extracting features of higher dimensions.
  • Each encoding block 4012 outputs the first scene structure feature map, and the first scene structure feature map output by the last encoding block 4012 is output to the normal vector decoder 402 .
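  • For reference, a squeeze-and-excitation block is the standard channel-reweighting module of Hu et al.; a minimal PyTorch sketch follows. The reduction ratio of 16 is the common default for SE blocks, not a value stated in this disclosure.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: learn per-channel gates and reweight features."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))           # squeeze: global average pooling
        w = self.fc(w).view(b, c, 1, 1)  # excitation: per-channel gates in (0, 1)
        return x * w                     # reweight the feature channels
```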
  • the normal vector decoder 402 includes a convolution block 4021 (denoted as conv2) and four up-projection blocks 4022 (denoted as up-projection block 5 to up-projection block 8); these four up-projection blocks 4022 decode the features step by step and reconstruct a decoded feature map with a dimension of 64 and a resolution of 1/2 that of the image to be decomposed.
  • the refinement sub-network 403 includes 4 up-projection blocks 4031 (denoted as up-projection block 1 to up-projection block 4) and 4 convolutional layers 4032 (denoted as conv3 to conv6).
  • the first scene structure feature maps extracted by the coding blocks 4012 are concatenated by using skip connections and the up-projection blocks 4031 to obtain the second scene structure feature map.
  • the second scene structure feature map and the decoded feature map are then concatenated to obtain a third scene structure feature map.
  • four convolution layers 4032 are used to perform layer-by-layer decoding, and finally the normal vector information of the image to be decomposed, that is, the normal vector map, is obtained.
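  • A hedged sketch of the refinement fusion just described: encoder feature maps are brought to a common scale and concatenated into the second scene structure feature map, which is then concatenated with the decoded feature map and convolved into the normal vector map. Bilinear upsampling stands in for the up-projection blocks, and all channel widths are assumptions, not values from this disclosure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RefinementSubNetwork(nn.Module):
    def __init__(self, enc_channels, dec_channels=64, out_channels=3):
        super().__init__()
        # One projection per encoder stage, all mapped to a common width
        self.proj = nn.ModuleList(
            [nn.Conv2d(c, 16, kernel_size=3, padding=1) for c in enc_channels]
        )
        fused = 16 * len(enc_channels) + dec_channels
        self.head = nn.Sequential(  # stands in for conv3..conv6
            nn.Conv2d(fused, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, out_channels, 3, padding=1),
        )

    def forward(self, enc_feats, dec_feat):
        h, w = dec_feat.shape[-2:]
        # Second scene structure feature map: encoder maps resized and concatenated
        second = torch.cat(
            [F.interpolate(p(f), size=(h, w), mode="bilinear", align_corners=False)
             for p, f in zip(self.proj, enc_feats)], dim=1)
        # Third scene structure feature map: fuse with the decoded feature map
        third = torch.cat([second, dec_feat], dim=1)
        return self.head(third)  # per-pixel normal vector map
```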
  • after the normal vector information of the image to be decomposed is obtained, the image to be decomposed can be decomposed by using the obtained normal vector information to obtain the intrinsic image of the image to be decomposed.
  • the above-mentioned steps of "decomposing an image to be decomposed by using an image decomposition model based on normal vector information to obtain an intrinsic image of the image to be decomposed" include the following steps.
  • Step S23 Use the image decomposition model to process the image to be decomposed to obtain scene lighting condition information of the image to be decomposed.
  • the image decomposition model can be, for example, a fully convolutional neural network.
  • the image decomposition model can perform a feature extraction operation on the image to be decomposed, and obtain scene illumination condition information of the image to be decomposed.
  • the scene illumination condition information can be understood as the illumination condition of the scene in the image to be decomposed.
  • the scene lighting condition information is a normal vector adaptation map including normal vector adaptation vectors of different pixels of the image to be decomposed. Normal vector adaptation maps can be used to encode scene lighting conditions.
  • the image decomposition model includes a shared encoder, an illumination rate decoder, and a reflectivity decoder. Use the image decomposition model to process the image to be decomposed to obtain scene lighting condition information of the image to be decomposed, which may specifically include the following steps:
  • Step S231 use the shared encoder to perform feature extraction on the image to be decomposed to obtain an image feature map, and fuse the image feature map and the first scene structure feature map output by the normal vector encoder of the normal vector estimation model to obtain a first fusion feature map .
  • the feature information extracted by the shared encoder is used to obtain both the illuminance image and the reflectance image.
  • the first fusion feature map obtained after fusion may include structural feature information and other feature information of the scene in the image to be decomposed.
  • the shared encoder includes at least one coding unit connected in sequence, and each coding unit includes a normal vector adaptor (Normal Feature Adapter, NFA).
  • FIG. 5 is a schematic flowchart of obtaining a first fusion feature map in an image decomposition method according to an embodiment of the present disclosure.
  • the image feature map and the scene structure feature map output by the normal vector encoder of the normal vector estimation model are fused to obtain the first fused feature map, which may specifically include the following steps S2311 to S2313 .
  • Step S2311 Output the image feature map to the first coding unit.
  • other convolution blocks of the image decomposition model can first perform feature extraction on the image to be decomposed to obtain an image feature map; the image feature map is then output to the first coding unit, which further processes it.
  • Step S2312 each coding unit uses a normal vector adaptor to fuse the feature map and the scene structure feature map output by the previous coding unit to obtain the second fusion feature map corresponding to the coding unit; wherein, the scene corresponding to each coding unit Feature richness differs in structural feature maps.
  • a normal vector adaptor may be used to fuse the feature map output by the previous coding unit and the scene structure feature map to obtain a second fused feature map corresponding to the coding unit.
  • the feature richness in the scene structure feature map corresponding to each coding unit is different; the differences in feature richness can be understood as differences in the resolution of the scene structure feature map and in the dimension of the feature information.
  • the first coding unit obtains the image feature map produced by feature extraction in the other convolution blocks, and this acquired image feature map serves as the second fused feature map output by the first coding unit.
  • when the normal vector encoder of the normal vector estimation model has only one layer, it outputs only one first scene structure feature map; in this case, every coding unit fuses this unique first scene structure feature map with the feature map output by the previous coding unit. When the normal vector encoder has multiple layers, the first scene structure feature maps output by the layers can each be fused with the feature maps output by corresponding coding units: for example, the first scene structure feature map obtained by the first-layer normal vector encoder is output to the first coding unit, and the first scene structure feature map obtained by the second-layer normal vector encoder is output to the second coding unit, so that the second coding unit can fuse the feature map output by the previous coding unit with the first scene structure feature map obtained by the second layer.
  • the normal vector adaptor fuses the feature map output by the previous coding unit with the scene structure feature map to obtain the second fused feature map corresponding to the coding unit as follows: the normal vector adaptor first adjusts the scene structure feature map to a scene structure feature map of a preset scale, for example by adjusting the resolution of the scene structure feature map and the dimension of its feature information, and then concatenates and convolves the adjusted scene structure feature map with the feature map output by the previous coding unit to obtain the second fused feature map corresponding to the coding unit.
  • for example, the normal vector adaptor of the second coding unit can concatenate and convolve the second fused feature map output by the first coding unit with the scene structure feature map input to it, so as to obtain the corresponding second fused feature map.
  • In this way, the normal vector adaptor concatenates and convolves the scene structure feature map with the feature map output by the previous coding unit, thereby realizing the fusion of the two.
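  • A minimal sketch of such a normal vector adaptor (NFA), assuming PyTorch: bilinear interpolation stands in for the scale adjustment, and the channel sizes are illustrative assumptions rather than values from this disclosure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalFeatureAdapter(nn.Module):
    def __init__(self, feat_channels: int, scene_channels: int):
        super().__init__()
        self.fuse = nn.Conv2d(feat_channels + scene_channels,
                              feat_channels, kernel_size=3, padding=1)

    def forward(self, prev_feat: torch.Tensor, scene_feat: torch.Tensor):
        # Adjust the scene structure feature map to the preset scale
        scene_feat = F.interpolate(scene_feat, size=prev_feat.shape[-2:],
                                   mode="bilinear", align_corners=False)
        # Concatenate and convolve to obtain the second fused feature map
        return self.fuse(torch.cat([prev_feat, scene_feat], dim=1))
```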
  • before using the normal vector adaptor to fuse the feature map output by the previous coding unit with the scene structure feature map to obtain the second fused feature map corresponding to the coding unit, each coding unit may perform down-sampling on the feature map output by the previous coding unit.
  • for example, the second coding unit performs down-sampling on the second fused feature map output by the first coding unit; down-sampling reduces the second fused feature map so that it meets the scale requirements.
  • Step S2313 Obtain a first fused feature map based on the second fused feature map of the last coding unit.
  • in some embodiments, the last layer of the shared encoder of the image decomposition model is not the last coding unit; that is, after the last coding unit of the shared encoder there are still several coding blocks, which continue to encode the second fused feature map output by the last coding unit so as to further process the fused feature information. The feature map output after the last layer of the shared encoder is the first fused feature map. For example, the second fused feature map output by the last coding unit can be down-sampled to further reduce it, and a coding block then encodes it again to extract feature information; the feature map output at this point is the first fused feature map.
  • the second fusion feature map may also be directly used as the first fusion feature map.
  • In this way, the image decomposition model can subsequently use the scene structure information about the scene in the image to be decomposed that is carried in the scene structure feature map, which realizes passing the feature information obtained by the normal vector estimation model to the image decomposition model for use, and improves the decomposition effect of the intrinsic image.
  • the image to be decomposed can be decomposed by using the first fusion feature map to obtain an intrinsic image.
  • Step S232 Decode the first fusion feature map by using an illumination rate decoder to obtain scene illumination condition information of the image to be decomposed.
  • an illumination rate decoder can be used to decode the first fused feature map to obtain the scene illumination condition information of the image to be decomposed, for example, a normal vector adaptation map containing the normal vector adaptation vector of each pixel in the image to be decomposed.
  • the normal vector adaptation vector is defined as follows: the three components of the normal vector adaptation vector are represented by x, y, and z.
  • decoding the first fused feature map with the illumination rate decoder to obtain the scene illumination condition information of the image to be decomposed includes: using the illumination rate decoder to decode the first fused feature map and the second fused feature map of at least one normal vector adaptor to obtain the scene illumination condition information of the image to be decomposed.
  • the illumination rate decoder can simultaneously obtain the first fused feature map output by the last layer of the shared encoder of the image decomposition model, and the second fused feature map of at least one normal vector adaptor, and decode the two feature maps, In order to obtain the scene lighting condition information of the image to be decomposed.
  • when the last layer of the shared encoder is a coding unit, the first fused feature map output by the last coding unit and the second fused feature maps output by the normal vector adaptors of the other coding units can be obtained for decoding.
  • when there are multiple coding units, the illumination rate decoder can simultaneously acquire the second fused feature maps output by the multiple coding units for decoding. For example, if the illumination rate decoder obtains the second fused feature maps output by three connected coding units, three connected convolutional layers (such as up-projection blocks) can be set in the illumination rate decoder to respectively obtain the second fused feature maps output by the three coding units for decoding.
  • the first convolutional layer of the illumination rate decoder can obtain the first fused feature map output by the shared encoder and the second fused feature map output by the first normal vector adaptor for decoding, and output the feature map.
  • the second convolutional layer of the illumination rate decoder can use the feature map output from the previous convolutional layer and the second fused feature map output from the second normal vector adaptor for decoding.
  • the convolutional layers of the illumination rate decoder decode by using the first fused feature map and the second fused feature maps; after that, several further convolutional layers may be used for decoding to adjust the illumination rate map finally output by the illumination rate decoder.
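  • A hedged sketch of this skip-connected decoding, assuming PyTorch: each decoder stage consumes the previous stage's output together with one NFA's second fused feature map. Bilinear upsampling stands in for the up-projection blocks, and the channel schedule (halving per stage) and stage count are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkipDecoder(nn.Module):
    def __init__(self, in_channels: int, skip_channels: list, out_channels: int = 3):
        super().__init__()
        self.stages = nn.ModuleList()
        c = in_channels
        for sc in skip_channels:  # one stage per NFA skip input
            self.stages.append(nn.Conv2d(c + sc, c // 2, 3, padding=1))
            c //= 2
        self.head = nn.Conv2d(c, out_channels, 3, padding=1)

    def forward(self, fused: torch.Tensor, skips: list) -> torch.Tensor:
        x = fused  # first fused feature map from the shared encoder
        for stage, skip in zip(self.stages, skips):
            # Upsample to the skip's resolution, then fuse and decode
            x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear",
                              align_corners=False)
            x = F.relu(stage(torch.cat([x, skip], dim=1)))
        return self.head(x)  # e.g. the normal vector adaptation map
```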
  • by obtaining the scene illumination condition information of the image to be decomposed, for example a normal vector adaptation map containing the normal vector adaptation vector of each pixel, illumination conditions that change across space can be modeled, and the decomposition effect of the image decomposition model can be improved.
  • Step S24 Based on the scene illumination condition information and the normal vector information, an illumination rate image of the to-be-decomposed image is obtained.
  • the image to be decomposed can be decomposed based on the scene illumination condition information and the normal vector information output by the normal vector estimation model to obtain the illumination rate image of the image to be decomposed.
  • a normal vector adaptation map and a normal vector map can be used to obtain the illumination rate image of the image to be decomposed.
  • a pixel-wise dot product of the normal vector adaptation map and the normal vector map may be performed to obtain the illumination rate image of the image to be decomposed.
  • the normal vector adaptation map makes full use of the plane information and object boundary information in the scene structure feature information provided by the normal vector estimation model, so that the illumination rate image decomposed by the image decomposition model suffers less from texture residue in planar areas, objects have clear and sharp outlines, and the reflectivity image better matches the scene of the image to be decomposed.
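  • The dot product itself is a per-pixel operation over the three vector components. A minimal sketch, assuming both maps are PyTorch tensors of shape (B, 3, H, W):

```python
import torch

def illumination_rate_image(A: torch.Tensor, N: torch.Tensor) -> torch.Tensor:
    """Per-pixel dot product of the normal vector adaptation map A and the
    normal vector map N, yielding a (B, 1, H, W) illumination rate image."""
    return (A * N).sum(dim=1, keepdim=True)
```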
  • the image decomposition model may further include a reflectivity decoder. Because the feature information extracted by the shared encoder can also be used to obtain the reflectivity image, after the shared encoder performs feature extraction on the image to be decomposed, that is, after step S231, the following step 1 can be performed:
  • Step 1 Decode the first fusion feature map with a reflectivity decoder to obtain a reflectivity image of the image to be decomposed.
  • the output of the last layer of the shared encoder is the first fused feature map, which includes scene structure feature information of the scene in the image to be decomposed; therefore, a reflectivity decoder can be used to decode the first fused feature map to obtain a reflectivity image of the image to be decomposed.
  • using the reflectivity decoder to decode the first fused feature map to obtain the reflectivity image of the image to be decomposed includes: using the reflectivity decoder to decode the first fused feature map and the second fused feature map of at least one normal vector adaptor to obtain the reflectivity image of the image to be decomposed.
  • the reflectivity decoder can simultaneously obtain the first fused feature map output by the last layer of the shared encoder of the image decomposition model, and the second fused feature map of at least one normal vector adaptor, and decode the two feature maps, to obtain the reflectance image of the image to be decomposed.
  • when the last layer of the shared encoder is a coding unit, the first fused feature map output by the last coding unit and the second fused feature maps output by the normal vector adaptors of the other coding units can be obtained for decoding.
  • the reflectivity decoder can simultaneously acquire the second fused feature maps output by the multiple coding units for decoding.
  • for example, if the reflectivity decoder obtains the second fused feature maps output by three sequentially connected coding units, three sequentially connected convolutional layers (such as up-projection blocks) can be set up to respectively obtain the second fused feature maps output by the three coding units for decoding.
  • the first convolutional layer of the reflectivity decoder can obtain the first fused feature map output by the shared encoder and the second fused feature map output by the first normal vector adaptor for decoding, and output the feature map.
  • the second convolutional layer of the reflectivity decoder can use the feature map output from the previous convolutional layer and the second fused feature map output from the second normal vector adaptor for decoding.
  • the convolutional layers of the reflectivity decoder decode by using the first fused feature map and the second fused feature maps; after that, several further convolutional layers can be used for decoding to adjust the reflectivity map finally output by the reflectivity decoder.
  • In this way, the image to be decomposed is decomposed by using the first fused feature map, which contains the scene structure feature information of the scene in the image to be decomposed; by using this information, each object in the scene is assigned a more consistent reflectivity, which improves the decomposition effect of the intrinsic image.
  • FIG. 6 is a schematic frame diagram of an image decomposition model in an image decomposition method according to an embodiment of the present disclosure.
  • the image decomposition model 60 includes: a shared encoder 61 , an illumination rate decoder 62 and a reflectance decoder 63 .
  • the image decomposition model 60 is, for example, a fully convolutional neural network.
  • the shared encoder 61 includes a convolution block 611 (conv1 as shown) and several coding units 612 .
  • the encoding unit 612 includes a normal vector adaptor 6121 (eg, NFA1, NFA2, NFA3).
  • the normal vector adaptor 6121 may be connected with some of the coding blocks of the normal vector estimation model.
  • the illumination rate decoder 62 includes several convolutional blocks 621, some of which are up-projection blocks (up-projection block 1 to up-projection block 4 as shown in the figure).
  • the reflectivity decoder 63 includes several convolution blocks 631, some of which are up-projection blocks (up-projection block 5 to up-projection block 8 as shown in the figure).
  • the normal vector adaptor 6121 is skip-linked to the partial convolution block 621 of the illumination rate decoder 62 and the partial convolution block 631 of the reflectivity decoder 63, respectively.
  • the image decomposition model 60 may process the to-be-decomposed image to obtain scene illumination condition information of the to-be-decomposed image.
  • the image decomposition model 60 may also obtain the illumination rate image of the image to be decomposed based on the scene illumination condition information and the normal vector information output by the normal vector estimation model.
  • the image decomposition model 60 may also output a reflectance image.
  • the shared encoder 61 can perform feature extraction on the image to be decomposed to obtain an image feature map, and fuse the image feature map with the first scene structure feature map output by the normal vector encoder of the normal vector estimation model, and output the first fused feature map.
  • the convolution block located before the encoding unit 612 may perform feature extraction on the to-be-decomposed image to obtain the image feature map mentioned in the above embodiments.
  • the coding unit 612 can use the normal vector adaptor 6121 to fuse the feature map output by the previous coding unit and the scene structure feature map output by the encoder of the normal vector estimation model to obtain a second fused feature map corresponding to the coding unit.
  • Y represents the scene structure feature map output by the encoder of the normal vector estimation model.
  • the convolution block located after the coding unit 612 may further encode the second fused feature map output by the last coding unit, and finally output the first fused feature map.
  • the coding unit 612 may further include a down-sampling convolution block 6122 (referred to as a down-sampling block; down-sampling block 1 to down-sampling block 4 as shown in the figure) for down-sampling the feature map output by the previous coding unit.
  • Illumination rate decoder 62 may include 5 convolutional blocks.
  • the last layer of convolution block 621 (conv4 as shown in the figure) outputs scene lighting condition information of the image to be decomposed, such as a normal vector adaptation map.
  • A represents the normal vector adaptation map
  • N represents the normal vector map output by the refinement sub-network.
  • the image decomposition model 60 takes the dot product of the normal vector adaptation map A and the normal vector map N to obtain the illumination rate image.
  • the reflectivity decoder 63 may include five convolution blocks 631, which perform layer-by-layer decoding on the first fused feature map output by the shared encoder; the last convolution block 631 (conv6 as shown in the figure) directly outputs the reflectivity image.
  • the embodiments of the present disclosure also provide the training methods for the normal vector estimation model and the image decomposition model mentioned in the above-mentioned image decomposition method embodiments.
  • the normal vector estimation model and the image decomposition model may be trained first.
  • the normal vector estimation model contains an independent normal vector encoder, normal vector decoder and refinement sub-network; therefore, separate training of the normal vector estimation model can be implemented, and separate training of the image decomposition model can likewise be achieved.
  • the normal vector estimation model and the image decomposition model are obtained by training separately; that is, during training, the two models can be trained independently of each other.
  • because the normal vector estimation model can be trained separately, it can be trained using only normal vector sample data; the trained model then improves the decomposition effect of the intrinsic image and reduces the impact that the scarcity of intrinsic image sample data has on the decomposition effect.
  • when the normal vector estimation model is trained, it may be obtained by training with a first sample set, wherein the images in the first sample set are marked with normal vector information.
  • the normal vector information is, for example, that each pixel in the image has a corresponding normal vector.
  • the first sample set includes, for example, the NYUv2 dataset and the Dense Indoor and Outdoor DEpth (DIODE) dataset.
  • the sample normal vector information of the images in the second sample set can be obtained by using the trained normal vector estimation model, and the image decomposition model can be trained by using the second sample set and the sample normal vector information.
  • the images of the second sample set may be annotated with the ground truth illumination rate map and the ground truth reflectivity map.
  • the second sample set is, for example, a CGI data set.
  • the second sample set includes a first sub-sample set and a second sub-sample set.
  • the images of the first sub-sample set may be marked with the ground truth of the illumination rate map, and the images of the second sub-sample set may be marked with the ground truth of the reflectivity map.
  • when training the image decomposition model, the following steps 1 and 2 may be performed.
  • Step 1 The image decomposition model is trained by using the first sub-sample set and the sample normal vector information corresponding to the first sub-sample set, so as to adjust the parameters of the shared encoder and the illumination rate decoder in the image decomposition model.
  • the normal vector information corresponding to the first sub-sample set is obtained by using the trained normal vector estimation model.
  • the training of the shared encoder and the illumination rate decoder in the image decomposition model can be achieved by utilizing the first sub-sample set annotated with the ground truth of the illumination rate map.
  • Step 2 The image decomposition model is trained by using the second sub-sample set and the sample normal vector information corresponding to the second sub-sample set, so as to adjust the parameters of the shared encoder and the reflectivity decoder in the image decomposition model.
  • the normal vector information corresponding to the second sub-sample set is obtained by using the trained normal vector estimation model.
  • after the shared encoder and the illumination rate decoder in the image decomposition model are trained with the first sub-sample set marked with the ground truth of the illumination rate map, the shared encoder and the reflectivity decoder in the image decomposition model can be further trained on this basis; specifically, they can be trained by using the second sub-sample set marked with the ground truth of the reflectivity map.
  • the training effect can be judged according to the relevant loss function, and then the network parameters of each model can be adjusted according to the size of the loss value to complete the training.
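  • A hedged sketch of this two-stage training, assuming PyTorch. The L1 loss and the data-loader interface are assumptions made here for illustration; the disclosure only states that each stage adjusts the shared encoder together with one decoder according to a loss function.

```python
import torch.nn.functional as F

def train_stage(model, loader, optimizer, head: str):
    """One training stage; `head` selects which decoder is supervised."""
    for image, normal_map, target in loader:  # target: illumination-rate or reflectivity GT
        illumination_rate, reflectivity = model(image, normal_map)
        pred = illumination_rate if head == "illumination" else reflectivity
        loss = F.l1_loss(pred, target)  # loss choice is an assumption
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Stage 1: first sub-sample set, ground-truth illumination rate maps
# train_stage(decomp_model, loader_illum, opt_shared_and_illum, head="illumination")
# Stage 2: second sub-sample set, ground-truth reflectivity maps
# train_stage(decomp_model, loader_refl, opt_shared_and_refl, head="reflectivity")
```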
  • In this way, the image decomposition model can obtain better illumination rate images and reflectivity images when decomposing the image to be decomposed.
  • In this way, the image decomposition model can use the normal vector information to better understand the environmental conditions of the scene in the image to be decomposed, so that the intrinsic image obtained by the image decomposition model better matches the scene of the image to be decomposed, which improves the decomposition effect of the intrinsic image;
  • in addition, the normal vector information of the image to be decomposed is obtained by a normal vector estimation model that is independent of the image decomposition model, so a dedicated model can be used to obtain accurate normal vector information, which further improves the degree to which the intrinsic image obtained by subsequent decomposition matches the scene of the image to be decomposed.
  • the writing order of the steps does not imply a strict execution order and does not constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
  • FIG. 7 is a schematic frame diagram of an image decomposition apparatus according to an embodiment of the present disclosure.
  • the image decomposition apparatus 70 includes an acquisition module 71 , a normal vector estimation module 72 and a decomposition module 73 .
  • the acquisition module 71 is configured to perform the acquisition of the image to be decomposed;
  • the normal vector estimation module 72 is configured to obtain the normal vector information of the image to be decomposed by using the normal vector estimation model;
  • the decomposition module 73 is configured to decompose the image to be decomposed by using the image decomposition model based on the normal vector information, so as to obtain the intrinsic image of the image to be decomposed.
  • the above-mentioned intrinsic images include illuminance images.
  • the above-mentioned decomposition module 73 is configured to: process the image to be decomposed by using the image decomposition model to obtain scene illumination condition information of the image to be decomposed; and obtain the illuminance image of the image to be decomposed based on the scene illumination condition information and the normal vector information.
  • the above-mentioned scene lighting condition information is a normal vector adaptation map including normal vector adaptation vectors of different pixels of the image to be decomposed;
  • the normal vector information is a normal vector map including normal vectors of different pixels of the image to be decomposed.
  • the above-mentioned decomposition module 73 is configured to: perform a per-pixel dot product of the normal vector adaptation map and the normal vector map to obtain the illuminance image of the image to be decomposed, as sketched below.
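By way of illustration only, a minimal sketch of this per-pixel dot product follows, assuming both maps are tensors of shape (N, 3, H, W) whose three channels hold the x/y/z components; the function name and tensor layout are assumptions introduced here, not identifiers from the disclosure.

```python
import torch

def shading_from_normals(adapt_map: torch.Tensor, normal_map: torch.Tensor) -> torch.Tensor:
    """Per-pixel dot product of the normal vector adaptation map and the
    normal vector map, yielding a single-channel illuminance (shading) image.

    adapt_map, normal_map: (N, 3, H, W) tensors (assumed layout).
    Returns: (N, 1, H, W) illuminance image.
    """
    return (adapt_map * normal_map).sum(dim=1, keepdim=True)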
  • the above-mentioned image decomposition model includes a shared encoder and an illuminance decoder.
  • the above-mentioned decomposition module 73 is configured to: use the shared encoder to perform feature extraction on the image to be decomposed to obtain an image feature map; fuse the image feature map with the first scene structure feature map output by the normal vector encoder of the normal vector estimation model to obtain the first fusion feature map; and decode the first fusion feature map with the illuminance decoder to obtain the scene illumination condition information of the image to be decomposed.
  • the above-mentioned shared encoder includes at least one coding unit connected in sequence, and each coding unit includes a normal vector adaptor. The above-mentioned decomposition module 73 is configured to: output the image feature map to the first coding unit; for each coding unit, use the normal vector adaptor to fuse the feature map output by the previous coding unit with the first scene structure feature map to obtain the second fusion feature map corresponding to the coding unit, wherein the feature richness of the scene structure feature map corresponding to each coding unit is different; and obtain the first fusion feature map based on the second fusion feature map of the last coding unit.
  • the above-mentioned decomposition module 73 is configured to: down-sample the feature map output by the previous coding unit before the normal vector adaptor fuses it with the scene structure feature map to obtain the second fusion feature map corresponding to the coding unit. And/or, the above-mentioned decomposition module 73 is configured to use the normal vector adaptor to: adjust the scene structure feature map to a preset scale, and concatenate and convolve the adjusted scene structure feature map with the feature map output by the previous coding unit to obtain the second fusion feature map corresponding to the coding unit; one possible form of such an adaptor is sketched below.
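By way of illustration only, the sketch below gives one possible form of the normal vector adaptor just described: the previous coding unit's output is down-sampled, the scene structure feature map is resized to the matching (preset) scale, and the two are concatenated and convolved into the second fusion feature map. The class name, channel counts, average pooling, and bilinear resizing are assumptions introduced for this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalVectorAdaptor(nn.Module):
    """One possible reading of the adaptor described above: resize the scene
    structure feature map to the coding unit's scale, concatenate it with the
    down-sampled feature map from the previous coding unit, and fuse the
    result with a convolution. Channel counts are illustrative."""

    def __init__(self, feat_ch: int, scene_ch: int, out_ch: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(feat_ch + scene_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, prev_feat: torch.Tensor, scene_feat: torch.Tensor) -> torch.Tensor:
        # Down-sample the previous coding unit's output (as described above).
        prev_feat = F.avg_pool2d(prev_feat, kernel_size=2)
        # Adjust the scene structure feature map to the matching scale.
        scene_feat = F.interpolate(scene_feat, size=prev_feat.shape[-2:],
                                   mode='bilinear', align_corners=False)
        # Concatenate and convolve to obtain the second fusion feature map.
        return self.fuse(torch.cat([prev_feat, scene_feat], dim=1))
```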
  • the above-mentioned decomposition module 73 is configured to: use the illuminance decoder to decode the first fusion feature map and the second fusion feature map of at least one normal vector adaptor to obtain the scene lighting condition information of the image to be decomposed.
  • the above-mentioned image decomposition model further includes a reflectivity decoder.
  • the above-mentioned decomposition module 73 is configured to: decode the first fusion feature map by using a reflectivity decoder to obtain a reflectivity image of the image to be decomposed.
  • the above-mentioned decomposition module 73 is configured to: use the reflectivity decoder to decode the first fusion feature map and the second fusion feature map of at least one normal vector adaptor to obtain the reflectivity image of the image to be decomposed; a decoder sketch that fits either branch is given below.
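By way of illustration only, the following sketch shows a decoder skeleton that could serve either the illuminance branch or the reflectivity branch: it decodes the first fusion feature map while consuming the adaptors' second fusion feature maps as skip connections. The class name, channel counts, depth, and bilinear upsampling are assumptions introduced for this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionDecoder(nn.Module):
    """Sketch of a decoder usable for either the illuminance or the
    reflectivity branch: it decodes the first fusion feature map and takes
    the adaptors' second fusion feature maps as skip connections."""

    def __init__(self, chans=(256, 128, 64), skip_chans=(128, 64), out_ch=3):
        super().__init__()
        self.blocks = nn.ModuleList()
        for i in range(len(chans) - 1):
            in_ch = chans[i] + (skip_chans[i] if i < len(skip_chans) else 0)
            self.blocks.append(nn.Sequential(
                nn.Conv2d(in_ch, chans[i + 1], 3, padding=1), nn.ReLU(inplace=True)))
        self.head = nn.Conv2d(chans[-1], out_ch, 1)

    def forward(self, first_fused, second_fused_list):
        x = first_fused
        for i, block in enumerate(self.blocks):
            if i < len(second_fused_list):
                # Resize each second fusion feature map to the current scale.
                skip = F.interpolate(second_fused_list[i], size=x.shape[-2:],
                                     mode='bilinear', align_corners=False)
                x = torch.cat([x, skip], dim=1)
            x = block(x)
            x = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
        return self.head(x)
```

Instantiating this class twice, once per branch, mirrors the shared-encoder/dual-decoder split described above.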
  • the above-mentioned normal vector estimation model includes a normal vector encoder, a normal vector decoder and a refinement sub-network.
  • the above-mentioned normal vector estimation module 72 is configured to: use the normal vector encoder to encode the image to be decomposed to obtain the first scene structure feature map; use the normal vector decoder to decode the first scene structure feature map to obtain the decoded feature map; and use the refinement sub-network to fuse the first scene structure feature map and the decoded feature map to obtain the normal vector information of the image to be decomposed.
  • the above-mentioned normal vector estimation module 72 is configured to: use the normal vector encoder to perform multi-layer encoding on the image to be decomposed to obtain a first scene structure feature map corresponding to each layer, wherein the feature richness of the first scene structure feature maps corresponding to different layers is different, and output the first scene structure feature map corresponding to the last layer to the normal vector decoder.
  • the above-mentioned normal vector estimation module 72 is configured to use the refinement sub-network to: concatenate the first scene structure feature maps corresponding to the layers to obtain a second scene structure feature map; concatenate the second scene structure feature map with the decoded feature map to obtain a third scene structure feature map; and obtain the normal vector information of the image to be decomposed based on the third scene structure feature map, as sketched below.
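By way of illustration only, one possible refinement sub-network is sketched below. Since the per-layer first scene structure feature maps differ in scale, this sketch resizes them to a common resolution before concatenation; that resizing, the class name, the channel counts, and the final unit-length normalization are all assumptions introduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RefinementSubNetwork(nn.Module):
    """Sketch of the refinement sub-network described above: the per-layer
    first scene structure feature maps are concatenated into a second scene
    structure feature map, which is concatenated with the decoded feature map
    into a third scene structure feature map, from which normal vectors are
    regressed."""

    def __init__(self, scene_chans=(64, 128, 256), dec_ch=64):
        super().__init__()
        total = sum(scene_chans) + dec_ch
        self.head = nn.Sequential(
            nn.Conv2d(total, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 1))            # 3 channels: x/y/z normal components

    def forward(self, scene_feats, decoded):
        size = decoded.shape[-2:]
        # Second scene structure feature map: per-layer maps resized and concatenated.
        resized = [F.interpolate(f, size=size, mode='bilinear', align_corners=False)
                   for f in scene_feats]
        second = torch.cat(resized, dim=1)
        # Third scene structure feature map: concatenate with the decoded feature map.
        third = torch.cat([second, decoded], dim=1)
        # Normalize to unit-length normal vectors per pixel.
        return F.normalize(self.head(third), dim=1)
```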
  • the above-mentioned normal vector estimation model and image decomposition model are obtained by training separately.
  • the image decomposition apparatus 70 further includes a training module, which is configured to, before the normal vector estimation module 72 obtains the normal vector information of the image to be decomposed by using the normal vector estimation model: train the normal vector estimation model by using the first sample set, wherein the images in the first sample set are marked with normal vector information; obtain sample normal vector information of the images in the second sample set by using the trained normal vector estimation model; and train the image decomposition model by using the second sample set and the sample normal vector information.
  • the above-mentioned second sample set includes a first sub-sample set and a second sub-sample set.
  • the above-mentioned training module is configured to: train the image decomposition model by using the first sub-sample set and the sample normal vector information corresponding to the first sub-sample set, so as to adjust the parameters of the shared encoder and the illuminance decoder in the image decomposition model; and train the image decomposition model by using the second sub-sample set and the sample normal vector information corresponding to the second sub-sample set, so as to adjust the parameters of the shared encoder and the reflectivity decoder in the image decomposition model.
  • FIG. 8 is a schematic diagram of a frame of an electronic device according to an embodiment of the present disclosure.
  • the electronic device 80 includes a memory 81 and a processor 82 coupled to each other, and the processor 82 is configured to execute program instructions stored in the memory 81 to implement the steps of any of the above image decomposition method embodiments.
  • the electronic device 80 may include, but is not limited to, a microcomputer and a server.
  • the electronic device 80 may also include mobile devices such as a notebook computer and a tablet computer, which are not limited herein.
  • the processor 82 is configured to control itself and the memory 81 to implement the steps of any of the above image decomposition method embodiments.
  • the processor 82 may also be referred to as a central processing unit (Central Processing Unit, CPU).
  • the processor 82 may be an integrated circuit chip with signal processing capability.
  • the processor 82 may also be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the processor 82 may also be jointly implemented by multiple integrated circuit chips.
  • FIG. 9 is a schematic diagram of a framework of a computer-readable storage medium according to an embodiment of the disclosure.
  • the computer-readable storage medium 90 stores program instructions 901 that can be executed by the processor, and the program instructions 901 are used to implement the steps of any of the above image decomposition method embodiments.
  • the image decomposition model can use the normal vector information to better understand the scene environment in the image to be decomposed, so that the intrinsic image decomposed by the image decomposition model better matches the scene of the image to be decomposed, which improves the decomposition effect of the intrinsic image;
  • the normal vector information of the image to be decomposed is obtained by using a normal vector estimation model independent of the image decomposition model, so a targeted model can be used to obtain accurate normal vector information, which further improves the matching degree between the intrinsic image obtained by subsequent decomposition and the scene of the image to be decomposed.
  • the functions or modules included in the apparatuses provided in the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments.
  • the disclosed method and apparatus may be implemented in other manners.
  • the device implementations described above are only illustrative.
  • the division of modules or units is only a logical function division, and there may be other division manners in actual implementation; for example, units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium.
  • the technical solutions of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the various embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other media that can store program code.

Abstract

The present disclosure relates to an image decomposition method and a related apparatus and device. The method comprises: acquiring an image to be decomposed (S11); using a normal vector estimation model to acquire normal vector information of the image to be decomposed (S12); and, based on the normal vector information, using an image decomposition model to decompose the image to be decomposed to obtain an intrinsic image of the image to be decomposed (S13).
PCT/CN2021/114023 2020-08-31 2021-08-23 Procédé de décomposition d'image et appareil et dispositif associés WO2022042470A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010898798.1 2020-08-31
CN202010898798.1A CN112053338A (zh) 2020-08-31 2020-08-31 图像分解方法和相关装置、设备

Publications (1)

Publication Number Publication Date
WO2022042470A1 true WO2022042470A1 (fr) 2022-03-03

Family

ID=73608057

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/114023 WO2022042470A1 (fr) 2020-08-31 2021-08-23 Procédé de décomposition d'image et appareil et dispositif associés

Country Status (2)

Country Link
CN (1) CN112053338A (fr)
WO (1) WO2022042470A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112053338A (zh) * 2020-08-31 2020-12-08 浙江商汤科技开发有限公司 图像分解方法和相关装置、设备
CN115222930B (zh) * 2022-09-02 2022-11-29 四川蜀天信息技术有限公司 一种基于WebGL的3D模型的编排组合的方法

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160328630A1 (en) * 2015-05-08 2016-11-10 Samsung Electronics Co., Ltd. Object recognition apparatus and method
CN105447906A (zh) * 2015-11-12 2016-03-30 浙江大学 基于图像和模型计算光照参数进行重光照渲染的方法
CN106296749A (zh) * 2016-08-05 2017-01-04 天津大学 基于l1范数约束的rgb‑d图像本征分解方法
CN111445582A (zh) * 2019-01-16 2020-07-24 南京大学 一种基于光照先验的单张图像人脸三维重建方法
CN110428491A (zh) * 2019-06-24 2019-11-08 北京大学 基于单帧图像的三维人脸重建方法、装置、设备及介质
CN110647859A (zh) * 2019-09-29 2020-01-03 浙江商汤科技开发有限公司 人脸图像分解方法和装置、电子设备及存储介质
CN112053338A (zh) * 2020-08-31 2020-12-08 浙江商汤科技开发有限公司 图像分解方法和相关装置、设备

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117095158A (zh) * 2023-08-23 2023-11-21 广东工业大学 一种基于多尺度分解卷积的太赫兹图像危险品检测方法
CN117095158B (zh) * 2023-08-23 2024-04-26 广东工业大学 一种基于多尺度分解卷积的太赫兹图像危险品检测方法

Also Published As

Publication number Publication date
CN112053338A (zh) 2020-12-08

Similar Documents

Publication Publication Date Title
WO2022042470A1 (fr) Procédé de décomposition d'image et appareil et dispositif associés
CN111210435B (zh) 一种基于局部和全局特征增强模块的图像语义分割方法
US11151780B2 (en) Lighting estimation using an input image and depth map
CN112001914A (zh) 深度图像补全的方法和装置
CN112183150B (zh) 图像二维码及其制备方法、解析装置和解析方法
Xu et al. Multi-exposure image fusion techniques: A comprehensive review
US20180300531A1 (en) Computer-implemented 3d model analysis method, electronic device, and non-transitory computer readable storage medium
CN112396607A (zh) 一种可变形卷积融合增强的街景图像语义分割方法
CN113870335A (zh) 一种基于多尺度特征融合的单目深度估计方法
CN114038006A (zh) 一种抠图网络训练方法及抠图方法
CN113850324B (zh) 一种基于Yolov4的多光谱目标检测方法
CN114969417B (zh) 图像重排序方法、相关设备及计算机可读存储介质
Liu et al. Band-independent encoder–decoder network for pan-sharpening of remote sensing images
CN115358917B (zh) 一种手绘风格非对齐人脸迁移方法、设备、介质和***
CN112348819A (zh) 模型训练方法、图像处理及配准方法以及相关装置、设备
CN112241955A (zh) 三维图像的碎骨分割方法、装置、计算机设备及存储介质
Zeng et al. Self-attention learning network for face super-resolution
CN113538662B (zh) 一种基于rgb数据的单视角三维物体重建方法及装置
Dumka et al. Advanced digital image processing and its applications in big data
CN112560544A (zh) 一种遥感图像地物识别方法、***和计算机可读存储介质
CN112990213A (zh) 一种基于深度学习的数字万用表字符识别***和方法
CN112580645A (zh) 基于卷积稀疏编码的Unet语义分割方法
Wang et al. Unpaired image-to-image shape translation across fashion data
CN113240589A (zh) 一种多尺度特征融合的图像去雾方法及***
Dong et al. ViT-SAPS: Detail-aware transformer for mechanical assembly semantic segmentation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21860305

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21860305

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.09.2023)