WO2021057848A1 - Network training method, image processing method, network, terminal device, and medium - Google Patents

Network training method, image processing method, network, terminal device, and medium Download PDF

Info

Publication number
WO2021057848A1
WO2021057848A1 PCT/CN2020/117470 CN2020117470W WO2021057848A1 WO 2021057848 A1 WO2021057848 A1 WO 2021057848A1 CN 2020117470 W CN2020117470 W CN 2020117470W WO 2021057848 A1 WO2021057848 A1 WO 2021057848A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
image
mask
edge
neural network
Prior art date
Application number
PCT/CN2020/117470
Other languages
English (en)
French (fr)
Inventor
刘钰安
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Publication of WO2021057848A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • This application relates to the field of image processing technology, and in particular to an image segmentation network training method, image processing method, image segmentation network, terminal equipment, and computer-readable storage medium.
  • The commonly used approach at present is to use a trained image segmentation network to output a mask representing the area where the target object (that is, the foreground, such as a portrait) is located, then use the mask to segment the target object out of the image, and finally change the image background.
  • However, the mask output by current image segmentation networks cannot accurately represent the contour edge of the target object, so the target object cannot be segmented precisely and the result of replacing the image background is poor. Therefore, how to make the mask output by an image segmentation network represent the contour edge of the target object more accurately is a technical problem that urgently needs to be solved.
  • The purpose of the embodiments of this application is to provide an image segmentation network training method, an image processing method, an image segmentation network, a terminal device, and a computer-readable storage medium, which can, to a certain extent, enable the mask output by the trained image segmentation network to represent the contour edge of the target object more accurately.
  • an image segmentation network training method which includes steps S101-S105:
  • S101: Obtain sample images each containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, where each sample mask is used to indicate the image area in the corresponding sample image where the target object is located, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located;
  • S102 For each sample image, input the sample image to an image segmentation network, and obtain a generation mask output by the image segmentation network for indicating the area where the target object in the sample image is located;
  • S103: For each generated mask, input the generated mask into the trained edge neural network to obtain generated edge information output by the edge neural network, where the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located;
  • S104: Determine the loss function of the image segmentation network, where the loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image;
  • S105 Adjust various parameters of the image segmentation network, and then return to execute S102 until the loss function of the image segmentation network is less than a first preset threshold, thereby obtaining a trained image segmentation network.
  • an image processing method including:
  • Obtain an image to be processed, and input the image to be processed into a trained image segmentation network to obtain a mask corresponding to the image to be processed, wherein the trained image segmentation network is obtained by training with a trained edge neural network, and the trained edge neural network is used to output, according to an input mask, the contour edge of the area where the target object indicated by that mask is located;
  • Based on the mask corresponding to the image to be processed, segment out the target object contained in the image to be processed.
  • an image segmentation network is provided, and the image segmentation network is obtained by training using the training method described in the first aspect.
  • A terminal device is provided, including a memory, a processor, and a computer program stored in the memory and runnable on the processor.
  • When the processor executes the computer program, the steps of the method described in the first aspect or the second aspect are implemented.
  • a computer-readable storage medium stores a computer program, and when the above-mentioned computer program is executed by a processor, the steps of the method described in the first aspect or the second aspect are implemented.
  • a computer program product includes a computer program, and when the computer program is executed by one or more processors, the steps of the method described in the first aspect or the second aspect are implemented.
  • In the training method provided by this application, when the image segmentation network is trained, a trained edge neural network is used to train the image segmentation network.
  • As shown in Figure 1, the trained edge neural network 001 outputs generated edge information 003 according to the image area (the pure white area) where the target object is located, as indicated by the generated mask 002 input to the edge neural network 001; the edge information is used to indicate the location of the contour edge of that image area, and the generated edge information 003 in Figure 1 is presented in the form of an image.
  • The training method provided by this application includes the following steps. First, for each sample image, the sample image is input to the image segmentation network to obtain the generated mask output by the image segmentation network, and the generated mask is input to the trained edge neural network to obtain the generated edge information output by the edge neural network. Secondly, the loss function of the image segmentation network is determined; the loss function is positively correlated with the mask gap corresponding to each sample image (the mask gap corresponding to a sample image is the gap between the sample mask corresponding to that sample image and the generated mask), and the loss function is also positively correlated with the edge gap corresponding to each sample image (the edge gap corresponding to a sample image is the gap between the sample edge information corresponding to that sample image and the generated edge information). Finally, the parameters of the image segmentation network are adjusted until the loss function is less than the first preset threshold.
  • While ensuring that the generated mask output by the image segmentation network approaches the sample mask, the above training method further ensures that the contour edge of the target object represented in the generated mask output by the image segmentation network approaches the actual contour edge. Therefore, the mask image output by the image segmentation network provided by this application can represent the contour edge of the target object more accurately.
  • Fig. 1 is a schematic diagram of the working principle of a trained edge neural network provided by the present application
  • FIG. 2 is a schematic diagram of a training method of an image segmentation network provided by Embodiment 1 of the present application;
  • FIG. 3 is a schematic diagram of a sample image, sample mask, and sample edge information provided in Embodiment 1 of the present application;
  • FIG. 4 is a schematic structural diagram of an image segmentation network provided by Embodiment 1 of the present application;
  • FIG. 5 is a schematic diagram of the connection relationship between the image segmentation network provided in the first embodiment of the present application and the trained edge neural network;
  • FIG. 6 is a schematic diagram of the structure of the edge neural network provided in the first embodiment of the present application.
  • FIG. 7 is a schematic diagram of another image segmentation network training method provided in Embodiment 2 of the present application.
  • FIG. 8(a) is a schematic diagram of the training process of the edge segmentation network provided in the second embodiment of the present application.
  • Fig. 8(b) is a schematic diagram of the training process of the image segmentation network provided in the second embodiment of the present application.
  • FIG. 9 is a schematic diagram of the work flow of the image processing method provided in the third embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an image segmentation network training device provided in the fourth embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an image processing apparatus according to Embodiment 5 of the present application.
  • FIG. 12 is a schematic structural diagram of a terminal device according to Embodiment 6 of the present application.
  • the method provided in the embodiments of the present application may be applicable to terminal devices.
  • the terminal devices include, but are not limited to: smart phones, tablet computers, notebooks, desktop computers, cloud servers, and the like.
  • the term “if” can be interpreted as “when” or “once” or “in response to determination” or “in response to detection” depending on the context .
  • Similarly, the phrase “if determined” or “if [the described condition or event] is detected” can be interpreted, depending on the context, as meaning “once determined”, “in response to determining”, “once [the described condition or event] is detected”, or “in response to detecting [the described condition or event]”.
  • the training method includes:
  • In step S101, sample images each containing the target object, the sample mask corresponding to each sample image, and the sample edge information corresponding to each sample mask are obtained, where each sample mask is used to indicate the image area in the corresponding sample image where the target object is located, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located.
  • In the embodiments of this application, a portion of the sample images can first be obtained from a data set, and the number of sample images used for training the image segmentation network can then be expanded by applying mirror flipping, scale changes and/or gamma changes, etc., to the sample images obtained in advance, thereby increasing the number of sample images and obtaining the sample images described in step S101.
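  • A minimal Python/OpenCV sketch of such augmentation, assuming 8-bit images and masks; the specific scale factors and gamma values are illustrative assumptions rather than values specified by this application:

```python
import cv2
import numpy as np

def augment_pair(image, mask):
    """Expand one (sample image, sample mask) pair by mirror flipping,
    scale changes and gamma adjustment. Parameter values are illustrative."""
    pairs = [(image, mask)]

    # Mirror (horizontal) flip: the mask must be flipped together with the image.
    pairs.append((cv2.flip(image, 1), cv2.flip(mask, 1)))

    # Scale changes: shrink and enlarge; nearest-neighbour keeps the mask binary.
    for s in (0.75, 1.25):
        pairs.append((cv2.resize(image, None, fx=s, fy=s, interpolation=cv2.INTER_LINEAR),
                      cv2.resize(mask, None, fx=s, fy=s, interpolation=cv2.INTER_NEAREST)))

    # Gamma changes: only the image is altered; the mask stays unchanged.
    for gamma in (0.8, 1.2):
        table = np.clip(((np.arange(256) / 255.0) ** gamma) * 255.0, 0, 255).astype(np.uint8)
        pairs.append((cv2.LUT(image, table), mask))

    return pairs
```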
  • the sample mask described in this application is a binary image.
  • In step S101, the sample edge information corresponding to a given sample mask may be obtained as follows: perform a dilation (expansion) operation on the sample mask to obtain a dilated mask image, and subtract the sample mask from the dilated mask image to obtain the sample edge information corresponding to that sample mask.
  • Sample edge information obtained in this way is, like the sample mask, a binary image.
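  • A minimal Python/OpenCV sketch of this dilation-and-subtraction step, assuming an 8-bit binary sample mask and an illustrative 5×5 structuring element:

```python
import cv2
import numpy as np

def sample_edge_from_mask(sample_mask, kernel_size=5):
    """Derive sample edge information from a binary sample mask:
    dilate the mask, then subtract the original mask, leaving a thin
    band along the contour of the target-object region."""
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    dilated = cv2.dilate(sample_mask, kernel, iterations=1)
    # The difference between the dilated mask and the original mask is a
    # binary image that is non-zero only around the contour edge.
    edge = cv2.subtract(dilated, sample_mask)
    return edge
```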
  • To give those skilled in the art a more intuitive understanding of the sample image, sample mask, and sample edge information, Figure 3 is used for illustration. As shown in Figure 3, image 201 is a sample image containing a target object (i.e., a portrait), image 202 may be the sample mask corresponding to the sample image 201, and image 203 may be the sample edge information corresponding to that sample mask.
  • In addition, those skilled in the art should understand that the sample edge information is not necessarily a binary image; it may also take other forms of expression, as long as it can reflect the contour edge of the image area where the target object indicated by the sample mask is located.
  • the above-mentioned target object may be any subject, such as a portrait, a dog, a cat, etc., and this application does not limit the category of the target object.
  • the image content contained in each sample image should be as different as possible.
  • the image content contained in sample image 1 can be a frontal portrait of Xiao Ming.
  • the image content contained in the sample image 2 may be a half-profile portrait of Xiaohong.
  • step S102 for each sample image, the sample image is input to the image segmentation network, and a generation mask output by the image segmentation network for indicating the area of the target object in the sample image is obtained.
  • an image segmentation network needs to be established in advance, and the image segmentation network is used to output a mask corresponding to the image (that is, to generate a mask) according to the input image.
  • The image segmentation network may be a CNN (Convolutional Neural Network) or an FPN (Feature Pyramid Network); this application does not limit the specific network structure of the image segmentation network.
  • the image segmentation network using the FPN structure can be specifically referred to in Figure 4.
  • After the image segmentation network has been established, step S102 is executed to train the image segmentation network.
  • each sample image needs to be input to the image segmentation network to obtain each generation mask output by the image segmentation network, where each generation mask corresponds to a sample image.
  • the "generating mask" described in this step is the same as the sample mask described in step S101, and may be a binary image.
  • In step S103, for each generated mask, the generated mask is input to the trained edge neural network to obtain the generated edge information output by the edge neural network, where the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located.
  • Before step S103 is performed, a trained edge neural network needs to be obtained.
  • The trained edge neural network is used to output generated edge information according to an input generated mask.
  • The generated edge information is used to indicate the contour edge of the area where the target object indicated by the input generated mask is located.
  • the edge neural network after training may be as shown in FIG. 1.
  • In Figure 1, after the generated mask shown at 002 is input into the trained edge neural network shown at 001, the trained edge neural network shown at 001 outputs the generated edge information shown at 003.
  • After the trained edge neural network is obtained, each of the generated masks described in step S102 is input to the trained edge neural network, and each piece of generated edge information output by the trained edge neural network is obtained, where each piece of generated edge information corresponds to one generated mask and is used to represent the contour edge of the image area where the target object indicated by that generated mask is located.
  • the connection between the image segmentation network and the trained edge neural network is shown in FIG. 5.
  • In step S104, the loss function of the above image segmentation network is determined.
  • The loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image.
  • Those skilled in the art will readily understand that each sample image corresponds to a sample mask, sample edge information, a generated mask, and generated edge information.
  • To obtain the loss function described in step S104, for each sample image it is necessary to calculate the gap between the sample mask corresponding to the sample image and the generated mask (for convenience of subsequent description, for a given sample image, the gap between its sample mask and its generated mask is defined as the mask gap corresponding to that sample image), and it is also necessary to calculate the gap between the sample edge information corresponding to the sample image and the generated edge information (for convenience of subsequent description, for a given sample image, the gap between its sample edge information and its generated edge information is defined as the edge gap corresponding to that sample image).
  • In step S104, the loss function of the above image segmentation network needs to be calculated.
  • The loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image; in other words, the loss function is positively correlated with the mask gap corresponding to each sample image,
  • and the loss function is also positively correlated with the edge gap corresponding to each sample image.
  • the calculation process of the aforementioned loss function may be:
  • Step A: For each sample image, calculate the image difference between the generated mask corresponding to the sample image and the sample mask corresponding to the sample image (for example, Σ_{i=1..M} |m1_i − m2_i|, where m1_i is the pixel value of the i-th pixel of the generated mask, m2_i is the pixel value of the i-th pixel of the sample mask, and M is the total number of pixels of the generated mask).
  • Step B: If the above sample edge information and generated edge information are both images, then for each sample image, calculate the image difference between the sample edge information corresponding to the sample image and the generated edge information corresponding to the sample image (for the calculation of the image difference, refer to step A).
  • Step C: The image differences obtained in step A and the image differences obtained in step B can be averaged (if the number of sample images is N, sum the image differences obtained in step A and the image differences obtained in step B, and then divide by 2N) to obtain the loss function.
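  • A minimal NumPy sketch of steps A-C, assuming masks and edge maps of equal size with values in [0, 1] and taking the "image difference" to be a mean absolute per-pixel difference (an assumption consistent with the per-pixel comparison in step A):

```python
import numpy as np

def image_difference(a, b):
    """Step A / B: per-pixel difference between two maps of equal shape,
    averaged over the M pixels (mean absolute difference)."""
    return float(np.abs(a.astype(np.float32) - b.astype(np.float32)).mean())

def loss_steps_abc(generated_masks, sample_masks, generated_edges, sample_edges):
    """Step C: sum the N mask differences and the N edge differences,
    then divide by 2N to obtain the loss function."""
    n = len(sample_masks)
    mask_diffs = [image_difference(g, s) for g, s in zip(generated_masks, sample_masks)]
    edge_diffs = [image_difference(g, s) for g, s in zip(generated_edges, sample_edges)]
    return (sum(mask_diffs) + sum(edge_diffs)) / (2 * n)
```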
  • the calculation method of the aforementioned loss function is not limited to the aforementioned step A-step C.
  • In the embodiments of this application, the aforementioned loss function can also be calculated by the following formula (1):
  • LOSS_1 = (1/N) · Σ_{j=1..N} (F1_j + F2_j)    (1)
  • where LOSS_1 is the loss function of the above image segmentation network, N is the total number of sample images, F1_j is used to measure the gap between the sample mask corresponding to the j-th sample image and the generated mask, and F2_j is used to measure the gap between the sample edge information corresponding to the j-th sample image and the generated edge information.
  • In the embodiments of this application, F1_j may be calculated as the cross-entropy loss between the sample mask and the generated mask corresponding to the j-th sample image, as in the following formula (2):
  • F1_j = −(1/M) · Σ_{i=1..M} [ y_ji · log_x(p_ji) + (1 − y_ji) · log_x(1 − p_ji) ]    (2)
  • where M is the total number of pixels in the j-th sample image, the value of y_ji is determined according to the sample mask corresponding to the j-th sample image, y_ji is used to indicate whether the i-th pixel in the j-th sample image lies in the image area where the target object is located, p_ji is the probability predicted by the image segmentation network that the i-th pixel in the j-th sample image lies in the image area where the target object is located, and x is the base of the logarithm log.
  • In the embodiments of this application, the value of y_ji is determined according to the sample mask corresponding to the j-th sample image. For example, if the sample mask corresponding to the j-th sample image indicates that the i-th pixel in the j-th sample image lies in the image area where the target object is located, y_ji can be 1; if that sample mask indicates that the i-th pixel does not lie in the image area where the target object is located, y_ji can be 0. Those skilled in the art should understand that the value of y_ji is not limited to 1 and 0 and can also take other values; the value of y_ji is preset, for example 1 or 0.
  • The value of y_ji when the sample mask indicates that the i-th pixel lies in the image area where the target object is located is greater than the value of y_ji when the sample mask indicates that the i-th pixel does not lie in that image area.
  • That is, if the sample mask indicates that the i-th pixel lies in the image area where the target object is located, y_ji is 1, otherwise y_ji is 0; or, if the sample mask indicates that the i-th pixel lies in that image area, y_ji is 2, otherwise y_ji is 1; or, if the sample mask indicates that the i-th pixel lies in that image area, y_ji is 0.8, otherwise y_ji is 0.2.
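  • A minimal NumPy sketch of the cross-entropy term F1_j, assuming y_ji ∈ {0, 1}, natural logarithms (x = e), and probabilities clipped away from 0 and 1; these choices are illustrative assumptions:

```python
import numpy as np

def f1_cross_entropy(p, y, eps=1e-7):
    """Per-image cross-entropy between the generated mask probabilities p
    (p_ji: probability that pixel i lies in the target-object region) and
    the labels y derived from the sample mask (y_ji = 1 inside the target
    region, 0 outside), averaged over the M pixels."""
    p = np.clip(p.astype(np.float64), eps, 1.0 - eps)
    y = y.astype(np.float64)
    return float(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)).mean())
```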
  • the calculation method of F2 j may be similar to the above formula (2), that is, calculating the cross entropy loss of the sample edge information corresponding to the j-th sample image and the generated edge information.
  • Alternatively, F2_j may be calculated by the following formula (3):
  • F2_j = Σ_c λ_c · || h_c(mask_1) − h_c(mask_2) ||    (3)
  • where mask_1 is the generated mask corresponding to the j-th sample image, mask_2 is the sample mask corresponding to the j-th sample image, h_c(mask_1) is the output of the c-th convolutional block when the input of the trained edge neural network is mask_1, h_c(mask_2) is the output of the c-th convolutional block when the input of the trained edge neural network is mask_2, and λ_c is a constant.
  • That is, the gap between the sample edge information and the generated edge information can be measured by the above formula (3).
  • the edge neural network can be formed by cascading three convolutional blocks, and each convolutional block is a convolutional layer.
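  • A PyTorch sketch of one possible edge neural network with three cascaded convolutional blocks (one convolutional layer each) and of the block-wise gap between mask_1 and mask_2 described above; the channel counts, kernel sizes, activation functions, L1 norm, and λ_c weights are assumptions made only for illustration:

```python
import torch
import torch.nn as nn

class EdgeNet(nn.Module):
    """Edge neural network: three cascaded convolutional blocks, each block
    a single convolutional layer (channel sizes and activations assumed)."""
    def __init__(self):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        self.block3 = nn.Sequential(nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, mask):
        h1 = self.block1(mask)   # h_1(mask)
        h2 = self.block2(h1)     # h_2(mask)
        h3 = self.block3(h2)     # h_3(mask): generated edge information
        return h1, h2, h3

def edge_gap_f2(edge_net, mask1, mask2, lambdas=(1.0, 1.0, 1.0)):
    """Block-wise gap between the generated mask (mask1) and the sample mask
    (mask2): sum over blocks c of lambda_c * |h_c(mask1) - h_c(mask2)|."""
    feats1 = edge_net(mask1)
    feats2 = edge_net(mask2)
    gap = 0.0
    for lam, f1, f2 in zip(lambdas, feats1, feats2):
        gap = gap + lam * (f1 - f2).abs().mean()
    return gap
```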
  • In step S105, it is determined whether the aforementioned loss function is less than the first preset threshold; if so, step S107 is executed, otherwise step S106 is executed.
  • In step S106, the parameters of the above image segmentation network are adjusted, and then the process returns to step S102.
  • In step S107, a trained image segmentation network is obtained.
  • the parameters of the image segmentation network are continuously adjusted until the loss function is less than the first preset threshold.
  • In the embodiments of this application, the parameter adjustment method is not specifically limited; a gradient descent algorithm, a power update algorithm, or the like can be used.
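  • Putting steps S102-S107 together, a PyTorch-flavoured sketch of the training loop is shown below (reusing the EdgeNet sketch above); the optimizer, learning rate, threshold value, and the cross-entropy form of both gap terms are illustrative assumptions, and the segmentation network is assumed to output per-pixel probabilities:

```python
import torch
import torch.nn.functional as F

def train_segmentation(seg_net, edge_net, loader, first_threshold=0.05, lr=1e-4):
    """S102-S107: feed sample images through the segmentation network, pass the
    generated masks through the trained (fixed) edge network, combine the mask
    gap and the edge gap into one loss, and adjust the segmentation parameters
    until the loss falls below the first preset threshold."""
    optimizer = torch.optim.Adam(seg_net.parameters(), lr=lr)
    edge_net.eval()  # edge-network parameters are not in the optimizer and stay fixed
    loss_value = float("inf")
    while loss_value >= first_threshold:
        total, batches = 0.0, 0
        for sample_image, sample_mask, sample_edge in loader:
            generated_mask = seg_net(sample_image)            # S102
            _, _, generated_edge = edge_net(generated_mask)   # S103
            mask_gap = F.binary_cross_entropy(generated_mask, sample_mask)
            edge_gap = F.binary_cross_entropy(generated_edge, sample_edge)
            loss = mask_gap + edge_gap                        # S104
            optimizer.zero_grad()
            loss.backward()                                   # gradients flow through edge_net
            optimizer.step()                                  # S106: adjust segmentation parameters
            total += loss.item()
            batches += 1
        loss_value = total / max(batches, 1)                  # S105: compare with threshold
    return seg_net                                            # S107
```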
  • In the embodiments of this application, when the image segmentation network is trained, the sample image can be preprocessed before it is input to the image segmentation network, and the preprocessed sample image is then input to the image segmentation network.
  • the above-mentioned preprocessing may include: image cropping and/or normalization processing and so on.
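  • A minimal sketch of such preprocessing, assuming a center crop, resizing to an illustrative 384×384 input size, and normalization to [0, 1]:

```python
import cv2
import numpy as np

def preprocess(sample_image, size=(384, 384)):
    """Center-crop the sample image to a square, resize it, and normalize
    pixel values to [0, 1] before it is fed to the segmentation network."""
    h, w = sample_image.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    cropped = sample_image[top:top + side, left:left + side]
    resized = cv2.resize(cropped, size, interpolation=cv2.INTER_LINEAR)
    return resized.astype(np.float32) / 255.0
```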
  • the test set can also be used to evaluate the trained image segmentation network.
  • the method of obtaining the test set can be referred to the prior art, and will not be repeated here.
  • In the embodiments of this application, the evaluation function can be the IoU (Intersection-over-Union) of X and Y, that is, IoU = |X ∩ Y| / |X ∪ Y|, where X is the image area of the target object indicated by the generated mask output by the trained image segmentation network after a sample image is input to it, and Y is the image area of the target object indicated by the sample mask corresponding to that sample image.
  • The IoU of X and Y is used to evaluate the trained image segmentation network.
  • By evaluating the trained image segmentation network, it can be further determined whether the performance of the trained image segmentation network meets the requirements; for example, if it is determined that the performance does not meet the requirements, training of the image segmentation network is continued.
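  • A minimal NumPy sketch of the IoU evaluation, assuming X and Y are given as binary masks of the same size:

```python
import numpy as np

def iou(generated_mask, sample_mask):
    """Intersection-over-Union between the target-object region X indicated by
    the generated mask and the region Y indicated by the sample mask:
    |X intersect Y| / |X union Y|."""
    x = generated_mask.astype(bool)
    y = sample_mask.astype(bool)
    union = np.logical_or(x, y).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(x, y).sum()) / float(union)
```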
  • The training method provided in Embodiment 1 of this application, while ensuring that the generated mask output by the image segmentation network is close to the sample mask, further ensures that the contour edge of the target object represented in the generated mask output by the image segmentation network is closer to the true contour edge. Therefore, the image corresponding to the generated mask output by the image segmentation network provided by this application can represent the contour edge of the target object more accurately.
  • The training method provided in Embodiment 2 of this application additionally includes the training process of the edge neural network. Please refer to Figure 7.
  • the training method includes:
  • Sample images each containing the target object, the sample mask corresponding to each sample image, and the sample edge information corresponding to each sample mask are obtained, where each sample mask is used to indicate the image area in the corresponding sample image where the target object is located, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located.
  • For step S301, please refer to the description of step S101 in Embodiment 1, which will not be repeated here.
  • In step S302, for each sample mask, the sample mask is input to the edge neural network to obtain the edge information output by the edge neural network, where the edge information is used to indicate the contour edge of the area where the target object indicated by the sample mask is located.
  • the step S302 to the subsequent step S306 are the training process of the edge neural network to obtain the trained edge neural network.
  • steps S302-S306 are executed before the subsequent step S308, and need not be executed before the step S307.
  • an edge neural network needs to be established in advance, and the edge neural network is used to obtain the contour edge of the area where the target object indicated by the input sample mask is located.
  • the edge neural network can be formed by cascading three convolutional layers.
  • each sample mask is input to the edge neural network to obtain each edge information output by the edge neural network, wherein each sample mask corresponds to one edge information output by the edge neural network.
  • In step S303, the loss function of the edge neural network is determined; the loss function is used to measure the gap between the sample edge information corresponding to each sample mask and the edge information output by the edge neural network.
  • That is, the loss function of the edge neural network is positively correlated with the edge gap corresponding to each sample mask (the edge gap corresponding to a sample mask is the gap between the sample edge information corresponding to that sample mask and the edge information output by the edge neural network after the sample mask is input to it).
  • In the embodiments of this application, the loss function of the above edge neural network may be calculated as follows:
  • for each sample mask, the loss may be the image difference between the corresponding sample edge information and the edge information output by the edge neural network (for the calculation of the image difference, refer to step A in Embodiment 1, which will not be repeated here).
  • Alternatively, the loss function of the edge neural network may be calculated by computing, for each sample mask, the cross-entropy loss between the corresponding sample edge information and the edge information output by the edge neural network, and then taking the average.
  • The specific calculation formula is as follows:
  • LOSS_2 = −(1/N) · Σ_{j=1..N} (1/M) · Σ_{i=1..M} [ r_ji · log_x(q_ji) + (1 − r_ji) · log_x(1 − q_ji) ]
  • where LOSS_2 is the loss function of the aforementioned edge neural network, N is the total number of sample masks (those skilled in the art will readily understand that the total numbers of sample images, sample masks, and pieces of sample edge information are all the same, namely N), M is the total number of pixels in the j-th sample mask, the value of r_ji is determined according to the sample edge information corresponding to the j-th sample mask, r_ji is used to indicate whether the i-th pixel in the j-th sample mask is a contour edge, q_ji is the probability predicted by the edge neural network that the i-th pixel in the j-th sample mask is a contour edge, and x is the base of the logarithm log.
  • In the embodiments of this application, the value of r_ji is determined according to the sample edge information corresponding to the j-th sample mask. For example, if the sample edge information corresponding to the j-th sample mask indicates that the i-th pixel in the j-th sample mask is a contour edge, r_ji can be 1; if that sample edge information indicates that the i-th pixel is not a contour edge, r_ji can be 0. Those skilled in the art should understand that the value of r_ji is not limited to 1 and 0 and can also take other values; the value of r_ji is preset, for example 1 or 0.
  • The value of r_ji when the sample edge information indicates that the i-th pixel is a contour edge is greater than the value of r_ji when the sample edge information indicates that the i-th pixel is not a contour edge.
  • That is, if the sample edge information indicates that the i-th pixel is a contour edge, r_ji is 1, otherwise r_ji is 0; or, if the sample edge information indicates that the i-th pixel is a contour edge, r_ji is 2, otherwise r_ji is 1; or, if the sample edge information indicates that the i-th pixel is a contour edge, r_ji is 0.8, otherwise r_ji is 0.2.
  • In step S304, it is determined whether the loss function of the aforementioned edge neural network is less than the second preset threshold; if not, step S305 is executed, and if so, step S306 is executed.
  • In step S305, the parameters of the above edge neural network model are adjusted, and then the process returns to step S302.
  • In step S306, a trained edge neural network is obtained.
  • the parameters of the edge neural network are continuously adjusted until the loss function is less than the second preset threshold.
  • Similarly, the parameter adjustment method is not specifically limited here; a gradient descent algorithm, a power update algorithm, or the like can be used.
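  • A PyTorch-flavoured sketch of the edge-network pre-training of steps S302-S306 (reusing the EdgeNet sketch above); the optimizer, learning rate, and threshold value are illustrative assumptions, and LOSS_2 is taken in its cross-entropy form as described above:

```python
import torch
import torch.nn.functional as F

def train_edge_net(edge_net, mask_loader, second_threshold=0.05, lr=1e-3):
    """S302-S306: feed each sample mask through the edge network, compare its
    output with the sample edge information via cross-entropy (LOSS_2), and
    adjust the edge-network parameters until the loss falls below the second
    preset threshold."""
    optimizer = torch.optim.Adam(edge_net.parameters(), lr=lr)
    loss_value = float("inf")
    while loss_value >= second_threshold:
        total, batches = 0.0, 0
        for sample_mask, sample_edge in mask_loader:
            _, _, predicted_edge = edge_net(sample_mask)                 # S302
            loss = F.binary_cross_entropy(predicted_edge, sample_edge)   # S303
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                                             # S305
            total += loss.item()
            batches += 1
        loss_value = total / max(batches, 1)                             # S304
    return edge_net                                                      # S306
```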
  • step S307 for each sample image, the sample image is input to the image segmentation network, and the generated mask output by the image segmentation network for indicating the area of the target object in the sample image is obtained.
  • In step S308, for each generated mask, the generated mask is input to the trained edge neural network to obtain the generated edge information output by the edge neural network, where the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located.
  • step S309 the loss function of the above-mentioned image segmentation network is determined.
  • The loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image.
  • step S310 it is determined whether the aforementioned loss function is less than a first preset threshold, if so, step S312 is executed, otherwise, step S311 is executed.
  • In step S311, the parameters of the above image segmentation network are adjusted, and then the process returns to step S307.
  • step S312 a trained image segmentation network is obtained.
  • The training process of the edge neural network shown in Figure 8(a) is as follows. First, the sample masks are input into the edge neural network to obtain the edge information output by the edge neural network. Secondly, the cross-entropy loss is calculated from the edge information output by the edge neural network and each piece of sample edge information; the sample edge information is obtained from the sample mask by the dilation operation and the subtraction operation (for details, refer to the description of Embodiment 1, which will not be repeated here). Then, the cross-entropy losses are averaged to obtain the loss function. Finally, the parameters of the edge neural network are adjusted continuously until the loss function is less than the second preset threshold, so as to obtain the trained edge neural network.
  • Compared with Embodiment 1, the training method described in Embodiment 2 of the present application adds a training process for the edge neural network, which keeps the samples used for training the edge neural network consistent with the samples used for training the image segmentation network. Therefore, the accuracy of the edges of the masks output by the image segmentation network can be better measured from the output of the edge neural network, so that the image segmentation network can be trained better.
  • the third embodiment of the present application provides an image processing method. Please refer to FIG. 9.
  • the image processing method includes:
  • In step S401, an image to be processed is obtained, and the image to be processed is input to the trained image segmentation network to obtain a mask corresponding to the image to be processed, wherein the trained image segmentation network is obtained by training with a trained edge neural network, and the trained edge neural network is used to output, according to an input mask, the contour edge of the area where the target object indicated by that mask is located.
  • the trained edge neural network described in this step S401 is a neural network obtained by training using the method described in the first or second embodiment above.
  • step S402 the target objects contained in the image to be processed are segmented based on the mask corresponding to the image to be processed.
  • step S402 a specific operation of changing the background can also be performed. This operation is in the prior art and will not be repeated here.
  • The method described in Embodiment 3 can be applied in a terminal device (such as a mobile phone).
  • This method makes it convenient for the user to replace the background in the image to be processed.
  • Because the target object can be segmented accurately, the background can be replaced more accurately, which can improve the user experience to a certain extent.
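  • As an illustration of steps S401-S402 together with the optional background replacement, the following OpenCV/NumPy sketch composites the segmented target object onto a new background using the mask output by the trained image segmentation network; the slight feathering of the mask is an added assumption for a smoother blend, not a step required by this application:

```python
import cv2
import numpy as np

def replace_background(image, mask, new_background):
    """Segment the target object with the mask produced by the trained image
    segmentation network (S402) and paste it onto a new background."""
    background = cv2.resize(new_background, (image.shape[1], image.shape[0]))
    alpha = mask.astype(np.float32)
    if alpha.max() > 1.0:            # accept either a 0/1 or a 0/255 binary mask
        alpha /= 255.0
    # Feather the mask slightly so the contour edge blends smoothly (assumed step).
    alpha = cv2.GaussianBlur(alpha, (5, 5), 0)[..., None]    # H x W x 1
    composite = alpha * image.astype(np.float32) + (1.0 - alpha) * background.astype(np.float32)
    return composite.astype(np.uint8)
```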
  • the fourth embodiment of the present application provides a training device for an image segmentation network. For ease of description, only the parts related to the present application are shown. As shown in FIG. 10, the training device 500 includes:
  • The sample acquisition module 501 is used to acquire sample images each containing the target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, where each sample mask is used to indicate the image area in the corresponding sample image where the target object is located, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located.
  • the generation mask acquisition module 502 is configured to input the sample image to the image segmentation network for each sample image, and obtain the generation mask output by the image segmentation network for indicating the area of the target object in the sample image.
  • The generated edge acquisition module 503 is used to input, for each generated mask, the generated mask to the trained edge neural network to obtain the generated edge information output by the edge neural network, where the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located.
  • the loss determination module 504 is used to determine the loss function of the image segmentation network.
  • The loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image.
  • The parameter adjustment module 505 is used to adjust the parameters of the image segmentation network and then trigger the generation mask acquisition module to continue performing the corresponding steps until the loss function of the image segmentation network is less than the first preset threshold, thereby obtaining the trained image segmentation network.
  • The aforementioned loss determination module 504 is specifically configured to calculate the loss function according to formulas (1) and (2) above, where LOSS_1 is the loss function of the image segmentation network, N is the total number of sample images, F1_j is used to measure the gap between the sample mask corresponding to the j-th sample image and the generated mask, and F2_j is used to measure the gap between the sample edge information corresponding to the j-th sample image and the generated edge information.
  • M is the total number of pixels in the j-th sample image, the value of y_ji is determined according to the sample mask corresponding to the j-th sample image, y_ji is used to indicate whether the i-th pixel in the j-th sample image lies in the image area where the target object is located, p_ji is the probability predicted by the image segmentation network that the i-th pixel in the j-th sample image lies in the image area where the target object is located, and x is the base of the logarithm log.
  • The value of y_ji when the sample mask indicates that the i-th pixel lies in the image area where the target object is located is greater than the value of y_ji when the sample mask indicates that the i-th pixel is not located in that image area.
  • the above-mentioned trained edge neural network is formed by cascading A convolutional blocks, and each convolutional block is composed of B convolutional layers.
  • In this case, F2_j can be calculated by formula (3) above, where mask_1 is the generated mask corresponding to the j-th sample image, mask_2 is the sample mask corresponding to the j-th sample image, h_c(mask_1) is the output of the c-th convolutional block when the input of the trained edge neural network is mask_1, h_c(mask_2) is the output of the c-th convolutional block when the input of the trained edge neural network is mask_2, and λ_c is a constant.
  • the above-mentioned training device further includes an edge neural network training module, and the edge neural network training module includes:
  • the edge information acquisition unit is used to input the sample mask to the edge neural network for each sample mask to obtain the edge information output by the edge neural network, and the edge information is used to indicate the target object indicated by the sample mask The contour edge of the area.
  • the edge loss determining unit is used to determine the loss function of the edge neural network, and the loss function is used to measure the difference between the sample edge information corresponding to each sample mask and the edge information output by the edge neural network.
  • The edge parameter adjustment unit is used to adjust the parameters of the edge neural network and then trigger the edge information acquisition unit to continue performing the corresponding steps until the loss function value of the edge neural network is less than the second preset threshold, thereby obtaining a trained edge neural network.
  • The aforementioned edge loss determining unit is specifically configured to calculate the loss function according to the LOSS_2 formula given above, where LOSS_2 is the loss function of the edge neural network, N is the total number of sample images, M is the total number of pixels in the j-th sample mask, the value of r_ji is determined according to the sample edge information corresponding to the j-th sample mask, r_ji is used to indicate whether the i-th pixel in the j-th sample mask is a contour edge, q_ji is the probability predicted by the edge neural network that the i-th pixel in the j-th sample mask is a contour edge, and x is the base of the logarithm log.
  • The value of r_ji when the sample edge information indicates that the i-th pixel is a contour edge is greater than the value of r_ji when the sample edge information indicates that the i-th pixel is not a contour edge.
  • the image processing apparatus 600 includes:
  • The mask acquisition module 601 is used to acquire the image to be processed and input the image to be processed into the trained image segmentation network to obtain the mask corresponding to the image to be processed, wherein the trained image segmentation network is obtained by training with a trained edge neural network, and the trained edge neural network is used to output, according to an input mask, the contour edge of the area where the target object indicated by that mask is located (specifically, the trained image segmentation network is obtained through training using the training method described in Embodiment 1 or Embodiment 2).
  • the target object segmentation module 602 is configured to segment the target object contained in the image to be processed based on the mask corresponding to the image to be processed.
  • FIG. 12 is a schematic diagram of a terminal device provided in Embodiment 6 of the present application.
  • the terminal device 700 of this embodiment includes a processor 701, a memory 702, and a computer program 703 that is stored in the memory 702 and can run on the processor 701.
  • the above-mentioned processor 701 implements the steps in the above-mentioned method embodiments when the above-mentioned computer program 703 is executed.
  • the processor 701 executes the computer program 703, the function of each module/unit in the foregoing device embodiments is realized.
  • the foregoing computer program 703 may be divided into one or more modules/units, and the foregoing one or more modules/units are stored in the foregoing memory 702 and executed by the foregoing processor 701 to complete the present application.
  • the foregoing one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the foregoing computer program 703 in the foregoing terminal device 700.
  • the aforementioned computer program 703 can be divided into a sample acquisition module, a mask generation module, an edge generation module, a loss determination module, and a parameter adjustment module.
  • the specific functions of each module are as follows:
  • S101: Obtain sample images each containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, where each sample mask is used to indicate the image area in the corresponding sample image where the target object is located, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located.
  • S102: For each sample image, input the sample image into an image segmentation network to obtain a generated mask output by the image segmentation network for indicating the region where the target object in the sample image is located.
  • S103: For each generated mask, input the generated mask into the trained edge neural network to obtain generated edge information output by the edge neural network, where the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located.
  • S104: Determine the loss function of the image segmentation network, where the loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image.
  • S105: Adjust the parameters of the image segmentation network and then return to S102 until the loss function of the image segmentation network is less than the first preset threshold, thereby obtaining a trained image segmentation network.
  • the aforementioned computer program 703 can be divided into a mask acquisition module and a target object segmentation module, and the specific functions of each module are as follows:
  • Obtain the image to be processed and input it into the trained image segmentation network to obtain the mask corresponding to the image to be processed; then, based on the mask corresponding to the image to be processed, segment out the target object contained in the image to be processed.
  • the foregoing terminal device may include, but is not limited to, a processor 701 and a memory 702.
  • FIG. 12 is only an example of the terminal device 700 and does not constitute a limitation on the terminal device 700; the terminal device may include more or fewer components than those shown in the figure, combine certain components, or have different components.
  • the aforementioned terminal device may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 701 may be a central processing unit (Central Processing Unit, CPU), other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the foregoing memory 702 may be an internal storage unit of the foregoing terminal device 700, such as a hard disk or a memory of the terminal device 700.
  • The memory 702 may also be an external storage device of the terminal device 700, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 700.
  • the aforementioned memory 702 may also include both an internal storage unit of the aforementioned terminal device 700 and an external storage device.
  • the above-mentioned memory 702 is used to store the above-mentioned computer program and other programs and data required by the above-mentioned terminal device.
  • the aforementioned memory 702 can also be used to temporarily store data that has been output or will be output.
  • the disclosed device/terminal device and method may be implemented in other ways.
  • the device/terminal device embodiments described above are only illustrative.
  • The division of the above modules or units is only a logical function division; in actual implementation there may be other division methods, for example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the above integrated modules/units are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the foregoing method embodiments of this application can also be completed by instructing relevant hardware through a computer program.
  • the foregoing computer program may be stored in a computer-readable storage medium. When the program is executed by the processor, it can implement the steps of the foregoing method embodiments.
  • the above-mentioned computer program includes computer program code, and the above-mentioned computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • The above computer-readable medium may include: any entity or device capable of carrying the above computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunications signal, a software distribution medium, and the like.
  • It should be noted that the content contained in the above computer-readable medium can be appropriately added or deleted in accordance with the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electric carrier signals and telecommunications signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

This application provides a network training method, an image processing method, a network, a terminal device, and a medium. The training method includes: S1, obtaining a sample image containing a target object, a sample mask corresponding to the sample image, and sample edge information corresponding to the sample mask; S2, inputting the sample image into an image segmentation network to obtain a generated mask output by the image segmentation network; S3, inputting the generated mask into a trained edge neural network to obtain generated edge information output by the edge neural network; S4, determining a loss function according to the gap between the sample mask and the generated mask and the gap between the generated edge information and the sample edge information; S5, adjusting the parameters of the image segmentation network and returning to S2 until the loss function is less than a threshold. This application enables the mask image output by the image segmentation network to represent the contour edge of the target object more accurately.

Description

Network training method, image processing method, network, terminal device and medium
Technical Field
This application relates to the field of image processing technology, and in particular to an image segmentation network training method, an image processing method, an image segmentation network, a terminal device, and a computer-readable storage medium.
Background
After taking an image, users often wish to change the background in the image (for example, replacing the background with an outdoor beach scene, or replacing it with a solid-color background for an ID photo). To achieve this effect, the commonly used approach at present is to use a trained image segmentation network to output a mask representing the area where the target object (that is, the foreground, such as a portrait) is located, then use the mask to segment the target object out of the image, and finally change the image background.
However, the mask output by current image segmentation networks cannot accurately represent the contour edge of the target object, so the target object cannot be segmented precisely and the result of replacing the image background is poor. Therefore, how to make the mask output by an image segmentation network represent the contour edge of the target object more accurately is a technical problem that urgently needs to be solved.
Summary of the Application
The purpose of the embodiments of this application is to provide an image segmentation network training method, an image processing method, an image segmentation network, a terminal device, and a computer-readable storage medium, which can, to a certain extent, enable the mask output by the trained image segmentation network to represent the contour edge of the target object more accurately.
The technical solutions adopted in the embodiments of this application are as follows:
In a first aspect, an image segmentation network training method is provided, including steps S101-S105:
S101: Obtain sample images each containing a target object, a sample mask corresponding to each sample image, and sample edge information corresponding to each sample mask, where each sample mask is used to indicate the image area in the corresponding sample image where the target object is located, and each piece of sample edge information is used to indicate the contour edge of the image area where the target object indicated by the corresponding sample mask is located;
S102: For each sample image, input the sample image into the image segmentation network to obtain a generated mask output by the image segmentation network for indicating the area where the target object in the sample image is located;
S103: For each generated mask, input the generated mask into the trained edge neural network to obtain generated edge information output by the edge neural network, where the generated edge information is used to indicate the contour edge of the area where the target object indicated by the generated mask is located;
S104: Determine the loss function of the image segmentation network, where the loss function is used to measure the gap between the sample mask and the generated mask corresponding to each sample image, and is also used to measure the gap between the generated edge information and the sample edge information corresponding to each sample image;
S105: Adjust the parameters of the image segmentation network and then return to S102 until the loss function of the image segmentation network is less than a first preset threshold, thereby obtaining a trained image segmentation network.
In a second aspect, an image processing method is provided, including:
obtaining an image to be processed, and inputting the image to be processed into a trained image segmentation network to obtain a mask corresponding to the image to be processed, wherein the trained image segmentation network is obtained by training with a trained edge neural network, and the trained edge neural network is used to output, according to an input mask, the contour edge of the area where the target object indicated by that mask is located;
based on the mask corresponding to the image to be processed, segmenting out the target object contained in the image to be processed.
In a third aspect, an image segmentation network is provided, the image segmentation network being obtained by training with the training method described in the first aspect.
In a fourth aspect, a terminal device is provided, including a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the computer program, the steps of the method described in the first aspect or the second aspect are implemented.
In a fifth aspect, a computer-readable storage medium is provided, the computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps of the method described in the first aspect or the second aspect are implemented.
In a sixth aspect, a computer program product is provided, the computer program product including a computer program; when the computer program is executed by one or more processors, the steps of the method described in the first aspect or the second aspect are implemented.
As can be seen from the above, in the training method provided by this application, when the image segmentation network is trained, a trained edge neural network is used to train the image segmentation network.
The trained edge neural network is first described with reference to Figure 1. As shown in Figure 1, the trained edge neural network 001 outputs generated edge information 003 according to the image area (the pure white area) where the target object is located, as indicated by the generated mask 002 input to the edge neural network 001; the edge information is used to indicate the location of the contour edge of that image area, and the generated edge information 003 in Figure 1 is presented in the form of an image.
The training method provided by this application includes the following steps. First, for each sample image, the sample image is input to the image segmentation network to obtain the generated mask output by the image segmentation network, and the generated mask is input to the trained edge neural network to obtain the generated edge information output by the edge neural network. Secondly, the loss function of the image segmentation network is determined; the loss function is positively correlated with the mask gap corresponding to each sample image (the mask gap corresponding to a sample image is the gap between the sample mask corresponding to that sample image and the generated mask), and the loss function is also positively correlated with the edge gap corresponding to each sample image (the edge gap corresponding to a sample image is the gap between the sample edge information corresponding to that sample image and the generated edge information). Finally, the parameters of the image segmentation network are adjusted until the loss function is less than the first preset threshold.
It can thus be seen that, while ensuring that the generated mask output by the image segmentation network approaches the sample mask, the above training method further ensures that the contour edge of the target object represented in the generated mask output by the image segmentation network approaches the actual contour edge. Therefore, the mask image output by the image segmentation network provided by this application can represent the contour edge of the target object more accurately.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of this application more clearly, the drawings needed in the description of the embodiments or the exemplary technology are briefly introduced below; obviously, the drawings described below are only some embodiments of this application.
Figure 1 is a schematic diagram of the working principle of a trained edge neural network provided by this application;
Figure 2 is a schematic diagram of an image segmentation network training method provided by Embodiment 1 of this application;
Figure 3 is a schematic diagram of a sample image, a sample mask, and sample edge information provided by Embodiment 1 of this application;
Figure 4 is a schematic structural diagram of an image segmentation network provided by Embodiment 1 of this application;
Figure 5 is a schematic diagram of the connection between the image segmentation network and the trained edge neural network provided by Embodiment 1 of this application;
Figure 6 is a schematic structural diagram of the edge neural network provided by Embodiment 1 of this application;
Figure 7 is a schematic diagram of another image segmentation network training method provided by Embodiment 2 of this application;
Figure 8(a) is a schematic diagram of the training process of the edge neural network provided by Embodiment 2 of this application;
Figure 8(b) is a schematic diagram of the training process of the image segmentation network provided by Embodiment 2 of this application;
Figure 9 is a schematic diagram of the workflow of the image processing method provided by Embodiment 3 of this application;
Figure 10 is a schematic structural diagram of an image segmentation network training device provided by Embodiment 4 of this application;
Figure 11 is a schematic structural diagram of an image processing device provided by Embodiment 5 of this application;
Figure 12 is a schematic structural diagram of a terminal device provided by Embodiment 6 of this application.
具体实施方式
以下描述中,为了说明而不是为了限定,提出了诸如特定***结构、技术之类的具体细节,以便透彻理解本申请实施例。然而,本领域的技术人员应当清楚,在没有这些具体细节的其它实施例中也可以实现本申请。在其它情况中,省略对众所周知的***、装置、电路以及方法的详细说明,以免不必要的细节妨碍本申请的描述。
本申请实施例提供的方法可以适用于终端设备,示例性地,该终端设备包括但不限于:智能手机、平板电脑、笔记本、桌上型计算机、云端服务器等。
应当理解,当在本说明书和所附权利要求书中使用时,术语“包括”指示所描述特征、整体、步骤、操作、元素和/或组件的存在,但并不排除一个或多个其它特征、整体、步骤、操作、元素、组件和/或其集合的存在或添加。
还应当进一步理解,在本申请说明书和所附权利要求书中使用的术语“和/或”是指相关联列出的项中的一个或多个的任何组合以及所有可能组合,并且包括这些组合。
如在本说明书和所附权利要求书中所使用的那样,术语“如果”可以依据上下文被解释为“当...时”或“一旦”或“响应于确定”或“响应于检测到”。类似地,短语“如果确定”或“如果检测到[所描述条件或事件]”可以依据上下文被解释为意指“一旦确定”或“响应于确定”或“一旦检测到[所描述条件或事件]”或“响应于检测到[所描述条件或事件]”。
另外,在本申请的描述中,术语“第一”、“第二”等仅用于区分描述,而不能理解为指示或暗示相对重要性。
为了说明本申请所提供的技术方案,以下结合具体附图及实施例进行详细说明。
实施例一
下面对本申请实施例一提供的图像分割网络的训练方法进行描述,请参阅附图2,该训练方法包括:
在步骤S101中,获取包含目标对象的各个样本图像、每个样本图像分别对应的样本掩膜以及每个样本掩膜分别对应的样本边缘信息,其中,每个样本掩膜均用于指示对应样本图像中目标对象所位于的图像区域,每个样本边缘信息均用于指示对应样本掩膜所指示的目标对象所在图像区域的轮廓边缘。
在本申请实施例中,可以先从数据集中获取一部分样本图像,然后可以通过如下方式扩充用于训练图像分割网络的样本图像数量:对预先获取的样本图像进行镜面反转、尺度缩放和/或Gamma变化等,以此增大样本图像数量,从而获取该步骤S101所述的各个样本图像。
本申请所述的样本掩膜为二值图像。该步骤S101所述的某一样本掩膜对应的样本边缘信息的获取方式可以为:对该样本掩膜进行膨胀运算,得到膨胀运算后的掩膜图像,将膨胀运算后的掩膜图像与该样本掩膜做相减运算,即可得到该样本掩膜对应的样本边缘信息。采用这种方式所获取的样本边缘信息与样本掩膜相同,为二值图像。
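作为示意,下面给出按上述“先膨胀运算、再与原掩膜相减”的方式由样本掩膜获取样本边缘信息的一段示例代码(基于OpenCV,膨胀核大小 kernel_size 为本文假设的参数,并非本申请限定的取值):

```python
import cv2
import numpy as np

def edge_from_mask(sample_mask: np.ndarray, kernel_size: int = 5) -> np.ndarray:
    """sample_mask 为取值0/255的二值样本掩膜,返回同尺寸的二值样本边缘信息。"""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_size, kernel_size))
    dilated = cv2.dilate(sample_mask, kernel)   # 对样本掩膜进行膨胀运算
    return cv2.subtract(dilated, sample_mask)   # 膨胀结果与原掩膜相减,得到轮廓边缘
```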
为便于本领域技术人员对样本图像、样本掩膜以及样本边缘信息有更为直观的认识,下面利用附图3来进行说明。如图3所示,图像201是包含目标对象(即人像)的样本图像,图像202可以是该样本图像201对应的样本掩膜,图像203可以为该样本掩膜202对应的样本边缘信息。此外,本领域技术人员应该可以理解,上述样本边缘信息并非一定为二值图像,也可以是其他信息表现形式,只要能够体现“样本掩膜所指示的目标对象所在图像区域的轮廓边缘”即可。
此外,本领域技术人员应该能够理解,上述目标对象可以为人像、狗、猫等一切拍摄主体,本申请并不对目标对象的类别进行限定。
另外,为了更好地训练图像分割网络,各个样本图像所包含的图像内容要尽可能地不同,比如,若上述目标对象为人像,则样本图像1所包含的图像内容可以为小明正面人像,样本图像2所包含的图像内容可以为小红半侧面人像。
在步骤S102中,对于每个样本图像,将该样本图像输入至图像分割网络,得到该图像分割网络输出的用于指示该样本图像中目标对象所在区域的生成掩膜。
在本申请实施例中,执行该步骤S102之前,需要事先建立一图像分割网络,该图像分割网络用于根据输入的图像,输出该图像所对应的掩膜(也即生成掩膜)。该图像分割网络可以为CNN(Convolutional Neural Networks,卷积神经网络),也可以为FPN(Feature Pyramid Networks,特征金字塔网络),本申请并不对图像分割网络的具体网络结构进行限定。采用FPN结构的图像分割网络可具体参见附图4。
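为便于理解图像分割网络“输入图像、输出掩膜”的形式,下面给出一个极简的卷积式分割网络示例,仅为假设性示意,并非图4所示的FPN结构,层数与通道数均为本文假设:

```python
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """极简编码-解码结构,输入图像,输出单通道掩膜概率图。"""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, image):
        # image: N x 3 x H x W,返回 N x 1 x H x W 的生成掩膜(概率图)
        return torch.sigmoid(self.decoder(self.encoder(image)))
```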
在建立好上述图像分割网络之后,开始执行该步骤S102,以对该图像分割网络进行训练。
在训练过程中,需要将每一个样本图像均输入至该图像分割网络,得到该图像分割网络输出的各个生成掩膜,其中,每一个生成掩膜对应一个样本图像。另外,本领域技术人员容易理解,该步骤所述的“生成掩膜”与步骤S101所述的样本掩膜相同,可以为二值图像。
在步骤S103中,对于每个生成掩膜,将该生成掩膜输入至训练后的边缘神经网络,得到该边缘神经网络输出的生成边缘信息,该生成边缘信息用于指示该生成掩膜所指示的目标对象所在区域的轮廓边缘。
在执行该步骤S103之前,需要获取训练后的边缘神经网络,该训练后的边缘神经网络用于根据输入的生成掩膜,输出生成边缘信息,该生成边缘信息用于指示输入的该生成掩膜所指示的目标对象所在区域的轮廓边缘。在本申请实施例中,训练后的边缘神经网络可以如图1所示。在图1中,将002所示的生成掩膜输入001所示的训练后的边缘神经网络后,该001所示的训练后的边缘神经网络将输出003所示的生成边缘信息。
在获取到训练后的边缘神经网络之后,将步骤S102所述的各个生成掩膜分别输入至该训练后的边缘神经网络,得到该训练后的边缘神经网络输出的各个生成边缘信息,其中,每个生成边缘信息对应一个生成掩膜,用于表示该生成掩膜所指示的目标对象所在图像区域的轮廓边缘。
在本申请实施例中,在训练图像分割网络的过程中,图像分割网络与训练后的边缘神经网络的连接方式如图5所示。
在步骤S104中,确定上述图像分割网络的损失函数,该损失函数用于衡量每个样本图像分别对应的样本掩膜与生成掩膜的差距,并且,该损失函数还用于衡量每个样本图像分别对应的生成边缘信息与样本边缘信息的差距。
本领域技术人员容易理解,每个样本图像均对应有一个样本掩膜、样本边缘信息、生成掩膜以及生成边缘信息。为了得到该步骤S104所述的损失函数,对于每个样本图像来说,需要计算该样本图像对应的样本掩膜与生成掩膜的差距(为便于后续描述,定义对于某个样本图像来说,该样本图像对应的样本掩膜与生成掩膜的差距为该样本图像对应的掩膜差距),还需要计算该样本图像对应的样本边缘信息与生成边缘信息的差距(为便于后续描述,定义对于某个样本图像来说,该样本图像对应的样本边缘信息与生成边缘信息的差距为该样本图像对应的边缘差距)。
在步骤S104中,需要计算上述图像分割网络的损失函数,该损失函数用于衡量每个样本图像分别对应的样本掩膜与生成掩膜的差距,并且,该损失函数还用于衡量每个样本图像分别对应的生成边缘信息与样本边缘信息的差距,也即是:该损失函数与每个样本图像对应的掩膜差距均正相关,并且,该损失函数与每个样本图像对应的边缘差距也均正相关。
在本申请实施例中,上述损失函数的计算过程可以为:
步骤A、对于每个样本图像,计算该样本图像对应的生成掩膜与该样本图像对应的样本掩膜的图像差(即可以为 $\frac{1}{M}\sum_{i=1}^{M}\left|m1_i-m2_i\right|$,其中,m1_i为生成掩膜第i个像素点的像素值,m2_i为样本掩膜第i个像素点的像素值,M为生成掩膜的像素点总个数)。
步骤B、若上述样本边缘信息以及生成边缘信息均为图像,则对于每个样本图像,计算该样本图像对应的样本边缘信息与该样本图像对应的生成边缘信息的图像差(该图像差的计算可参考步骤A所述)。
步骤C、可以将上述步骤A得到的各个图像差以及步骤B得到的各个图像差进行平均(若样本图像个数为N,则将步骤A求得的各个图像差与步骤B得到的各个图像差求和,然后除以2N),即可得到损失函数。
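下面以代码示意上述步骤A-步骤C的计算过程(这里假设样本边缘信息与生成边缘信息均为图像,且图像差按像素差绝对值的均值计算,该具体取法为本文假设):

```python
import numpy as np

def image_diff(a: np.ndarray, b: np.ndarray) -> float:
    """两幅同尺寸图像的图像差:像素差绝对值的均值。"""
    return float(np.mean(np.abs(a.astype(np.float32) - b.astype(np.float32))))

def loss_steps_abc(gen_masks, sample_masks, gen_edges, sample_edges) -> float:
    """步骤A:掩膜图像差;步骤B:边缘图像差;步骤C:求和后除以2N得到损失函数。"""
    diffs_a = [image_diff(g, s) for g, s in zip(gen_masks, sample_masks)]
    diffs_b = [image_diff(g, s) for g, s in zip(gen_edges, sample_edges)]
    n = len(gen_masks)
    return (sum(diffs_a) + sum(diffs_b)) / (2 * n)
```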
然而,上述损失函数的计算方式并不局限于上述步骤A-步骤C,在本申请实施例中,上述损失函数也可以通过如下公式(1)计算得到:
$LOSS_1=\frac{1}{N}\sum_{j=1}^{N}F_j$  (1)
其中,LOSS 1为上述图像分割网络的损失函数,N为样本图像的总个数,F1 j用于衡量第j个样本图像对应的样本掩膜与生成掩膜的差距,F2 j用于衡量第j个样本图像对应的样本边缘信息与生成边缘信息的差距,
每个样本图像对应的损失项为$F_j=F1_j+F2_j$。
在本申请实施例中,上述F1 j的计算方法可以为:计算第j个样本图像所对应的样本掩膜与生成掩膜的交叉熵损失,具体公式如下:
$F1_j=-\sum_{i=1}^{M}\left[y_{ji}\log_x(p_{ji})+(1-y_{ji})\log_x(1-p_{ji})\right]$  (2)
其中,M为第j个样本图像中像素点的总个数,y ji的数值是根据第j个样本图像对应的样本掩膜确定的,y ji用于指示第j个样本图像中第i个像素点是否在目标对象所位于的图像区域,p ji为所述图像分割网络预测的第j个样本图像中第i个像素点在目标对象所位于的图像区域的概率,x为对数log的底值。
在本申请实施例中,y ji的数值是根据第j个样本图像对应的样本掩膜确定的,比如:若第j个样本图像对应的样本掩膜中,指示该第j个样本图像中第i个像素点位于目标对象所在的图像区域,则y ji可以为1,若第j个样本图像对应的样本掩膜中,指示该第j个样本图像中第i个像素点没有位于目标对象所在的图像区域,则y ji可以为0。本领域技术人员应该能够理解,y ji的取值并不局限于1和0,也可以为其他数值。y ji的取值是预先设定的,比如为1或者0。
本领域技术人员应该注意,当样本掩膜指示第i个像素点位于目标对象所在的图像区域时,y ji的值要大于当样本掩膜指示第i个像素点未位于目标对象所在的图像区域时y ji的值。也即是,若样本掩膜指示第i个像素点位于目标对象所在的图像区域时,y ji为1,否则,y ji为0。或者,若样本掩膜指示第i个像素点位于目标对象所在的图像区域时,y ji为2,否则,y ji为1。或者,若样本掩膜指示第i个像素点位于目标对象所在的图像区域时,y ji为0.8,否则,y ji为0.2。
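下面给出按公式(2)计算单个样本图像的F1 j的示例代码(其中对数底x取自然常数e、y ji取1/0,均为本文假设):

```python
import numpy as np

def f1_cross_entropy(p: np.ndarray, y: np.ndarray, eps: float = 1e-7) -> float:
    """p为图像分割网络预测的各像素位于目标对象区域的概率,
    y为由样本掩膜确定的标签(此处假设取1/0),返回该样本图像的F1_j。"""
    p = np.clip(p, eps, 1.0 - eps)                          # 避免log(0)
    ce = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))     # 逐像素交叉熵
    return float(np.mean(ce))                               # 在M个像素点上取平均
```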
在本申请实施例中,F2 j的计算方式可以与上述公式(2)类似,即可以为:计算第j个样本图像所对应的样本边缘信息与生成边缘信息的交叉熵损失。
或者,若上述训练后的边缘神经网络由A个卷积块级联而成,每个卷积块均由B个卷积层构成,则相应地,F2 j的计算公式可以如下:
$F2_j=\sum_{c=1}^{A}\lambda_c\left\|h_c(mask_1)-h_c(mask_2)\right\|$  (3)
其中,mask 1为第j个样本图像对应的生成掩膜,mask 2为第j个样本图像对应的样本掩膜,h c(mask 1)为训练后的边缘神经网络输入为mask 1时,第c个卷积块的输出,h c(mask 2)为训练后的边缘神经网络输入为mask 2时,第c个卷积块的输出,λ c为常数。
上述F2 j的计算公式中,当训练后的边缘神经网络输入为mask 2时,最后一个卷积块的输出可以认为等同于样本边缘信息,因此,可以通过上述公式(3)来衡量样本边缘信息与生成边缘信息的差距。
如图6所示,边缘神经网络可以由3个卷积块级联而成,每个卷积块均为一个卷积层。
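结合图6所示的结构,下面给出一个由3个卷积块(每块1个卷积层)级联的边缘神经网络示例,并按公式(3)的思路示意利用各卷积块输出差异计算F2 j,其中通道数、范数形式与λ c取值均为本文假设:

```python
import torch
import torch.nn as nn

class EdgeNet(nn.Module):
    """3个卷积块级联的边缘神经网络,forward返回各卷积块的输出h_c。"""
    def __init__(self):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU()),
            nn.Sequential(nn.Conv2d(8, 8, 3, padding=1), nn.ReLU()),
            nn.Sequential(nn.Conv2d(8, 1, 3, padding=1)),
        ])

    def forward(self, mask):
        feats, x = [], mask
        for block in self.blocks:
            x = block(x)
            feats.append(x)        # h_c(mask):第c个卷积块的输出
        return feats               # feats[-1]可视为输出的边缘信息

def f2_feature_loss(edge_net, gen_mask, sample_mask, lambdas=(1.0, 1.0, 1.0)):
    """按各卷积块输出之差的加权和示意F2_j的计算(此处用L1范数)。"""
    feats_gen = edge_net(gen_mask)
    with torch.no_grad():
        feats_ref = edge_net(sample_mask)     # 训练后的边缘神经网络,参数固定
    return sum(l * torch.mean(torch.abs(a - b))
               for l, a, b in zip(lambdas, feats_gen, feats_ref))
```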
在步骤S105中,判断上述损失函数是否小于第一预设阈值,若是,则执行步骤S107,否则,执行步骤S106。
在步骤S106中,调整上述图像分割网络的各个参数,然后返回执行步骤S102。
在步骤S107中,得到训练后的图像分割网络。
也即是,不断调整图像分割网络的各个参数,直至损失函数小于第一预设阈值。此外,在本申请实施例中,并不对参数调整方式进行具体限定,可以采用梯度下降算法、动力更新算法等等,此处对调整参数所使用的方法不作限定。
此外,在本申请实施例中,在对图像分割网络进行训练的时候,当将样本图像输入至图像分割网络之前,可以先将样本图像进行预处理,然后将预处理后的样本图像输入至图像分割网络。其中,上述预处理可以包括:图像裁剪和/或归一化处理等。
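下面是对样本图像进行图像裁剪与归一化预处理的一个简单示例(裁剪尺寸等均为假设值):

```python
import numpy as np

def preprocess(image: np.ndarray, crop_size: int = 256) -> np.ndarray:
    """对样本图像做中心裁剪与归一化,返回取值在[0, 1]的浮点图像。"""
    h, w = image.shape[:2]
    top, left = max((h - crop_size) // 2, 0), max((w - crop_size) // 2, 0)
    cropped = image[top:top + crop_size, left:left + crop_size]
    return cropped.astype(np.float32) / 255.0
```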
在上述步骤S107之后,还可以用测试集对训练后的图像分割网络进行评价。其中,测试集的获取方式可以参见现有技术,此处不再赘述。
对于测试集中的单个样本图像来说,评价函数可以为:
$IoU=\frac{|X\cap Y|}{|X\cup Y|}$
其中,X是将该样本图像输入至训练后的图像分割网络后,该图像分割网络输出的生成掩膜所指示的目标对象的图像区域。
Y是该样本图像对应的样本掩膜所指示的目标对象的图像区域。
通过X与Y的IoU(Intersection-over-Union,交并比),来评价该训练后的图像分割网络,IoU的值越接近1,说明该训练后的图像分割网络性能越好。通过对训练后的图像分割网络进行评价,能够进一步评估得到的训练后的图像分割网络的性能是否符合要求。例如,若判断出训练后的图像分割网络的性能不符合要求,则继续对训练后的图像分割网络进行训练。
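按上述评价函数,可用如下代码对单个样本计算训练后的图像分割网络的IoU(这里假设掩膜为0/1的二值数组):

```python
import numpy as np

def iou(pred_mask: np.ndarray, sample_mask: np.ndarray) -> float:
    """X为生成掩膜指示的目标区域,Y为样本掩膜指示的目标区域,返回二者的交并比。"""
    x, y = pred_mask.astype(bool), sample_mask.astype(bool)
    union = np.logical_or(x, y).sum()
    if union == 0:
        return 1.0                 # 两者均为空时视为完全一致
    return float(np.logical_and(x, y).sum() / union)
```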
本申请实施例一所提供的训练方法,在保证图像分割网络输出的生成掩膜逼近样本掩膜的同时,会进一步保证图像分割网络输出的生成掩膜中所表示的目标对象的轮廓边缘与真实的轮廓边缘更为逼近,因此,本申请所提供的图像分割网络所输出的生成掩膜所对应的图像能够更加精确地表示目标对象的轮廓边缘。
实施例二
下面对本申请实施例二提供的另一种图像分割网络的训练方法进行描述,该训练方法相比于实施例一所述的训练方法,包含了对边缘神经网络的训练过程。请参阅附图7,该训练方法包括:
在步骤S301中,获取包含目标对象的各个样本图像、每个样本图像分别对应的样本掩膜以及每个样本掩膜分别对应的样本边缘信息,其中,每个样本掩膜均用于指示对应样本图像中目标对象所位于的图像区域,每个样本边缘信息均用于指示对应样本掩膜所指示的目标对象所在图像区域的轮廓边缘。
该步骤S301的具体实施过程具体可参见实施例一中步骤S101部分,此处不再赘述。
在步骤S302中,对于每个样本掩膜,将该样本掩膜输入至边缘神经网络,得到该边缘神经网络输出的边缘信息,该边缘信息用于指示该样本掩膜所指示的目标对象所在区域的轮廓边缘。
在本申请实施例中,该步骤S302至后续的步骤S306为对边缘神经网络的训练过程,以得到训练后的边缘神经网络。本领域技术人员应该能够理解,该步骤S302-S306执行在后续步骤S308之前,并不必须执行在步骤S307之前。
在执行该步骤S302之前,需要事先建立边缘神经网络,该边缘神经网络用于获取输入的样本掩膜所指示的目标对象所在区域的轮廓边缘。如附图6所示,该边缘神经网络可以由3个卷积层级联而成。
在该步骤S302中,将各个样本掩膜输入至边缘神经网络,得到该边缘神经网络输出的各个边缘信息,其中,每个样本掩膜均对应有一个该边缘神经网络输出的边缘信息。
在步骤S303中,确定上述边缘神经网络的损失函数,该损失函数用于衡量每个样本掩膜分别对应的样本边缘信息与上述边缘神经网络输出的边缘信息的差距。
在本申请实施例中,该步骤S303的具体含义是:确定上述边缘神经网络的损失函数,其中该损失函数正相关于每个样本掩膜对应的边缘差距(该边缘差距为该样本掩膜对应的样本边缘信息,与将该样本掩膜输入至上述边缘神经网络后该边缘神经网络输出的边缘信息的差距)。
在本申请实施例中,上述边缘神经网络的损失函数计算方式可以为:
若上述样本边缘信息与上述边缘神经网络输出的边缘信息均为如图1中003所示的图像,则上述边缘神经网络的损失函数可以为上述样本边缘信息与上述边缘神经网络输出的边缘信息的图像差(该图像差计算方式可参见实施例一中的步骤A所述,此处不再赘述)。
此外,上述边缘神经网络的损失函数计算方式可以为:对于每个样本掩膜来说,计算对应的样本边缘信息与上述边缘神经网络输出的边缘信息的交叉熵损失,然后求取平均。具体计算公式如下:
$LOSS_2=-\frac{1}{N}\sum_{j=1}^{N}\sum_{i=1}^{M}\left[r_{ji}\log_x(q_{ji})+(1-r_{ji})\log_x(1-q_{ji})\right]$
其中,LOSS 2为上述边缘神经网络的损失函数,N为样本掩膜的总个数(本领域技术人员容易理解,样本图像、样本掩膜以及样本边缘信息的总个数均相同,均为N),M为第j个样本掩膜中像素点的总个数,r ji的数值是根据第j个样本图像对应的样本边缘信息确定的,r ji用于指示第j个样本掩膜中第i个像素点是否为轮廓边缘,q ji为上述边缘神经网络预测的第j个样本掩膜中第i个像素点为轮廓边缘的概率,x为对数log的底值。
在本申请实施例中,r ji的数值是根据第j个样本掩膜对应的样本边缘信息确定的,比如:若第j个样本掩膜对应的样本边缘信息中,指示该第j个样本掩膜中第i个像素点为轮廓边缘,则r ji可以为1,若第j个样本掩膜对应的样本边缘信息中,指示该第i个像素点不为轮廓边缘,则r ji可以为0。本领域技术人员应该能够理解,r ji的取值并不局限于1和0,也可以为其他数值。r ji的取值是预先设定的,比如为1或者0。
本领域技术人员应该注意,当样本边缘信息指示第i个像素点为轮廓边缘时,r ji的值要大于当样本边缘信息指示第i个像素点不为轮廓边缘时r ji的值。也即是,若样本边缘信息指示第i个像素点为轮廓边缘时,r ji为1,否则,r ji为0。或者,若样本边缘信息指示第i个像素点为轮廓边缘时,r ji为2,否则,r ji为1。或者,若样本边缘信息指示第i个像素点为轮廓边缘时,r ji为0.8,否则,r ji为0.2。
在步骤S304中,判断上述边缘神经网络的损失函数是否小于第二预设阈值,若否,则执行步骤S305,若是,则执行步骤S306。
在步骤S305中,调整上述边缘神经网络模型的各个参数,然后返回执行步骤S302。
在步骤S306中,得到训练后的边缘神经网络。
也即是,不断调整边缘神经网络的各个参数,直至损失函数小于第二预设阈值。此外,在本申请实施例中,并不对参数调整方式进行具体限定,可以采用梯度下降算法、动力更新算法等等,此处对调整参数所使用的方法不作限定。
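下面给出步骤S302-S306所述边缘神经网络训练过程的一个简化示例(接上文EdgeNet示例,其forward返回各卷积块的输出;此处采用Adam调整参数,阈值与学习率均为假设值):

```python
import torch
import torch.nn.functional as F

def train_edge_net(edge_net, sample_masks, sample_edges,
                   threshold=0.05, lr=1e-3, max_steps=10000):
    """sample_masks/sample_edges为N x 1 x H x W的张量,取值0/1。"""
    optimizer = torch.optim.Adam(edge_net.parameters(), lr=lr)
    for _ in range(max_steps):
        pred_edge = torch.sigmoid(edge_net(sample_masks)[-1])        # 输出的边缘信息
        loss = F.binary_cross_entropy(pred_edge, sample_edges)       # 交叉熵损失取平均
        if loss.item() < threshold:                                  # 小于第二预设阈值则停止
            break
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return edge_net
```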
在步骤S307中,对于每个样本图像,将该样本图像输入至图像分割网络,得到该图像分割网络输出的用于指示该样本图像中目标对象所在区域的生成掩膜。
在步骤S308中,对于每个生成掩膜,将该生成掩膜输入至训练后的边缘神经网络,得到该边缘神经网络输出的生成边缘信息,该生成边缘信息用于指示该生成掩膜所指示的目标对象所在区域的轮廓边缘。
在步骤S309中,确定上述图像分割网络的损失函数,该损失函数用于衡量每个样本图像分别对应的样本掩膜与生成掩膜的差距,并且,该损失函数还用于衡量每个样本图像分别对应的生成边缘信息与样本边缘信息的差距。
在步骤S310中,判断上述损失函数是否小于第一预设阈值,若是,则执行步骤S312,否则,执行步骤S311。
在步骤S311中,调整上述图像分割网络的各个参数,然后返回执行步骤S307。
在步骤S312中,得到训练后的图像分割网络。
上述步骤S307-S312的具体实施方式与实施例一中步骤S102-S107的具体实施方式完全相同,具体可参见实施例一的描述,此处不再赘述。
下面利用附图8对本申请实施例二所述的训练过程进行简要说明。
如图8(a)所示,为边缘神经网络的训练过程。首先,将样本掩膜输入至边缘神经网络中,得到该边缘神经网络输出的边缘信息。其次,根据边缘神经网络输出的各个边缘信息与各个样本边缘信息,计算交叉熵损失,其中,该样本边缘信息由样本掩膜通过膨胀运算及相减运算得到,具体可参见实施例一的描述,此处不再赘述。然后,将各个交叉熵损失进行平均即得到损失函数。最后,不断调整该边缘神经网络的各个参数,直至损失函数较小为止,从而得到训练后的边缘神经网络。
在得到训练后的边缘神经网络之后,可参见图8(b)实现对图像分割网络的训练。
如图8(b)所示,为图像分割网络的训练过程。首先,将样本图像输入至图像分割网络中,得到该图像分割网络输出的生成掩膜,并计算该生成掩膜与样本掩膜的交叉熵损失。其次,将该生成掩膜输入至训练后的边缘神经网络,得到每个卷积层的输出,并将样本掩膜输入至该训练后的边缘神经网络,得到每个卷积层的输出。然后,根据上述交叉熵损失、输入生成掩膜时各个卷积层的输出以及输入样本掩膜时各个卷积层的输出,计算图像分割网络的损失函数(具体计算方式可参见实施例一的描述)。最后,不断调整该图像分割网络的各个参数,直至损失函数较小为止,从而得到训练后的图像分割网络。
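结合图8(b),下面给出图像分割网络训练循环的一个简化示例(沿用上文的TinySegNet、EdgeNet与f2_feature_loss示例,损失权重与阈值均为假设值,并非本申请限定的实现):

```python
import torch
import torch.nn.functional as F

def train_seg_net(seg_net, edge_net, sample_images, sample_masks,
                  threshold=0.05, lr=1e-4, max_steps=10000):
    """sample_images: N x 3 x H x W;sample_masks: N x 1 x H x W(取值0/1)。"""
    for p in edge_net.parameters():
        p.requires_grad_(False)                  # 训练后的边缘神经网络参数保持不变
    optimizer = torch.optim.Adam(seg_net.parameters(), lr=lr)
    for _ in range(max_steps):
        gen_mask = seg_net(sample_images)                        # 生成掩膜
        f1 = F.binary_cross_entropy(gen_mask, sample_masks)      # 生成掩膜与样本掩膜的交叉熵
        f2 = f2_feature_loss(edge_net, gen_mask, sample_masks)   # 各卷积块输出差异(边缘约束)
        loss = f1 + f2                                           # 对应LOSS_1
        if loss.item() < threshold:                              # 小于第一预设阈值则停止
            break
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return seg_net
```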
本申请实施例二所述的训练方法,相比于实施例一多出了边缘神经网络的训练过程,这可以使得训练边缘神经网络所采用的样本与训练图像分割网络所采用的样本是一致的,从而可以更好地根据边缘神经网络的输出结果,来衡量图像分割网络输出的掩膜的边缘的准确度,从而更好地训练图像分割网络。
实施例三
本申请实施例三提供了一种图像处理方法,请参阅附图9,该图像处理方法包括:
在步骤S401中,获取待处理图像,并将该待处理图像输入至训练后的图像分割网络,得到该待处理图像对应的掩膜,其中,该训练后的图像分割网络是采用训练后的边缘神经网络训练得到,该训练后的边缘神经网络用于根据输入的掩膜输出该掩膜所指示的目标对象所在区域的边缘轮廓。
具体地,该步骤S401所述的训练后的图像分割网络是采用如上述实施例一或实施例二所述的训练方法训练得到的。
在步骤S402中,基于上述待处理图像对应的掩膜,将该待处理图像中所包含的目标对象分割出来。
本领域技术人员容易理解,在上述步骤S402之后,还可以执行更换背景的具体操作,该操作是现有技术,此处不再赘述。
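以掩膜为二值图像为例,下面示意如何基于待处理图像对应的掩膜将目标对象分割出来并合成到新背景上(阈值等为假设值):

```python
import numpy as np

def replace_background(image: np.ndarray, mask: np.ndarray,
                       background: np.ndarray) -> np.ndarray:
    """image/background为H x W x 3图像,mask为H x W的二值掩膜(目标区域为255)。"""
    alpha = (mask > 127).astype(np.float32)[..., None]     # 前景为1,背景为0
    composed = image.astype(np.float32) * alpha + background.astype(np.float32) * (1.0 - alpha)
    return composed.astype(np.uint8)
```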
该实施例三所述的方法可以是应用在终端设备(比如手机)中的方法,该方法可以便于用户更换待处理图像中的背景;该方法能够准确地分割目标对象、更加精确地更换背景,能够在一定程度上提高用户体验。
实施例四
本申请实施例四提供了一种图像分割网络的训练装置。为了便于说明,仅示出与本申请相关的部分,如图10所示,该训练装置500包括:
样本获取模块501,用于获取包含目标对象的各个样本图像、每个样本图像分别对应的样本掩膜以及每个样本掩膜分别对应的样本边缘信息,其中,每个样本掩膜均用于指示对应样本图像中目标对象所位于的图像区域,每个样本边缘信息均用于指示对应样本掩膜所指示的目标对象所在图像区域的轮廓边缘。
生成掩膜获取模块502,用于对于每个样本图像,将该样本图像输入至图像分割网络,得到该图像分割网络输出的用于指示该样本图像中目标对象所在区域的生成掩膜。
生成边缘获取模块503,用于对于每个生成掩膜,将该生成掩膜输入至训练后的边缘神经网络,得到该边缘神经网络输出的生成边缘信息,该生成边缘信息用于指示该生成掩膜所指示的目标对象所在区域的轮廓边缘。
损失确定模块504,用于确定所述图像分割网络的损失函数,所述损失函数用于衡量每个样本图像分别对应的样本掩膜与生成掩膜的差距,并且,所述损失函数还用于衡量每个样本图像分别对应的生成边缘信息与样本边缘信息的差距。
调参模块505,用于调整所述图像分割网络的各个参数,然后触发所述生成掩膜获取模块继续执行相应的步骤,直至所述图像分割网络的损失函数小于第一预设阈值为止,从而得到训练后的图像分割网络。
可选地,上述损失确定模块504具体用于:
确定所述图像分割网络的损失函数,该损失函数的计算公式为:
$LOSS_1=\frac{1}{N}\sum_{j=1}^{N}F_j$
其中,LOSS_1为所述图像分割网络的损失函数,N为样本图像的总个数,F1_j用于衡量第j个样本图像对应的样本掩膜与生成掩膜的差距,F2_j用于衡量第j个样本图像对应的样本边缘信息与生成边缘信息的差距,每个样本图像对应的损失项为$F_j=F1_j+F2_j$。
可选地,上述F1 j的计算公式如下:
$F1_j=-\sum_{i=1}^{M}\left[y_{ji}\log_x(p_{ji})+(1-y_{ji})\log_x(1-p_{ji})\right]$
其中,M为第j个样本图像中像素点的总个数,y ji的数值是根据第j个样本图像对应的样本掩膜确定的,y ji用于指示第j个样本图像中第i个像素点是否在目标对象所位于的图像区域,p ji为所述图像分割网络预测的第j个样本图像中第i个像素点在目标对象所位于的图像区域的概率,x为对数log的底值。
此外,当样本掩膜指示第i个像素点位于目标对象所在的图像区域时,该y ji的值要大于当样本掩膜指示第i个像素点未位于目标对象所在的图像区域时y ji的值。
可选地,上述训练后的边缘神经网络由A个卷积块级联而成,每个卷积块均由B个卷积层构成。
相应地,上述F2 j的计算公式如下:
$F2_j=\sum_{c=1}^{A}\lambda_c\left\|h_c(mask_1)-h_c(mask_2)\right\|$
其中,mask 1为第j个样本图像对应的生成掩膜,mask 2为第j个样本图像对应的样本掩膜,h c(mask 1)为训练后的边缘神经网络输入为mask 1时,第c个卷积块的输出,h c(mask 2)为训练后的边缘神经网络输入为mask 2时,第c个卷积块的输出,λ c为常数。
可选地,上述训练装置还包括边缘神经网络训练模块,该边缘神经网络训练模块包括:
边缘信息获取单元,用于对于每个样本掩膜,将该样本掩膜输入至边缘神经网络,得到该边缘神经网络输出的边缘信息,该边缘信息用于指示该样本掩膜所指示的目标对象所在区域的轮廓边缘。
边缘损失确定单元,用于确定所述边缘神经网络的损失函数,该损失函数用于衡量每个样本掩膜分别对应的样本边缘信息与所述边缘神经网络输出的边缘信息的差距。
边缘调参单元,用于调整所述边缘神经网络的各个参数,然后触发所述边缘信息获取单元继续执行相应步骤,直至所述边缘神经网络的损失函数值小于第二预设阈值,从而得到训练后的边缘神经网络。
可选地,上述边缘损失确定单元具体用于:
确定所述边缘神经网络的损失函数,该损失函数的计算公式为:
$LOSS_2=-\frac{1}{N}\sum_{j=1}^{N}\sum_{i=1}^{M}\left[r_{ji}\log_x(q_{ji})+(1-r_{ji})\log_x(1-q_{ji})\right]$
其中,LOSS 2为所述边缘神经网络的损失函数,N为样本图像的总个数,M为第j个样本掩膜中像素点的总个数,r ji的数值是根据第j个样本图像对应的样本边缘信息确定的,r ji用于指示第j个样本掩膜中第i个像素点是否为轮廓边缘,q ji为所述边缘神经网络预测的第j个样本掩膜中第i个像素点为轮廓边缘的概率,x为对数log的底值。
此外,当样本边缘信息指示第i个像素点为轮廓边缘时,该r ji的值要大于当样本边缘信息指示第i个像素点不为轮廓边缘时r ji的值。
需要说明的是,上述装置/单元之间的信息交互、执行过程等内容,由于与本申请方法实施例一以及方法实施例二基于同一构思,其具体功能及带来的技术效果,具体可参见相应方法实施例部分,此处不再赘述。
实施例五
本申请实施例五提供了一种图像处理装置。为了便于说明,仅示出与本申请相关的部分,如图11所示,该图像处理装置600包括:
掩膜获取模块601,用于获取待处理图像,并将所述待处理图像输入至训练后的图像分割网络,得到所述待处理图像对应的掩膜,其中,所述训练后的图像分割网络是采用训练后的边缘神经网络训练得到,该训练后的边缘神经网络用于根据输入的掩膜输出该掩膜所指示的目标对象所在区域的边缘轮廓(具体地,所述训练后的图像分割网络是采用如实施例一或实施例二所述训练方法训练得到)。
目标对象分割模块602,用于基于所述待处理图像对应的掩膜,将所述待处理图像中所包含的目标对象分割出来。
需要说明的是,上述装置/单元之间的信息交互、执行过程等内容,由于与本申请方法实施例三基于同一构思,其具体功能及带来的技术效果,具体可参见方法实施例三部分,此处不再赘述。
实施例六
图12是本申请实施例六提供的终端设备的示意图。如图12所示,该实施例的终端设备700包括:处理器701、存储器702以及存储在上述存储器702中并可在上述处理器701上运行的计算机程序703。上述处理器701执行上述计算机程序703时实现上述各个方法实施例中的步骤。或者,上述处理器701执行上述计算机程序703时实现上述各装置实施例中各模块/单元的功能。
示例性的,上述计算机程序703可以被分割成一个或多个模块/单元,上述一个或者多个模块/单元被存储在上述存储器702中,并由上述处理器701执行,以完成本申请。上述一个或多个模块/单元可以是能够完成特定功能的一系列计算机程序指令段,该指令段用于描述上述计算机程序703在上述终端设备700中的执行过程。例如,上述计算机程序703可以被分割成样本获取模块、生成掩膜获取模块、生成边缘获取模块、损失确定模块以及调参模块,各模块具体功能如下:
S101,获取包含目标对象的各个样本图像、每个样本图像分别对应的样本掩膜以及每个样本掩膜分别对应的样本边缘信息,其中,每个样本掩膜均用于指示对应样本图像中目标对象所位于的图像区域,每个样本边缘信息均用于指示对应样本掩膜所指示的目标对象所在图像区域的轮廓边缘。
S102,对于每个样本图像,将该样本图像输入至图像分割网络,得到该图像分割网络输出的用于指示该样本图像中目标对象所在区域的生成掩膜。
S103,对于每个生成掩膜,将该生成掩膜输入至训练后的边缘神经网络,得到该边缘神经网络输出的生成边缘信息,该生成边缘信息用于指示该生成掩膜所指示的目标对象所在区域的轮廓边缘。
S104,确定所述图像分割网络的损失函数,所述损失函数用于衡量每个样本图像分别对应的样本掩膜与生成掩膜的差距,并且,所述损失函数还用于衡量每个样本图像分别对应的生成边缘信息与样本边缘信息的差距。
S105,调整所述图像分割网络的各个参数,然后返回执行S102,直至所述图像分割网络的损失函数小于第一预设阈值为止,从而得到训练后的图像分割网络。
或者,上述计算机程序703可以被分割成掩膜获取模块以及目标对象分割模块,各模块具体功能如下:
获取待处理图像,并将所述待处理图像输入至训练后的图像分割网络,得到所述待处理图像对应的掩膜,其中,所述训练后的图像分割网络是采用实施例一或实施例二所述的训练方法训练得到。
基于所述待处理图像对应的掩膜,将所述待处理图像中所包含的目标对象分割出来。
上述终端设备可包括,但不仅限于,处理器701、存储器702。本领域技术人员可以理解,图12仅仅是终端设备700的示例,并不构成对终端设备700的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件,例如上述终端设备还可以包括输入输出设备、网络接入设备、总线等。
所称处理器701可以是中央处理单元(Central Processing Unit,CPU),还可以是其它通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其它可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
上述存储器702可以是上述终端设备700的内部存储单元,例如终端设备700的硬盘或内存。上述存储器702也可以是上述终端设备700的外部存储设备,例如上述终端设备700上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。进一步地,上述存储器702还可以既包括上述终端设备700的内部存储单元也包括外部存储设备。上述存储器702用于存储上述计算机程序以及上述终端设备所需的其它程序和数据。上述存储器702还可以用于暂时地存储已经输出或者将要输出的数据。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,仅以上述各功能单元、模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能单元、模块完成,即将上述装置的内部结构划分成不同的功能单元或模块,以完成以上描述的全部或者部分功能。实施例中的各功能单元、模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中,上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。另外,各功能单元、模块的具体名称也只是为了便于相互区分,并不用于限制本申请的保护范围。上述系统中单元、模块的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述或记载的部分,可以参见其它实施例的相关描述。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在本申请所提供的实施例中,应该理解到,所揭露的装置/终端设备和方法,可以通过其它的方式实现。例如,以上所描述的装置/终端设备实施例仅仅是示意性的,例如,上述模块或单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通讯连接可以是通过一些接口,装置或单元的间接耦合或通讯连接,可以是电性,机械或其它的形式。
上述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
上述集成的模块/单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实现上述各个方法实施例中的全部或部分流程,也可以通过计算机程序来指令相关的硬件来完成,上述的计算机程序可存储于一计算机可读存储介质中,该计算机程序在被处理器执行时,可实现上述各个方法实施例的步骤。其中,上述计算机程序包括计算机程序代码,上述计算机程序代码可以为源代码形式、对象代码形式、可执行文件或某些中间形式等。上述计算机可读介质可以包括:能够携带上述计算机程序代码的任何实体或装置、记录介质、U盘、移动硬盘、磁碟、光盘、计算机存储器、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、电载波信号、电信信号以及软件分发介质等。需要说明的是,上述计算机可读介质包含的内容可以根据司法管辖区内立法和专利实践的要求进行适当的增减,例如在某些司法管辖区,根据立法和专利实践,计算机可读介质不包括电载波信号和电信信号。
以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围,均应包含在本申请的保护范围之内。

Claims (20)

  1. 一种图像分割网络的训练方法,其特征在于,包括:
    S101,获取包含目标对象的各个样本图像、每个样本图像分别对应的样本掩膜以及每个样本掩膜分别对应的样本边缘信息,其中,每个样本掩膜均用于指示对应样本图像中目标对象所位于的图像区域,每个样本边缘信息均用于指示对应样本掩膜所指示的目标对象所在图像区域的轮廓边缘;
    S102,对于每个样本图像,将该样本图像输入至图像分割网络,得到该图像分割网络输出的用于指示该样本图像中目标对象所在区域的生成掩膜;
    S103,对于每个生成掩膜,将该生成掩膜输入至训练后的边缘神经网络,得到该边缘神经网络输出的生成边缘信息,该生成边缘信息用于指示该生成掩膜所指示的目标对象所在区域的轮廓边缘;
    S104,确定所述图像分割网络的损失函数,所述损失函数用于衡量每个样本图像分别对应的样本掩膜与生成掩膜的差距,并且,所述损失函数还用于衡量每个样本图像分别对应的生成边缘信息与样本边缘信息的差距;
    S105,调整所述图像分割网络的各个参数,然后返回执行S102,直至所述图像分割网络的损失函数小于第一预设阈值为止,从而得到训练后的图像分割网络。
  2. 如权利要求1所述的训练方法,其特征在于,所述确定所述图像分割网络的损失函数,所述损失函数用于衡量每个样本图像分别对应的样本掩膜与生成掩膜的差距,并且,所述损失函数还用于衡量每个样本图像分别对应的生成边缘信息与样本边缘信息的差距,包括:
    确定所述图像分割网络的损失函数,所述损失函数正相关于每个样本图像分别对应的掩膜差距,并且,所述损失函数正相关于每个样本图像分别对应的边缘差距,其中,每个样本图像对应的掩膜差距为该样本图像对应的样本掩膜与生成掩膜的差距,每个样本图像对应的边缘差距为该样本图像对应的样本边缘信息与生成边缘信息的差距。
  3. 如权利要求2所述的训练方法,其特征在于,所述确定所述图像分割网络的损失函数,所述损失函数正相关于每个样本图像分别对应的掩膜差距,并且,所述损失函数正相关于每个样本图像分别对应的边缘差距,包括:
    确定所述图像分割网络的损失函数,该损失函数的计算公式为:
    $LOSS_1=\frac{1}{N}\sum_{j=1}^{N}F_j$
    其中,LOSS_1为所述图像分割网络的损失函数,N为样本图像的总个数,F1_j用于衡量第j个样本图像对应的样本掩膜与生成掩膜的差距,F2_j用于衡量第j个样本图像对应的样本边缘信息与生成边缘信息的差距,每个样本图像对应的损失项为$F_j=F1_j+F2_j$。
  4. 如权利要求3所述的训练方法,其特征在于,F1 j的计算公式如下:
    $F1_j=-\sum_{i=1}^{M}\left[y_{ji}\log_x(p_{ji})+(1-y_{ji})\log_x(1-p_{ji})\right]$
    其中,M为第j个样本图像中像素点的总个数,y ji的数值是根据第j个样本图像对应的样本掩膜确定的,y ji用于指示第j个样本图像中第i个像素点是否在目标对象所位于的图像区域,p ji为所述图像分割网络预测的第j个样本图像中第i个像素点在目标对象所位于的图像区域的概率,x为对数log的底值;
    此外,当样本掩膜指示第i个像素点位于目标对象所在的图像区域时,该y ji的值要大于当样本掩膜指示第i个像素点未位于目标对象所在的图像区域时y ji的值。
  5. 如权利要求1所述的训练方法,其特征在于,所述确定图像分割网络的损失函数,包括:
    对于每个样本图像,计算所述样本图像对应的生成掩膜与所述样本图像对应的样本掩膜的图像差;
    若上述样本边缘信息以及生成边缘信息均为图像,则对于每个样本图像,计算该样本图像对应的样本边缘信息与该样本图像对应的生成边缘信息的图像差;
    将所述样本掩膜的图像差与所述生成边缘信息的图像差进行平均,得到所述图像分割网络的损失函数。
  6. 如权利要求3所述的训练方法,其特征在于,所述训练后的边缘神经网络由A个卷积块级联而成,每个卷积块均由B个卷积层构成;
    相应地,F2 j的计算公式如下:
    $F2_j=\sum_{c=1}^{A}\lambda_c\left\|h_c(mask_1)-h_c(mask_2)\right\|$
    其中,mask 1为第j个样本图像对应的生成掩膜,mask 2为第j个样本图像对应的样本掩膜,h c(mask 1)为训练后的边缘神经网络输入为mask 1时,第c个卷积块的输出,h c(mask 2)为训练后的边缘神经网络输入为mask 2时,第c个卷积块的输出,λ c为常数。
  7. 如权利要求1至6中任一项所述的训练方法,其特征在于,在所述步骤S103之前,所述训练方法还包括对边缘神经网络的训练过程,所述对边缘神经网络的训练过程如下:
    对于每个样本掩膜,将该样本掩膜输入至边缘神经网络,得到该边缘神经网络输出的边缘信息,该边缘信息用于指示该样本掩膜所指示的目标对象所在区域的轮廓边缘;
    确定所述边缘神经网络的损失函数,该损失函数用于衡量每个样本掩膜分别对应的样本边缘信息与所述边缘神经网络输出的边缘信息的差距;
    调整所述边缘神经网络的各个参数,然后返回执行所述对于每个样本掩膜,将该样本掩膜输入至边缘神经网络,得到该边缘神经网络输出的边缘信息的步骤以及后续步骤,直至所述边缘神经网络的损失函数值小于第二预设阈值,从而得到训练后的边缘神经网络。
  8. 如权利要求7所述的训练方法,其特征在于,所述确定所述边缘神经网络的损失函数,该损失函数用于衡量每个样本掩膜分别对应的样本边缘信息与所述边缘神经网络输出的边缘信息的差距,包括:
    确定所述边缘神经网络的损失函数,该损失函数的计算公式为:
    $LOSS_2=-\frac{1}{N}\sum_{j=1}^{N}\sum_{i=1}^{M}\left[r_{ji}\log_x(q_{ji})+(1-r_{ji})\log_x(1-q_{ji})\right]$
    其中,LOSS 2为所述边缘神经网络的损失函数,N为样本图像的总个数,M为第j个样本掩膜中像素点的总个数,r ji的数值是根据第j个样本图像对应的样本边缘信息确定的,r ji用于指示第j个样本掩膜中第i个像素点是否为轮廓边缘,q ji为所述边缘神经网络预测的第j个样本掩膜中第i个像素点为轮廓边缘的概率,x为对数log的底值;
    此外,当样本边缘信息指示第i个像素点为轮廓边缘时,该r ji的值要大于当样本边缘信息指示第i个像素点不为轮廓边缘时r ji的值。
  9. 如权利要求1至6任一项所述的训练方法,其特征在于,在所述得到训练后的图像分割网络之后,包括:
    根据测试集的样本图像以及评价函数对所述训练后的图像分割网络进行评价,其中,评价函数为:
    $IoU=\frac{|X\cap Y|}{|X\cup Y|}$
    其中,X是将所述测试集中的任一样本图像输入至所述训练后的图像分割网络后,所述图像分割网络输出的生成掩膜所指示的目标对象的图像区域;
    Y是输入所述训练后的图像分割网络的样本图像所对应的样本掩膜所指示的目标对象的图像区域;
    IoU的值越接近1,说明所述训练后的图像分割网络性能越好。
  10. 一种图像处理方法,其特征在于,包括:
    获取待处理图像,并将所述待处理图像输入至训练后的图像分割网络,得到所述待处理图像对应的掩膜,其中,所述训练后的图像分割网络是采用训练后的边缘神经网络训练得到,所述训练后的边缘神经网络用于根据输入的掩膜,输出该掩膜所指示的目标对象所在区域的轮廓边缘;
    基于所述待处理图像对应的掩膜,将所述待处理图像中所包含的目标对象分割出来。
  11. 如权利要求10所述的图像处理方法,其特征在于,所述训练后的图像分割网络是采用训练后的边缘神经网络训练得到,所述训练后的边缘神经网络用于根据输入的掩膜,输出该掩膜所指示的目标对象所在区域的轮廓边缘,包括:
    所述训练后的图像分割网络是采用如权利要求1至9中任一项所述的训练方法训练得到。
  12. 一种图像分割网络,其特征在于,所述图像分割网络采用如权利要求1至9中任一项所述的训练方法训练得到。
  13. 一种终端设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,其特征在于,所述处理器执行所述计算机程序时实现如下步骤:
    S101,获取包含目标对象的各个样本图像、每个样本图像分别对应的样本掩膜以及每个样本掩膜分别对应的样本边缘信息,其中,每个样本掩膜均用于指示对应样本图像中目标对象所位于的图像区域,每个样本边缘信息均用于指示对应样本掩膜所指示的目标对象所在图像区域的轮廓边缘;
    S102,对于每个样本图像,将该样本图像输入至图像分割网络,得到该图像分割网络输出的用于指示该样本图像中目标对象所在区域的生成掩膜;
    S103,对于每个生成掩膜,将该生成掩膜输入至训练后的边缘神经网络,得到该边缘神经网络输出的生成边缘信息,该生成边缘信息用于指示该生成掩膜所指示的目标对象所在区域的轮廓边缘;
    S104,确定所述图像分割网络的损失函数,所述损失函数用于衡量每个样本图像分别对应的样本掩膜与生成掩膜的差距,并且,所述损失函数还用于衡量每个样本图像分别对应的生成边缘信息与样本边缘信息的差距;
    S105,调整所述图像分割网络的各个参数,然后返回执行S102,直至所述图像分割网络的损失函数小于第一预设阈值为止,从而得到训练后的图像分割网络。
  14. 如权利要求13所述的终端设备,其特征在于,所述处理器执行所述确定所述图像分割网络的损失函数时,包括:
    确定所述图像分割网络的损失函数,所述损失函数正相关于每个样本图像分别对应的掩膜差距,并且,所述损失函数正相关于每个样本图像分别对应的边缘差距,其中,每个样本图像对应的掩膜差距为该样本图像对应的样本掩膜与生成掩膜的差距,每个样本图像对应的边缘差距为该样本图像对应的样本边缘信息与生成边缘信息的差距。
  15. 如权利要求14所述的终端设备,其特征在于,所述处理器执行所述确定所述图像分割网络的损失函数,所述损失函数正相关于每个样本图像分别对应的掩膜差距,并且,所述损失函数正相关于每个样本图像分别对应的边缘差距时,包括:
    确定所述图像分割网络的损失函数,该损失函数的计算公式为:
    $LOSS_1=\frac{1}{N}\sum_{j=1}^{N}F_j$
    其中,LOSS_1为所述图像分割网络的损失函数,N为样本图像的总个数,F1_j用于衡量第j个样本图像对应的样本掩膜与生成掩膜的差距,F2_j用于衡量第j个样本图像对应的样本边缘信息与生成边缘信息的差距,每个样本图像对应的损失项为$F_j=F1_j+F2_j$。
  16. 如权利要求15所述的终端设备,其特征在于,F1 j的计算公式如下:
    $F1_j=-\sum_{i=1}^{M}\left[y_{ji}\log_x(p_{ji})+(1-y_{ji})\log_x(1-p_{ji})\right]$
    其中,M为第j个样本图像中像素点的总个数,y ji的数值是根据第j个样本图像对应的样本掩膜确定的,y ji用于指示第j个样本图像中第i个像素点是否在目标对象所位于的图像区域,p ji为所述图像分割网络预测的第j个样本图像中第i个像素点在目标对象所位于的图像区域的概率,x为对数log的底值;
    此外,当样本掩膜指示第i个像素点位于目标对象所在的图像区域时,该y ji的值要大于当样本掩膜指示第i个像素点未位于目标对象所在的图像区域时y ji的值。
  17. 如权利要求15所述的终端设备,其特征在于,所述训练后的边缘神经网络由A个卷积块级联而成,每个卷积块均由B个卷积层构成;
    相应地,F2 j的计算公式如下:
    $F2_j=\sum_{c=1}^{A}\lambda_c\left\|h_c(mask_1)-h_c(mask_2)\right\|$
    其中,mask 1为第j个样本图像对应的生成掩膜,mask 2为第j个样本图像对应的样本掩膜,h c(mask 1)为训练后的边缘神经网络输入为mask 1时,第c个卷积块的输出,h c(mask 2)为训练后的边缘神经网络输入为mask 2时,第c个卷积块的输出,λ c为常数。
  18. 如权利要求13至17任一项所述的终端设备,其特征在于,所述处理器执行所述计算机程序包括对边缘神经网络的训练过程,所述对边缘神经网络的训练过程如下:
    对于每个样本掩膜,将该样本掩膜输入至边缘神经网络,得到该边缘神经网络输出的边缘信息,该边缘信息用于指示该样本掩膜所指示的目标对象所在区域的轮廓边缘;
    确定所述边缘神经网络的损失函数,该损失函数用于衡量每个样本掩膜分别对应的样本边缘信息与所述边缘神经网络输出的边缘信息的差距;
    调整所述边缘神经网络的各个参数,然后返回执行所述对于每个样本掩膜,将该样本掩膜输入至边缘神经网络,得到该边缘神经网络输出的边缘信息的步骤以及后续步骤,直至所述边缘神经网络的损失函数值小于第二预设阈值,从而得到训练后的边缘神经网络。
  19. 如权利要求18所述的终端设备,其特征在于,所述处理器在执行所述确定所述边缘神经网络的损失函数,该损失函数用于衡量每个样本掩膜分别对应的样本边缘信息与所述边缘神经网络输出的边缘信息的差距时,包括:
    确定所述边缘神经网络的损失函数,该损失函数的计算公式为:
    $LOSS_2=-\frac{1}{N}\sum_{j=1}^{N}\sum_{i=1}^{M}\left[r_{ji}\log_x(q_{ji})+(1-r_{ji})\log_x(1-q_{ji})\right]$
    其中,LOSS_2为所述边缘神经网络的损失函数,N为样本图像的总个数,M为第j个样本掩膜中像素点的总个数,r_ji的数值是根据第j个样本图像对应的样本边缘信息确定的,r_ji用于指示第j个样本掩膜中第i个像素点是否为轮廓边缘,q_ji为所述边缘神经网络预测的第j个样本掩膜中第i个像素点为轮廓边缘的概率,x为对数log的底值;
    此外,当样本边缘信息指示第i个像素点为轮廓边缘时,该r ji的值要大于当样本边缘信息指示第i个像素点不为轮廓边缘时r ji的值。
  20. 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现如权利要求1至11中任一项所述方法的步骤。
PCT/CN2020/117470 2019-09-29 2020-09-24 网络的训练方法、图像处理方法、网络、终端设备及介质 WO2021057848A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910931784.2 2019-09-29
CN201910931784.2A CN110660066B (zh) 2019-09-29 2019-09-29 网络的训练方法、图像处理方法、网络、终端设备及介质

Publications (1)

Publication Number Publication Date
WO2021057848A1 true WO2021057848A1 (zh) 2021-04-01

Family

ID=69039787

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/117470 WO2021057848A1 (zh) 2019-09-29 2020-09-24 网络的训练方法、图像处理方法、网络、终端设备及介质

Country Status (2)

Country Link
CN (1) CN110660066B (zh)
WO (1) WO2021057848A1 (zh)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177606A (zh) * 2021-05-20 2021-07-27 上海商汤智能科技有限公司 图像处理方法、装置、设备及存储介质
US20210279883A1 (en) * 2020-03-05 2021-09-09 Alibaba Group Holding Limited Image processing method, apparatus, electronic device, and storage medium
CN113378948A (zh) * 2021-06-21 2021-09-10 梅卡曼德(北京)机器人科技有限公司 图像掩膜生成方法、装置、电子设备和存储介质
CN113724163A (zh) * 2021-08-31 2021-11-30 平安科技(深圳)有限公司 基于神经网络的图像矫正方法、装置、设备及介质
CN115223171A (zh) * 2022-03-15 2022-10-21 腾讯科技(深圳)有限公司 文本识别方法、装置、设备及存储介质
CN116823864A (zh) * 2023-08-25 2023-09-29 锋睿领创(珠海)科技有限公司 基于平衡损失函数的数据处理方法、装置、设备及介质
CN117315263A (zh) * 2023-11-28 2023-12-29 杭州申昊科技股份有限公司 一种目标轮廓分割装置、训练方法、分割方法、电子设备

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110660066B (zh) * 2019-09-29 2023-08-04 Oppo广东移动通信有限公司 网络的训练方法、图像处理方法、网络、终端设备及介质
CN111311485B (zh) * 2020-03-17 2023-07-04 Oppo广东移动通信有限公司 图像处理方法及相关装置
CN111415358B (zh) * 2020-03-20 2024-03-12 Oppo广东移动通信有限公司 图像分割方法、装置、电子设备及存储介质
CN111462086B (zh) * 2020-03-31 2024-04-26 推想医疗科技股份有限公司 图像分割方法及装置、神经网络模型的训练方法及装置
CN113744293A (zh) * 2020-05-13 2021-12-03 Oppo广东移动通信有限公司 图像处理方法、图像处理装置、电子设备和可读存储介质
CN111899273A (zh) * 2020-06-10 2020-11-06 上海联影智能医疗科技有限公司 图像分割方法、计算机设备和存储介质
CN113808003B (zh) * 2020-06-17 2024-02-09 北京达佳互联信息技术有限公司 图像处理模型的训练方法、图像处理方法及装置
CN111754521B (zh) * 2020-06-17 2024-06-25 Oppo广东移动通信有限公司 图像处理方法和装置、电子设备及存储介质
CN111488876B (zh) * 2020-06-28 2020-10-23 平安国际智慧城市科技股份有限公司 基于人工智能的车牌识别方法、装置、设备及介质
CN112070793A (zh) * 2020-09-11 2020-12-11 北京邮电大学 一种目标提取方法及装置
CN112132847A (zh) * 2020-09-27 2020-12-25 北京字跳网络技术有限公司 模型训练方法、图像分割方法、装置、电子设备和介质
CN112465843A (zh) * 2020-12-22 2021-03-09 深圳市慧鲤科技有限公司 图像分割方法及装置、电子设备和存储介质
CN112669228B (zh) * 2020-12-22 2024-05-31 厦门美图之家科技有限公司 图像处理方法、***、移动终端及存储介质
CN112580567B (zh) * 2020-12-25 2024-04-16 深圳市优必选科技股份有限公司 一种模型获取方法、模型获取装置及智能设备
CN113159074B (zh) * 2021-04-26 2024-02-09 京东科技信息技术有限公司 图像处理方法、装置、电子设备和存储介质
CN113643311B (zh) * 2021-06-28 2024-04-09 清华大学 一种对边界误差鲁棒的图像分割方法和装置
CN113327210B (zh) * 2021-06-30 2023-04-07 中海油田服务股份有限公司 测井图像填补方法、装置、介质及电子设备
CN113822287B (zh) * 2021-11-19 2022-02-22 苏州浪潮智能科技有限公司 一种图像处理方法、***、设备以及介质
WO2023097479A1 (zh) * 2021-11-30 2023-06-08 华为技术有限公司 一种训练模型的方法、构建三维耳廓结构的方法和装置
CN114419086A (zh) * 2022-01-20 2022-04-29 北京字跳网络技术有限公司 边缘提取方法、装置、电子设备及存储介质
CN114758136B (zh) * 2022-06-13 2022-10-18 深圳比特微电子科技有限公司 目标去除模型建立方法、装置及可读存储介质
CN117237397B (zh) * 2023-07-13 2024-05-28 天翼爱音乐文化科技有限公司 基于特征融合的人像分割方法、***、设备及存储介质
CN117474932B (zh) * 2023-12-27 2024-03-19 苏州镁伽科技有限公司 对象分割方法和装置、电子设备及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325954A (zh) * 2018-09-18 2019-02-12 北京旷视科技有限公司 图像分割方法、装置及电子设备
CN109726644A (zh) * 2018-12-14 2019-05-07 重庆邮电大学 一种基于生成对抗网络的细胞核分割方法
US20190156154A1 (en) * 2017-11-21 2019-05-23 Nvidia Corporation Training a neural network to predict superpixels using segmentation-aware affinity loss
CN110660066A (zh) * 2019-09-29 2020-01-07 Oppo广东移动通信有限公司 网络的训练方法、图像处理方法、网络、终端设备及介质

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106846336B (zh) * 2017-02-06 2022-07-15 腾讯科技(上海)有限公司 提取前景图像、替换图像背景的方法及装置
CN109493347B (zh) * 2017-09-12 2021-03-23 深圳科亚医疗科技有限公司 在图像中对稀疏分布的对象进行分割的方法和***
CN108647588A (zh) * 2018-04-24 2018-10-12 广州绿怡信息科技有限公司 物品类别识别方法、装置、计算机设备和存储介质
CN109377445B (zh) * 2018-10-12 2023-07-04 北京旷视科技有限公司 模型训练方法、替换图像背景的方法、装置和电子***
CN109685067B (zh) * 2018-12-26 2022-05-03 江西理工大学 一种基于区域和深度残差网络的图像语义分割方法
CN110084234B (zh) * 2019-03-27 2023-04-18 东南大学 一种基于实例分割的声呐图像目标识别方法
CN110188760B (zh) * 2019-04-01 2021-10-22 上海卫莎网络科技有限公司 一种图像处理模型训练方法、图像处理方法及电子设备
CN110176016B (zh) * 2019-05-28 2021-04-30 招远市国有资产经营有限公司 一种基于人体轮廓分割与骨骼识别的虚拟试衣方法

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190156154A1 (en) * 2017-11-21 2019-05-23 Nvidia Corporation Training a neural network to predict superpixels using segmentation-aware affinity loss
CN109325954A (zh) * 2018-09-18 2019-02-12 北京旷视科技有限公司 图像分割方法、装置及电子设备
CN109726644A (zh) * 2018-12-14 2019-05-07 重庆邮电大学 一种基于生成对抗网络的细胞核分割方法
CN110660066A (zh) * 2019-09-29 2020-01-07 Oppo广东移动通信有限公司 网络的训练方法、图像处理方法、网络、终端设备及介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHEN XU; WILLIAMS BRYAN M.; VALLABHANENI SRINIVASA R.; CZANNER GABRIELA; WILLIAMS RACHEL; ZHENG YALIN: "Learning Active Contour Models for Medical Image Segmentation", 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 15 June 2019 (2019-06-15), pages 11624 - 11632, XP033686570, DOI: 10.1109/CVPR.2019.01190 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210279883A1 (en) * 2020-03-05 2021-09-09 Alibaba Group Holding Limited Image processing method, apparatus, electronic device, and storage medium
US11816842B2 (en) * 2020-03-05 2023-11-14 Alibaba Group Holding Limited Image processing method, apparatus, electronic device, and storage medium
CN113177606A (zh) * 2021-05-20 2021-07-27 上海商汤智能科技有限公司 图像处理方法、装置、设备及存储介质
CN113177606B (zh) * 2021-05-20 2023-11-28 上海商汤智能科技有限公司 图像处理方法、装置、设备及存储介质
CN113378948A (zh) * 2021-06-21 2021-09-10 梅卡曼德(北京)机器人科技有限公司 图像掩膜生成方法、装置、电子设备和存储介质
CN113724163A (zh) * 2021-08-31 2021-11-30 平安科技(深圳)有限公司 基于神经网络的图像矫正方法、装置、设备及介质
CN113724163B (zh) * 2021-08-31 2024-06-07 平安科技(深圳)有限公司 基于神经网络的图像矫正方法、装置、设备及介质
CN115223171A (zh) * 2022-03-15 2022-10-21 腾讯科技(深圳)有限公司 文本识别方法、装置、设备及存储介质
CN116823864A (zh) * 2023-08-25 2023-09-29 锋睿领创(珠海)科技有限公司 基于平衡损失函数的数据处理方法、装置、设备及介质
CN116823864B (zh) * 2023-08-25 2024-01-05 锋睿领创(珠海)科技有限公司 基于平衡损失函数的数据处理方法、装置、设备及介质
CN117315263A (zh) * 2023-11-28 2023-12-29 杭州申昊科技股份有限公司 一种目标轮廓分割装置、训练方法、分割方法、电子设备
CN117315263B (zh) * 2023-11-28 2024-03-22 杭州申昊科技股份有限公司 一种目标轮廓装置、训练方法、分割方法、电子设备及存储介质

Also Published As

Publication number Publication date
CN110660066B (zh) 2023-08-04
CN110660066A (zh) 2020-01-07

Similar Documents

Publication Publication Date Title
WO2021057848A1 (zh) 网络的训练方法、图像处理方法、网络、终端设备及介质
CN108765278B (zh) 一种图像处理方法、移动终端及计算机可读存储介质
WO2020207190A1 (zh) 一种三维信息确定方法、三维信息确定装置及终端设备
US20180204094A1 (en) Image recognition method and apparatus
CN109345553B (zh) 一种手掌及其关键点检测方法、装置和终端设备
CN109166156B (zh) 一种摄像头标定图像的生成方法、移动终端及存储介质
CN111489290B (zh) 一种人脸图像超分辨重建方法、装置及终端设备
WO2021164269A1 (zh) 基于注意力机制的视差图获取方法和装置
WO2022105608A1 (zh) 一种快速人脸密度预测和人脸检测方法、装置、电子设备及存储介质
WO2018228310A1 (zh) 图像处理方法、装置及终端
CN108898082B (zh) 图片处理方法、图片处理装置及终端设备
WO2021098618A1 (zh) 数据分类方法、装置、终端设备及可读存储介质
CN110853068B (zh) 图片处理方法、装置、电子设备及可读存储介质
CN109657543B (zh) 人流量监控方法、装置及终端设备
CN110956131A (zh) 单目标追踪方法、装置及***
CN110717405B (zh) 人脸特征点定位方法、装置、介质及电子设备
US8218823B2 (en) Determining main objects using range information
CN108932703B (zh) 图片处理方法、图片处理装置及终端设备
WO2022127333A1 (zh) 图像分割模型的训练方法、图像分割方法、装置、设备
WO2024041108A1 (zh) 图像矫正模型训练及图像矫正方法、装置和计算机设备
CN109165648B (zh) 一种图像处理方法、图像处理装置及移动终端
CN111062279B (zh) 照片处理方法及照片处理装置
CN108776959B (zh) 图像处理方法、装置及终端设备
CN115731442A (zh) 图像处理方法、装置、计算机设备和存储介质
CN111754411B (zh) 图像降噪方法、图像降噪装置及终端设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20867569

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20867569

Country of ref document: EP

Kind code of ref document: A1
