CN113705666B - Segmentation network training method, application method, apparatus, device, and storage medium - Google Patents

Segmentation network training method, application method, apparatus, device, and storage medium

Info

Publication number
CN113705666B
CN113705666B (application CN202110991125.5A)
Authority
CN
China
Prior art keywords
picture
training
network
segmentation
segmentation network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110991125.5A
Other languages
Chinese (zh)
Other versions
CN113705666A (en)
Inventor
曾婵
李葛
郑强
高鹏
谢国彤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202110991125.5A priority Critical patent/CN113705666B/en
Publication of CN113705666A publication Critical patent/CN113705666A/en
Priority to PCT/CN2022/072183 priority patent/WO2023024424A1/en
Application granted granted Critical
Publication of CN113705666B publication Critical patent/CN113705666B/en
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/194 Segmentation; Edge detection involving foreground-background segmentation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the technical field of artificial intelligence and discloses a segmentation network training method, application method, apparatus, device, and storage medium. The method includes: performing supervised training on a first segmentation network according to a first picture training set to obtain a preliminary image segmentation network; extracting features of the pictures in the first picture training set and a second picture training set using the preliminary image segmentation network to obtain a corresponding first picture feature set and second picture feature set; training a second segmentation network according to the first and second picture feature sets to obtain an image source identification network; performing unsupervised iterative training on the preliminary image segmentation network according to the second picture training set and acquiring the output segmentation result map; evaluating, according to the image source identification network and the segmentation result map, whether training of the preliminary image segmentation network is completed; and outputting the target image segmentation network once training is completed.

Description

Segmentation network training method, application method, apparatus, device, and storage medium
Technical Field
The present application relates to the field of artificial intelligence, and in particular to a segmentation network training method, apparatus, device, and storage medium.
Background
Image segmentation is a widely used technology, for example for changing the background color of certificate photos, for film and television special effects, and for video conferencing, where it separates a target portrait in a scene from the background. Image segmentation therefore not only has entertainment value but can also protect user privacy in certain scenarios. However, in the prior art, the public image data sets available for image segmentation are limited and differ substantially from pictures taken in real life, so an image segmentation network trained only on public data sets segments real-life pictures poorly.
Disclosure of Invention
The main purpose of the present application is to provide a segmentation network training method, application method, apparatus, device, and storage medium that train the segmentation network with a combination of supervised and unsupervised learning, thereby improving the network's ability to segment pictures taken in daily life.
In a first aspect, the present application provides a segmentation network training method, including:
acquiring a first picture training set and a second picture training set, wherein a first picture in the first picture training set is provided with a semantic segmentation tag, a second picture in the second picture training set is not provided with a semantic segmentation tag, and the main element category of the first picture is the same as the main element category in the second picture;
training a preset first segmentation network according to the first picture training set to obtain a preliminary image segmentation network;
extracting the characteristics of the pictures in the first picture training set and the second picture training set by utilizing the preliminary image segmentation network so as to obtain a first picture characteristic set corresponding to the first picture training set and a second picture characteristic set corresponding to the second picture training set;
training a preset second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network;
performing iterative training on the preliminary image segmentation network according to the second picture training set, and acquiring a segmentation result map output by the preliminary image segmentation network;
evaluating, according to the image source identification network and the segmentation result map, whether training of the preliminary image segmentation network is completed;
and outputting a target image segmentation network when the preliminary image segmentation network training is completed.
In a second aspect, the present application further provides a segmentation network training apparatus, including:
a first training picture acquisition module, configured to acquire a first picture training set and a second picture training set, wherein a first picture in the first picture training set carries a semantic segmentation label, a second picture in the second picture training set carries no semantic segmentation label, and the main element category of the first picture is the same as that of the second picture;
a first network training module, configured to train a preset first segmentation network according to the first picture training set to obtain a preliminary image segmentation network;
a second training picture acquisition module, configured to extract features of the pictures in the first picture training set and the second picture training set by using the preliminary image segmentation network, so as to obtain a first picture feature set corresponding to the first picture training set and a second picture feature set corresponding to the second picture training set;
a second network training module, configured to train a preset second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network;
a third network training module, configured to perform iterative training on the preliminary image segmentation network according to the second picture training set, and to acquire a segmentation result map output by the preliminary image segmentation network;
a target network checking module, configured to evaluate, according to the image source identification network and the segmentation result map, whether training of the preliminary image segmentation network is completed; and
a target network acquisition module, configured to output a target image segmentation network when training of the preliminary image segmentation network is completed.
In a third aspect, the present application further provides a computer device comprising a processor, a memory, and a computer program stored in the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the segmentation network training method described above.
In a fourth aspect, the present application further provides a storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the segmentation network training method described above.
In the present application, first pictures carrying semantic segmentation labels are acquired to form a first picture training set, and second pictures without semantic segmentation labels are acquired to form a second picture training set. Supervised training is performed on the first segmentation network according to the first picture training set to obtain a preliminary image segmentation network. Features of the pictures in the two training sets are extracted with the preliminary image segmentation network, yielding a first picture feature set corresponding to the first picture training set and a second picture feature set corresponding to the second picture training set. The second segmentation network is trained according to these two feature sets to obtain an image source identification network. The preliminary image segmentation network then performs unsupervised training according to the second picture training set, outputting segmentation result maps during this training. Each segmentation result map is verified with the image source identification network; when verification passes, training of the preliminary image segmentation network is completed and the target image segmentation network is output. As a result, the target image segmentation network achieves a good segmentation effect on pictures taken in daily life.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings described below show some embodiments of the present application; a person skilled in the art may derive other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of the steps of a segmentation network training method according to an embodiment of the present application;
FIG. 2 is a flowchart corresponding to one embodiment of step S11 in FIG. 1;
FIG. 3 is a flowchart corresponding to one embodiment of step S113 in FIG. 2;
FIG. 4 is a flowchart of steps corresponding to one embodiment of step S13 in FIG. 1;
FIG. 5 is a schematic block diagram of a segmentation network training apparatus according to an embodiment of the present application;
fig. 6 is a schematic block diagram of a computer device according to an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The flow diagrams depicted in the figures are merely illustrative and not necessarily all of the elements and operations/steps are included or performed in the order described. For example, some operations/steps may be further divided, combined, or partially combined, so that the order of actual execution may be changed according to actual situations. In addition, although the division of the functional modules is performed in the apparatus schematic, in some cases, the division of the modules may be different from that in the apparatus schematic.
The embodiments of the present application provide a segmentation network training method, application method, apparatus, device, and storage medium. The segmentation network training method can be applied to a terminal device or a server; the terminal device may be an electronic device such as a mobile phone, tablet computer, notebook computer, desktop computer, personal digital assistant, or wearable device, and the server may be a single server or a server cluster composed of multiple servers. The following explanation takes the application of the segmentation network training method to a server as an example.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.
Referring to fig. 1, fig. 1 is a schematic flow chart of the steps of a segmentation network training method according to an embodiment of the application.
As shown in fig. 1, the segmentation network training method includes steps S10 to S16.
Step S10, a first picture training set and a second picture training set are obtained, wherein a first picture in the first picture training set is provided with a semantic segmentation label, a second picture in the second picture training set is not provided with the semantic segmentation label, and the main element category of the first picture is the same as the main element category in the second picture.
In some embodiments, the pictures in the first picture training set are publicly available online pictures dedicated to image semantic segmentation network training, and each picture in the first picture training set carries a corresponding pixel-level semantic segmentation label. The pictures in the second picture training set are taken in real life and carry no semantic segmentation labels.
The main element categories of the pictures in the first and second picture training sets are the same. For example, if the pictures in the first picture training set are all portrait pictures, the pictures in the second picture training set must have the same main element category; that is, they must also be portrait pictures.
In some embodiments, the first training set of pictures needs to include more than 30000 portrait pictures with pixel level semantic segmentation labels, and the second training set of pictures needs to include more than 2000 portrait pictures from life shots.
It is assumed here that the first and second picture training sets are of the portrait category. Because the pictures in the first picture training set are dedicated to training image semantic segmentation networks, the differences in color and illumination between the portrait and the background in those pictures are typically pronounced. The pictures in the second picture training set, being taken in daily life, may present various complications, such as the portrait and the background being similar in color.
And step S11, training a preset first segmentation network according to the first picture training set to obtain a preliminary image segmentation network.
It can be understood that, because the pictures of the first picture training set all carry corresponding pixel-level semantic segmentation labels, iterative training of the first segmentation network with the first picture training set is a supervised learning process.
As shown in fig. 2, in some embodiments, step S11 includes: step S110 to step S115.
Step S110, sequentially acquiring first pictures in the first picture training set and inputting the first pictures into a preset first segmentation network;
step S111, extracting background features of the first picture by using a first convolution layer of the first segmentation network to obtain a first score map, wherein the first score map is provided with scores of background categories corresponding to pixel points of the first picture;
step S112, extracting main element category features of the first picture by using a second convolution layer of the first segmentation network to obtain a second score map, wherein the first score map and the second score map have the same size, and the second score map carries the score of the main element category corresponding to each pixel point of the first picture;
step S113, setting training weight values corresponding to all pixel points in the first picture according to the first score diagram and the second score diagram to obtain training weight information corresponding to the first picture;
step S114, training the first segmentation network according to the first picture and training weight information corresponding to the first picture;
and step S115, when the number of training passes the first segmentation network has performed over the first picture training set reaches a preset value, training of the first segmentation network is completed and the preliminary image segmentation network is output.
In this embodiment, the first segmentation network extracts the background features of the first picture through the first convolution layer to obtain the first score map, and extracts the main element features of the first picture through the second convolution layer to obtain the second score map.
The higher a pixel's score in the first score map, the more likely the corresponding pixel in the first picture is background. Correspondingly, the higher a pixel's score in the second score map, the more likely the corresponding pixel in the first picture belongs to the main element.
It can be understood that, in the first and second score maps, a high score indicates that the first segmentation network recognizes that region well. If some pixels of the first picture score very low in both the first score map and the second score map, the first segmentation network currently recognizes those pixels poorly: it cannot determine whether they belong to the background or to the main element of the first picture. The training weight values of those pixels therefore need to be increased, so that subsequent training of the first segmentation network concentrates on the poorly recognized pixels.
In this embodiment, the pixels of the first picture that the first segmentation network recognizes poorly are first identified through the convolution layers, and the training weight values of the first picture's pixels are then adjusted accordingly to obtain the training weight information corresponding to the first picture. The first segmentation network then trains on the first picture together with this training weight information, which improves the training effect.
In some embodiments, the first segmentation network is a segmentation network built on the MobileNetV2 network structure; this reduces model size and increases training speed while maintaining model performance.
During training, the first segmentation network computes the segmentation loss from its loss function and optimizes its parameters through back propagation. In some embodiments, the number of times the first segmentation network is trained over the first picture training set can be controlled through its epoch parameter; when the number of training rounds reaches the set epoch value, training of the first segmentation network is completed. At this point its parameters are locked, yielding the preliminary image segmentation network.
In some embodiments, the preset value may be set to 300, that is, the epoch parameter of the first segmentation network is set to 300; when the first segmentation network has completed 300 rounds of training over the first picture training set, training is finished. At that point, the first segmentation network has good semantic segmentation capability for the pictures in the first picture training set.
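The supervised loop of steps S110 to S115 can be sketched as follows. All class and function names here are assumptions, and the toy network merely counts updates; a real implementation would compute the score maps, per-pixel training weights, and segmentation loss at each step:

```python
# Toy sketch of the supervised loop in steps S110-S115 (all names assumed).
# The network trains on the labelled first picture training set for a fixed
# epoch budget (the patent suggests 300; 3 is used here so the example runs
# quickly), then its parameters are locked.

class ToyFirstSegmentationNetwork:
    def __init__(self):
        self.updates = 0      # stands in for learnable parameters
        self.frozen = False

    def train_step(self, picture, label):
        # A real network would compute the two score maps, derive per-pixel
        # training weights (step S113), evaluate the weighted segmentation
        # loss, and back-propagate; the toy only records that a step ran.
        self.updates += 1

    def freeze(self):
        # Locking the parameters yields the preliminary image segmentation network.
        self.frozen = True


def train_first_network(network, training_set, epochs=3):
    for _ in range(epochs):                  # epoch parameter, e.g. 300
        for picture, label in training_set:  # labelled first training set
            network.train_step(picture, label)
    network.freeze()
    return network


net = train_first_network(ToyFirstSegmentationNetwork(),
                          [("img1", "lbl1"), ("img2", "lbl2")])
# 3 epochs x 2 pictures = 6 update steps; the network is then frozen
```

The epoch budget is the only stopping criterion described for this stage; no validation-based early stopping is mentioned in the text.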
As shown in fig. 3, in some embodiments, step S113 includes: step S1130 to step S1133.
step S1130, obtaining, through a preset function, the higher of the two scores each pixel has in the first score map and the second score map, and merging these highest scores into a preset score map to obtain a segmentation score map corresponding to the first picture;
step S1131, obtaining initial training weight information corresponding to the first picture according to the segmentation score map;
step S1132, identifying pixels whose scores in the segmentation score map are lower than a preset score value, to obtain a non-ideal score pixel set;
step S1133, increasing the training weight values corresponding to the non-ideal score pixel set in the initial training weight information, to obtain the training weight information corresponding to the first picture.
In some embodiments, merging the first score map and the second score map starts by creating a picture of the same size as the two score maps, i.e. the preset score map, to record the merging result. The pixels of the first and second score maps are traversed in turn; for each pixel, the first score map records a first score and the second score map a second score, the larger of the two is obtained through a preset function, and that maximum is filled into the corresponding pixel of the new picture. When the traversal is complete, the new picture records, for every pixel, the maximum of its scores in the first and second score maps, and is then the segmentation score map of the first picture. In some embodiments, the preset function may be a Max(a, b) function, which returns the maximum of a and b.
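Assuming the score maps are stored as 2-D arrays, the per-pixel merge can be sketched with numpy's `maximum`, which plays the role of the Max(a, b) function (the function and variable names are illustrative, not from the patent):

```python
import numpy as np

# Sketch of the merge in step S1130: for every pixel, keep the larger of the
# background score (first score map) and the main-element score (second
# score map).

def merge_score_maps(first_score_map, second_score_map):
    # The two score maps must have the same size, as the patent requires.
    assert first_score_map.shape == second_score_map.shape
    return np.maximum(first_score_map, second_score_map)

background_scores   = np.array([[0.9, 0.2],
                                [0.8, 0.1]])
main_element_scores = np.array([[0.1, 0.3],
                                [0.4, 0.7]])
segmentation_score_map = merge_score_maps(background_scores, main_element_scores)
# per-pixel maximum: [[0.9, 0.3], [0.8, 0.7]]
```

Because `np.maximum` is element-wise, no explicit pixel traversal loop is needed; the result is the segmentation score map of the picture.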
It can be understood that the segmentation score map reflects not only how well the first segmentation network recognizes each pixel of the first picture but also the training weight value of each pixel: the higher a pixel's score, the higher its training weight value. The initial training weight information corresponding to the first picture can therefore be obtained from the segmentation score map.
In some embodiments, the preset score value may be set to 0.5, the scores in the segmentation score map ranging from 0 to 1. In the segmentation score map, a pixel scoring at least 0.5 indicates that the first segmentation network recognizes the corresponding pixel of the first picture well. Correspondingly, a pixel scoring below 0.5 indicates poor recognition, and the set formed by such poorly recognized pixels is the non-ideal score pixel set. Increasing the training weight values of the non-ideal score pixel set in the initial training weight information yields the training weight information of the first picture.
In this embodiment, by increasing the weight values of the pixels that the first segmentation network recognizes poorly, subsequent training on the first picture reinforces learning on exactly those pixels. This improves both the training efficiency and the training effect of the first segmentation network.
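Under the 0.5 threshold described above, steps S1131 to S1133 could be sketched as below. The boost factor is an assumption: the patent only states that the weights of low-score pixels are increased, not by how much.

```python
import numpy as np

# Hedged sketch of steps S1131-S1133. Initial weights follow the
# segmentation score map; pixels scoring below the preset value (0.5) form
# the non-ideal score pixel set and get their weight increased. The factor
# 2.0 is an assumption, not specified by the patent.

def training_weight_info(seg_score_map, preset_score=0.5, boost=2.0):
    weights = seg_score_map.copy()            # initial training weight info
    non_ideal = seg_score_map < preset_score  # non-ideal score pixel set
    weights[non_ideal] *= boost               # strengthen later training there
    return weights

scores = np.array([[0.9, 0.2],
                   [0.6, 0.4]])
weights = training_weight_info(scores)
# pixels below 0.5 get doubled weight: [[0.9, 0.4], [0.6, 0.8]]
```

The per-pixel weights would then multiply the per-pixel segmentation loss during training, so low-recognition regions contribute more to the gradient.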
And S12, extracting the characteristics of the pictures in the first picture training set and the second picture training set by utilizing the preliminary image segmentation network so as to obtain a first picture characteristic set corresponding to the first picture training set and a second picture characteristic set corresponding to the second picture training set.
The first picture training set is input into the preliminary image segmentation network, which extracts the features of each picture; the resulting feature pictures form a set, namely the first picture feature set.
It can be understood that, because the pictures in the first picture training set are pictures specially used for training the image semantic segmentation network, in the first picture training set, the difference between the main element of the picture and the color and the illumination of the background is obvious, and the preliminary image segmentation network is obtained by performing supervised learning training according to the pictures in the first picture training set, so that the preliminary image segmentation network has a good picture segmentation effect on the pictures in the first picture training set.
The second picture training set is likewise input into the preliminary image segmentation network, which processes each of its pictures; the resulting feature pictures form a set, namely the second picture feature set.
It can be understood that the pictures in the second picture training set come from daily-life shooting and are constrained by the shooting device, environment, and target, so the main element and the background may be similar in color and illumination. Moreover, the preliminary image segmentation network was not trained on the second picture training set, so it segments those pictures less well, below the level it achieves on the first picture training set.
It will be appreciated that the preliminary image segmentation network recognizes the pictures of the first picture training set better than those of the second. Since the first picture feature set is produced by feeding the first picture training set through the preliminary image segmentation network, and the second picture feature set by feeding the second picture training set through it, the segmentation quality reflected in the first picture feature set is better than that in the second.
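Step S12 amounts to running both training sets through the frozen preliminary network. A minimal sketch, with all names assumed and a string transformation standing in for the real feature extraction:

```python
# Minimal sketch of step S12 (all names assumed): the frozen preliminary
# image segmentation network maps every picture in a training set to its
# feature picture; the collected outputs form the picture feature set.

def extract_feature_set(frozen_network, picture_training_set):
    return [frozen_network(picture) for picture in picture_training_set]

# Stand-in for the frozen network: any picture -> feature transformation.
toy_frozen_network = lambda picture: "features_of_" + picture

first_feature_set  = extract_feature_set(toy_frozen_network, ["p1", "p2"])
second_feature_set = extract_feature_set(toy_frozen_network, ["q1"])
# first_feature_set == ["features_of_p1", "features_of_p2"]
```

No parameters are updated here; the same frozen network processes both sets, so any quality gap between the two feature sets reflects the domain gap described above.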
And step S13, training a preset second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network.
It can be understood that, because the preliminary image segmentation network segments the two training sets with different capability, the picture features in the first picture feature set differ from those in the second, the segmentation quality of the pictures in the first picture feature set being the higher. An image source identification network trained on the two feature sets can therefore recognize the features of an input picture and decide whether its segmentation quality corresponds to the first picture feature set or to the second.
As shown in fig. 4, in some embodiments, step S13 includes: step S130 to step S134.
Step S130, setting a first label for the pictures in the first picture feature set and setting a second label for the pictures in the second picture feature set;
step S131, performing iterative training on a preset second segmentation network according to the first picture feature set, and acquiring a first output picture output in the training process of the second segmentation network;
Step S132, performing iterative training on the second segmentation network according to the second picture feature set, and acquiring a second output picture output in the training process of the second segmentation network;
step S133, according to the first output picture and the second output picture, whether the second segmentation network is trained or not is evaluated;
and step S134, outputting an image source identification network when the second segmentation network training is completed.
In some embodiments, a first label is set for the pictures in the first picture feature set and a second label is set for the pictures in the second picture feature set, and the second segmentation network is iteratively trained on the first picture feature set and the second picture feature set, so that the training constitutes a supervised learning process.
By setting different labels for the different picture feature sets, the second segmentation network can distinguish input pictures according to whether they carry the first label or the second label, which helps it, during training, to identify the distinguishing picture features of the first picture feature set and the second picture feature set.
It will be appreciated that during training the second segmentation network calculates the segmentation loss according to a second loss function and optimizes its parameters by back propagation. As training proceeds, the second segmentation network becomes increasingly able to identify the distinguishing features of the pictures in the first picture feature set and the second picture feature set. During training, the learning progress of the second segmentation network can be inferred from the first output pictures obtained from the first picture feature set and the second output pictures obtained from the second picture feature set.
When the training of the second segmentation network is completed, its parameters are locked to obtain the image source identification network. The image source identification network can identify the picture segmentation features corresponding to an input picture, and thereby judge whether the picture segmentation features of the input picture correspond to the first picture feature set or the second picture feature set.
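As a concrete illustration of this supervised stage, the sketch below trains a minimal binary discriminator with label 1 for the first picture feature set and label 0 for the second. It is a stand-in, not the patent's actual second segmentation network: it assumes each picture feature is summarized by a single scalar "segmentation quality" value and uses logistic regression in place of a convolutional network; the learning rate and feature values are illustrative.

```python
import math

def train_source_discriminator(first_feats, second_feats, lr=0.5, epochs=200):
    """Logistic-regression stand-in for the second segmentation network:
    learns to separate features of the first set (label 1, high segmentation
    quality) from the second set (label 0, lower quality)."""
    data = [(x, 1.0) for x in first_feats] + [(x, 0.0) for x in second_feats]
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted P(first set)
            # gradient of the cross-entropy loss, applied by back propagation
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return lambda x: 1.0 / (1.0 + math.exp(-(w * x + b)))

# hypothetical scalar features: well-segmented pictures score high
predict = train_source_discriminator([0.8, 0.9, 0.85], [0.1, 0.2, 0.15])
```

After training, `predict` maps a feature value to the probability that it came from the first picture feature set, which is the role the frozen image source identification network plays in the later steps.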
Further, the second segmentation network sets a feature tag for each output picture according to the picture segmentation features corresponding to the input picture, and step S134 includes: when the feature tags corresponding to the first output pictures are all first feature tags and the feature tags corresponding to the second output pictures are all second feature tags, the training of the second segmentation network is completed.
It can be understood that when the feature tags corresponding to the first output pictures are all first feature tags and the feature tags corresponding to the second output pictures are all second feature tags, the second segmentation network has learned, through training, to recognize whether the picture segmentation features of an input picture correspond to a picture in the first picture feature set or a picture in the second picture feature set; at this point, the training of the second segmentation network is completed.
It will be appreciated that the input pictures corresponding to the first output pictures come from the first picture feature set, and the input pictures corresponding to the second output pictures come from the second picture feature set. Because the picture segmentation capability of the preliminary image segmentation network on the pictures of the first picture training set is undoubtedly higher than on those of the second picture training set, that is, the picture segmentation quality in the first picture feature set is higher than that in the second picture feature set, the second segmentation network can, during training, judge the segmentation quality of an input picture by identifying its picture segmentation features and then set a feature tag for the corresponding output picture.
If an input picture reaches the picture segmentation quality corresponding to the pictures in the first picture feature set, the second segmentation network sets the feature tag of the output picture to the first feature tag; otherwise it sets the feature tag of the output picture to the second feature tag. Once it does so reliably, the training of the second segmentation network is completed.
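The completion criterion just described reduces to a simple predicate over the feature tags of the two groups of output pictures; the tag values below are placeholders for whatever tags the second segmentation network actually emits.

```python
def discriminator_training_complete(first_output_tags, second_output_tags,
                                    first_tag="first", second_tag="second"):
    """Training of the second segmentation network is complete only when every
    first output picture carries the first feature tag and every second output
    picture carries the second feature tag."""
    return (all(t == first_tag for t in first_output_tags)
            and all(t == second_tag for t in second_output_tags))
```

A single mislabeled output in either group means training must continue.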
And S14, performing iterative training on the preliminary image segmentation network according to the second picture training set, and acquiring a segmentation result graph output by the preliminary image segmentation network.
It can be understood that the parameters of the preliminary image segmentation network are unlocked, a corresponding segmentation loss function is set for it, and the pictures in the second picture training set are input into the network so that it can be trained. During training, the segmentation loss is calculated according to the loss function and the parameters of the preliminary image segmentation network are optimized by back propagation. The picture output by the preliminary image segmentation network during this training is the segmentation result map.
Because the pictures in the second picture training set carry no semantic segmentation labels, the iterative training of the preliminary image segmentation network on these pictures is an unsupervised learning process.
And step S15, evaluating, according to the image source identification network and the segmentation result map, whether the training of the preliminary image segmentation network is completed.
It can be appreciated that the image source identification network can identify whether the picture segmentation features corresponding to an input picture correspond to the pictures in the first picture feature set or the pictures in the second picture feature set.
Before the preliminary image segmentation network is trained on the second picture training set, it has different segmentation capabilities on the pictures in the first and second picture training sets; specifically, its picture segmentation effect on the pictures in the first picture training set is better than that on the pictures in the second picture training set.
As the preliminary image segmentation network is trained on the second picture training set, it gradually improves its segmentation capability on the pictures in that set through training and learning.
In some embodiments, when the segmentation capability of the preliminary image segmentation network on the pictures in the second picture training set reaches its capability level on the pictures in the first picture training set, the pictures in the second picture training set are input into the preliminary image segmentation network and the corresponding output segmentation result maps are input into the image source identification network; when the image source identification network identifies the picture segmentation features of the segmentation result maps as corresponding to pictures in the first picture feature set, the training of the preliminary image segmentation network is completed.
Further, step S15 includes: step S150 to step S151.
Step S150, carrying out feature extraction on the segmentation result diagram by utilizing the image source identification network to obtain a segmentation result feature diagram corresponding to the segmentation result diagram;
and step S151, when the label corresponding to the segmentation result feature map is the first feature label, the training of the preliminary image segmentation network is completed.
It can be understood that the image source identification network sets the feature tag of the output image according to the image segmentation feature corresponding to the input image.
The segmentation result map is input into the image source identification network; when the label corresponding to the output segmentation result feature map is the first feature label, the picture segmentation features of the segmentation result map correspond to the first picture feature set. That is, through unsupervised learning on the unlabeled pictures of the second picture training set, the preliminary image segmentation network has reached, on pictures without semantic segmentation labels, the level of segmentation capability that the first segmentation network attained on labeled pictures through supervised learning on the first picture training set. At this point, the training of the preliminary image segmentation network is completed.
And S16, outputting a target image segmentation network when the preliminary image segmentation network training is completed.
When the training of the preliminary image segmentation network is completed, training is stopped and the parameters of the preliminary image segmentation network are locked, yielding the target image segmentation network.
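The stopping logic of steps S14 to S16 can be sketched as the loop below. The training step and the frozen image source identification network are supplied as callables; the toy stand-ins, the scalar "quality" signal, and the 0.6 acceptance threshold are illustrative assumptions used only to make the loop runnable.

```python
def train_until_indistinguishable(train_step, source_id_tag, batch,
                                  first_tag="first", max_epochs=100):
    """Unsupervised fine-tuning loop for the preliminary image segmentation
    network. train_step(picture) performs one optimization step and returns
    the segmentation result map; source_id_tag(result_map) is the frozen
    image source identification network. Training stops once every result
    map is tagged as coming from the first picture feature set."""
    for epoch in range(max_epochs):
        result_maps = [train_step(p) for p in batch]
        if all(source_id_tag(m) == first_tag for m in result_maps):
            return epoch  # preliminary network training is complete
    return None  # budget exhausted without convergence

# toy stand-ins: segmentation "quality" improves each step until the frozen
# discriminator tags the result map as first-set quality
quality = [0.0]
def fake_step(_picture):
    quality[0] += 0.2
    return quality[0]

tag = lambda m: "first" if m >= 0.6 else "second"
epochs_needed = train_until_indistinguishable(fake_step, tag, [None])
```

Returning the epoch at which the discriminator accepts every result map corresponds to the point where the parameters of the preliminary image segmentation network would be locked to produce the target image segmentation network.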
In the application, a first picture with a semantic segmentation label is obtained, a first picture training set is obtained, and a second picture without the semantic segmentation label is obtained, so that a second picture training set is obtained. And performing supervised training on the first segmentation network according to the first picture training set to obtain a preliminary image segmentation network. And extracting the characteristics of the pictures in the first picture training set and the second picture training set by utilizing the preliminary image segmentation network to obtain a first picture characteristic set corresponding to the first picture training set and a second picture characteristic set corresponding to the second picture training set. And training the second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network. And performing unsupervised training on the preliminary image segmentation network according to the second picture training set, and acquiring a segmentation result graph output by the preliminary image segmentation network in the training process. And verifying the segmentation result graph according to the image source identification network, and outputting a target image segmentation network after the preliminary image segmentation network training is completed when the verification passes.
According to the method of the present application, the trained target image segmentation network achieves, on the unlabeled second pictures, a segmentation effect comparable to that achieved on the labeled first pictures.
The embodiment of the application also provides a method for using the segmentation network, comprising steps S20 to S21.
Step S20, obtaining a picture to be processed;
and S21, performing image segmentation processing on the picture to be processed by using an image segmentation network to obtain a target result picture corresponding to the picture to be processed, wherein the image segmentation network is trained by the segmentation network training method.
It can be understood that the image segmentation network trained by the segmentation network training method of the present application can achieve a very good image segmentation effect even when the input picture is unlabeled.
Referring to fig. 5, fig. 5 is a schematic block diagram of a split network training apparatus according to an embodiment of the present application.
As shown in fig. 5, the split network training apparatus 201 includes:
the first training picture acquisition module 2011: configured to acquire a first picture training set and a second picture training set, wherein a first picture in the first picture training set is provided with a semantic segmentation label, a second picture in the second picture training set is not provided with a semantic segmentation label, and the main element category of the first picture is the same as that of the second picture;
the first network training module 2012: configured to train a preset first segmentation network according to the first picture training set to obtain a preliminary image segmentation network;
the second training picture acquisition module 2013: configured to extract features of the pictures in the first picture training set and the second picture training set by utilizing the preliminary image segmentation network, so as to obtain a first picture feature set corresponding to the first picture training set and a second picture feature set corresponding to the second picture training set;
the second network training module 2014: configured to train a preset second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network;
the third network training module 2015: configured to perform iterative training on the preliminary image segmentation network according to the second picture training set and to acquire a segmentation result map output by the preliminary image segmentation network;
the target network verification module 2016: configured to evaluate, according to the image source identification network and the segmentation result map, whether the training of the preliminary image segmentation network is completed;
the target network acquisition module 2017: configured to output a target image segmentation network when the training of the preliminary image segmentation network is completed.
In some embodiments, when training the preset first segmentation network according to the first picture training set, the first network training module 2012 includes:
sequentially acquiring first pictures in the first picture training set and inputting the first pictures into a preset first segmentation network;
extracting background features of the first picture by using a first convolution layer of the first segmentation network to obtain a first score map, wherein the first score map is provided with scores of background categories corresponding to pixel points of the first picture;
extracting main element category characteristics of the first picture by using a second convolution layer of the first segmentation network to obtain a second score map, wherein the first score map and the second score map have the same size, and the second score map is provided with scores of main element categories corresponding to pixel points of the first picture;
setting training weight values corresponding to all pixel points in the first picture according to the first score diagram and the second score diagram to obtain training weight information corresponding to the first picture;
training the first segmentation network according to the first picture and training weight information corresponding to the first picture;
when the number of times the first segmentation network has been trained on the first picture training set reaches a preset value, the training of the first segmentation network is completed and the preliminary image segmentation network is output.
In some embodiments, when setting the training weight values corresponding to each pixel point in the first picture according to the first score map and the second score map to obtain the training weight information corresponding to the first picture, the first network training module 2012 includes:
obtaining the pixel point with the highest score in the pixel points corresponding to the first score graph and the second score graph according to a preset function, and merging the pixel point with the highest score into the preset score graph to obtain a segmentation score graph corresponding to the first picture;
obtaining initial training weight information corresponding to the first picture according to the segmentation score graph;
identifying pixels with scores lower than a preset score value in the segmentation score map to obtain a non-ideal score pixel set;
and increasing the training weight values corresponding to the non-ideal score pixel set in the initial training weight information to obtain the training weight information corresponding to the first picture.
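The per-pixel weighting scheme these steps describe can be sketched as follows. The 0.5 score threshold and the 2x boost factor are illustrative assumptions; the patent only specifies "a preset score value" and that the weight is increased.

```python
def build_training_weights(bg_scores, subject_scores,
                           preset_score=0.5, boost=2.0):
    """Merge the background and subject-class score maps by keeping, for each
    pixel, the higher of the two scores, then raise the training weight of the
    pixels whose best score falls below the preset score value (the non-ideal
    score pixel set), so training focuses on the hard pixels."""
    segmentation_scores = [max(b, s) for b, s in zip(bg_scores, subject_scores)]
    weights = [1.0] * len(segmentation_scores)      # initial training weights
    for i, score in enumerate(segmentation_scores):
        if score < preset_score:                    # non-ideal score pixel
            weights[i] *= boost
    return segmentation_scores, weights

scores, weights = build_training_weights([0.9, 0.2, 0.6], [0.3, 0.4, 0.1])
```

Here the score maps are flattened to lists of per-pixel scores; in the patent they are two-dimensional maps of the same size as the first picture, but the per-pixel logic is the same.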
In some embodiments, the second network training module 2014, when training the preset second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network, includes:
Setting a first label for the pictures in the first picture feature set and setting a second label for the pictures in the second picture feature set;
performing iterative training on a preset second segmentation network according to the first picture feature set, and acquiring a first output picture output in the training process of the second segmentation network;
performing iterative training on the second segmentation network according to the second picture feature set, and acquiring a second output picture output in the second segmentation network training process;
evaluating whether the second segmentation network is trained according to the first output picture and the second output picture;
and outputting an image source identification network when the second segmentation network training is completed.
In some embodiments, the second segmentation network sets a feature tag of an output picture according to a picture segmentation feature corresponding to an input picture, and the second network training module 2014 when evaluating whether the second segmentation network is trained according to the first output picture and the second output picture, includes:
and when the feature labels corresponding to the first output pictures are all first feature labels and the feature labels corresponding to the second output pictures are all second feature labels, the second segmentation network training is completed.
In some implementations, the target network verification module 2016, when evaluating whether the preliminary image segmentation network is trained based on the image source recognition network and the segmentation result map, includes:
extracting features of the segmentation result graph by using the image source identification network to obtain a segmentation result feature graph corresponding to the segmentation result graph;
and when the label corresponding to the segmentation result feature map is the first feature label, the training of the preliminary image segmentation network is completed.
It should be noted that, for convenience and brevity of description, specific working processes of the above-described apparatus and each module and unit may refer to corresponding processes in the foregoing embodiment of the split network training method, which are not described herein again.
The apparatus provided by the above embodiments may be implemented in the form of a computer program which may be run on a computer device as shown in fig. 6.
Referring to fig. 6, fig. 6 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device includes, but is not limited to, a server.
As shown in fig. 6, the computer device 301 includes a processor 3011, a memory, and a network interface connected via a system bus, wherein the memory may include a storage medium 3012 and an internal memory 3015, and the storage medium 3012 may be non-volatile or volatile.
The storage medium 3012 may store an operating system and computer programs. The computer program includes program instructions that, when executed, cause the processor 3011 to perform any of the split network training methods.
The processor 3011 is used to provide computing and control capabilities to support the operation of the overall computer device.
The internal memory 3015 provides an environment for the execution of a computer program in the storage medium 3012 that, when executed by the processor 3011, causes the processor 3011 to perform any of a variety of split network training methods.
The network interface is used for network communication such as transmitting assigned tasks and the like. It will be appreciated by those skilled in the art that the structure shown in FIG. 6 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
It is to be appreciated that the processor 3011 can be a central processing unit (Central Processing Unit, CPU), and that the processor 3011 can also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. Wherein the general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
Wherein in some embodiments the processor 3011 is configured to run a computer program stored in a memory to implement the steps of:
acquiring a first picture training set and a second picture training set, wherein a first picture in the first picture training set is provided with a semantic segmentation tag, a second picture in the second picture training set is not provided with a semantic segmentation tag, and the main element category of the first picture is the same as the main element category in the second picture;
training a preset first segmentation network according to the first picture training set to obtain a preliminary image segmentation network;
Extracting the characteristics of the pictures in the first picture training set and the second picture training set by utilizing the preliminary image segmentation network so as to obtain a first picture characteristic set corresponding to the first picture training set and a second picture characteristic set corresponding to the second picture training set;
training a preset second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network;
performing iterative training on the preliminary image segmentation network according to the second picture training set, and acquiring a segmentation result graph output by the preliminary image segmentation network;
evaluating, according to the image source identification network and the segmentation result map, whether the training of the preliminary image segmentation network is completed;
and outputting a target image segmentation network when the preliminary image segmentation network training is completed.
In some embodiments, the processor 3011 is configured to, when training a preset first segmentation network according to the first picture training set to obtain a preliminary image segmentation network, implement:
sequentially acquiring first pictures in the first picture training set and inputting the first pictures into a preset first segmentation network;
Extracting background features of the first picture by using a first convolution layer of the first segmentation network to obtain a first score map, wherein the first score map is provided with scores of background categories corresponding to pixel points of the first picture;
extracting main element category characteristics of the first picture by using a second convolution layer of the first segmentation network to obtain a second score map, wherein the first score map and the second score map have the same size, and the second score map is provided with scores of main element categories corresponding to pixel points of the first picture;
setting training weight values corresponding to all pixel points in the first picture according to the first score diagram and the second score diagram to obtain training weight information corresponding to the first picture;
training the first segmentation network according to the first picture and training weight information corresponding to the first picture;
when the number of times the first segmentation network has been trained on the first picture training set reaches a preset value, the training of the first segmentation network is completed and the preliminary image segmentation network is output.
In some embodiments, the processor 3011 is configured to, when setting training weight values corresponding to respective pixels in the first picture according to the first score map and the second score map, obtain training weight information corresponding to the first picture, implement:
Obtaining the pixel point with the highest score in the pixel points corresponding to the first score graph and the second score graph according to a preset function, and merging the pixel point with the highest score into the preset score graph to obtain a segmentation score graph corresponding to the first picture;
obtaining initial training weight information corresponding to the first picture according to the segmentation score graph;
identifying pixels with scores lower than a preset score value in the segmentation score map to obtain a non-ideal score pixel set;
and increasing the training weight values corresponding to the non-ideal score pixel set in the initial training weight information to obtain the training weight information corresponding to the first picture.
In some embodiments, the processor 3011 is configured to implement, when training a preset second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network:
setting a first label for the pictures in the first picture feature set and setting a second label for the pictures in the second picture feature set;
performing iterative training on a preset second segmentation network according to the first picture feature set, and acquiring a first output picture output in the training process of the second segmentation network;
Performing iterative training on the second segmentation network according to the second picture feature set, and acquiring a second output picture output in the second segmentation network training process;
evaluating whether the second segmentation network is trained according to the first output picture and the second output picture;
and outputting an image source identification network when the second segmentation network training is completed.
In some embodiments, the second partition network sets a feature tag of an output picture according to a picture partition feature corresponding to an input picture, and the processor 3011 is configured to, when evaluating whether the second partition network is trained according to the first output picture and the second output picture, implement:
and when the feature labels corresponding to the first output pictures are all first feature labels and the feature labels corresponding to the second output pictures are all second feature labels, the second segmentation network training is completed.
In some embodiments, when evaluating, according to the image source identification network and the segmentation result map, whether the training of the preliminary image segmentation network is completed, the processor 3011 is configured to implement:
Extracting features of the segmentation result graph by using the image source identification network to obtain a segmentation result feature graph corresponding to the segmentation result graph;
and when the label corresponding to the segmentation result feature map is the first feature label, the training of the preliminary image segmentation network is completed.
It should be noted that, for convenience and brevity of description, the specific working process of the computer device described above may refer to the corresponding process in the foregoing embodiment of the split network training method, which is not described herein again.
The embodiment of the application also provides a storage medium, which is a computer readable storage medium, and the computer readable storage medium stores a computer program, wherein the computer program comprises program instructions, and the method implemented by the program instructions when being executed can refer to various embodiments of the split network training method.
The computer readable storage medium may be an internal storage unit of the computer device according to the foregoing embodiment, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like, which are provided on the computer device.
It is to be understood that the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present application are merely for the purpose of description and do not represent the relative merits of the embodiments. While the application has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents substituted without departing from the scope of the application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (9)

1. A method of training a segmentation network, comprising:
acquiring a first picture training set and a second picture training set, wherein a first picture in the first picture training set is provided with a semantic segmentation label, a second picture in the second picture training set is not provided with a semantic segmentation label, and the main element category of the first picture is the same as the main element category of the second picture;
training a preset first segmentation network according to the first picture training set to obtain a preliminary image segmentation network, by: sequentially acquiring first pictures in the first picture training set and inputting them into the preset first segmentation network; extracting background features of the first picture by using a first convolution layer of the first segmentation network to obtain a first score map, wherein the first score map carries scores of the background category corresponding to the pixel points of the first picture; extracting main-element-category features of the first picture by using a second convolution layer of the first segmentation network to obtain a second score map, wherein the first score map and the second score map have the same size, and the second score map carries scores of the main element category corresponding to the pixel points of the first picture; setting training weight values corresponding to the pixel points in the first picture according to the first score map and the second score map to obtain training weight information corresponding to the first picture; training the first segmentation network according to the first picture and the training weight information corresponding to the first picture; and when the number of times the first segmentation network has been trained on the first picture training set reaches a preset value, determining that training of the first segmentation network is complete and outputting the preliminary image segmentation network;
extracting features of the pictures in the first picture training set and the second picture training set by using the preliminary image segmentation network to obtain a first picture feature set corresponding to the first picture training set and a second picture feature set corresponding to the second picture training set;
training a preset second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network;
performing iterative training on the preliminary image segmentation network according to the second picture training set, and acquiring a segmentation result map output by the preliminary image segmentation network;
evaluating whether training of the preliminary image segmentation network is complete according to the image source identification network and the segmentation result map;
and outputting a target image segmentation network when training of the preliminary image segmentation network is complete.
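The overall pipeline of claim 1 (pretrain on labeled pictures, then fine-tune on unlabeled pictures until an image source identification network can no longer tell their outputs apart) can be sketched with a toy, one-dimensional stand-in. The 1-D "features", the threshold discriminator, and the step size are illustrative assumptions, not the patent's actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D stand-ins for per-picture output statistics of the
# preliminary image segmentation network on the two training sets.
source_feats = rng.normal(0.0, 0.1, 100)  # first (labeled) picture set
target_feats = rng.normal(1.0, 0.1, 100)  # second (unlabeled) picture set

# Toy "image source identification network": a threshold midway between
# the two distributions (0 = looks like the first set, 1 = the second).
threshold = (source_feats.mean() + target_feats.mean()) / 2.0

def source_id(feat: float) -> int:
    return int(feat > threshold)

# Adversarial fine-tuning: shift the target-domain output statistic until
# the discriminator attributes it to the labeled set.
shift = 0.0
for _ in range(1000):
    out = target_feats.mean() + shift
    if source_id(out) == 0:      # evaluated as "source": training complete
        break
    shift -= 0.01                # gradient-like step toward the source domain

print(source_id(target_feats.mean() + shift))  # -> 0
```

In the patent the shift is of course replaced by gradient updates to the preliminary image segmentation network, but the stopping condition has the same shape: stop once the discriminator's verdict flips.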
2. The method of claim 1, wherein setting training weight values corresponding to each pixel point in the first picture according to the first score map and the second score map to obtain training weight information corresponding to the first picture includes:
obtaining, according to a preset function, the highest score among the corresponding pixel points of the first score map and the second score map, and merging the highest scores into a preset score map to obtain a segmentation score map corresponding to the first picture;
obtaining initial training weight information corresponding to the first picture according to the segmentation score map;
identifying pixel points in the segmentation score map whose scores are lower than a preset score value, to obtain a non-ideal score pixel set;
and increasing the training weight values corresponding to the non-ideal score pixel set in the initial training weight information, to obtain the training weight information corresponding to the first picture.
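The steps of claim 2 can be illustrated with a small numpy sketch. The elementwise maximum as the "preset function", the uniform initial weights, and the threshold/boost constants are assumptions made here for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
H, W = 4, 4

# Same-sized score maps from the two convolution branches (random values
# stand in for real network outputs).
bg_scores = rng.random((H, W))  # background-category scores (first map)
fg_scores = rng.random((H, W))  # main-element-category scores (second map)

# Per-pixel highest score across the two maps -> segmentation score map.
# (The claim's "preset function" is assumed to be an elementwise max.)
seg_scores = np.maximum(bg_scores, fg_scores)

# Initial training weight information, assumed uniform here.
weights = np.ones((H, W))

# Pixels whose best score falls below a preset value form the
# non-ideal score pixel set; their training weight is increased so the
# network focuses on pixels it segments poorly.
PRESET_SCORE = 0.6  # hypothetical threshold
BOOST = 2.0         # hypothetical boost factor
non_ideal = seg_scores < PRESET_SCORE
weights[non_ideal] *= BOOST
```

The resulting `weights` array is the per-pixel training weight information that claim 1 feeds back into training of the first segmentation network.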
3. The method according to any one of claims 1-2, wherein training a preset second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network comprises:
setting a first label for the pictures in the first picture feature set and setting a second label for the pictures in the second picture feature set;
performing iterative training on a preset second segmentation network according to the first picture feature set, and acquiring a first output picture produced by the second segmentation network during training;
performing iterative training on the second segmentation network according to the second picture feature set, and acquiring a second output picture produced by the second segmentation network during training;
evaluating whether training of the second segmentation network is complete according to the first output picture and the second output picture;
and outputting an image source identification network when training of the second segmentation network is complete.
4. The method according to claim 3, wherein the second segmentation network sets a feature label of an output picture according to the picture segmentation features corresponding to the input picture, and the evaluating whether training of the second segmentation network is complete according to the first output picture and the second output picture comprises:
determining that training of the second segmentation network is complete when the feature labels corresponding to the first output pictures are all the first feature label and the feature labels corresponding to the second output pictures are all the second feature label.
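The completion criterion of claim 4 reduces to an all-labels check. A minimal sketch, assuming label 0 stands for the first feature label and 1 for the second:

```python
def discriminator_trained(first_output_labels, second_output_labels,
                          first_label=0, second_label=1):
    """Training of the second segmentation network is complete when every
    first output picture carries the first feature label and every second
    output picture carries the second feature label."""
    return (all(label == first_label for label in first_output_labels)
            and all(label == second_label for label in second_output_labels))
```

For example, `discriminator_trained([0, 0, 0], [1, 1])` is `True`, while a single misclassified output picture makes the check `False` and training continues.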
5. The method of claim 4, wherein the evaluating whether training of the preliminary image segmentation network is complete according to the image source identification network and the segmentation result map comprises:
extracting features of the segmentation result map by using the image source identification network to obtain a segmentation result feature map corresponding to the segmentation result map;
and determining that training of the preliminary image segmentation network is complete when the label corresponding to the segmentation result feature map is the first feature label.
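Claim 5's stopping test can likewise be sketched: the segmentation result feature map is passed through the source identifier, and training is complete when the predicted label equals the first feature label. The thresholded-mean classifier below is only a stand-in for the real image source identification network:

```python
import numpy as np

FIRST_FEATURE_LABEL = 0   # assumed label for the labeled (first) picture set
SECOND_FEATURE_LABEL = 1  # assumed label for the unlabeled (second) set

def source_id_label(result_feature_map: np.ndarray,
                    threshold: float = 0.5) -> int:
    # Stand-in discriminator: classify by the mean activation of the
    # segmentation result feature map.
    if result_feature_map.mean() <= threshold:
        return FIRST_FEATURE_LABEL
    return SECOND_FEATURE_LABEL

def preliminary_network_trained(result_feature_map: np.ndarray) -> bool:
    # Training is complete when the target-domain segmentation result is
    # attributed to the first (labeled) picture set.
    return source_id_label(result_feature_map) == FIRST_FEATURE_LABEL
```

With these assumed labels, a feature map whose activations still look target-like (label 1) keeps the adversarial fine-tuning of claim 1 running.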
6. A method of using a segmentation network, the method comprising:
acquiring a picture to be processed;
and performing image segmentation processing on the picture to be processed by using an image segmentation network to obtain a target result picture corresponding to the picture to be processed, wherein the image segmentation network is trained by the method of any one of claims 1 to 5.
7. A segmentation network training apparatus, the apparatus comprising:
a first training picture acquisition module, configured to acquire a first picture training set and a second picture training set, wherein a first picture in the first picture training set is provided with a semantic segmentation label, a second picture in the second picture training set is not provided with a semantic segmentation label, and the main element category of the first picture is the same as the main element category of the second picture;
a first network training module, configured to train a preset first segmentation network according to the first picture training set to obtain a preliminary image segmentation network, by: sequentially acquiring first pictures in the first picture training set and inputting them into the preset first segmentation network; extracting background features of the first picture by using a first convolution layer of the first segmentation network to obtain a first score map, wherein the first score map carries scores of the background category corresponding to the pixel points of the first picture; extracting main-element-category features of the first picture by using a second convolution layer of the first segmentation network to obtain a second score map, wherein the first score map and the second score map have the same size, and the second score map carries scores of the main element category corresponding to the pixel points of the first picture; setting training weight values corresponding to the pixel points in the first picture according to the first score map and the second score map to obtain training weight information corresponding to the first picture; training the first segmentation network according to the first picture and the training weight information corresponding to the first picture; and when the number of times the first segmentation network has been trained on the first picture training set reaches a preset value, determining that training of the first segmentation network is complete and outputting the preliminary image segmentation network;
a second training picture acquisition module, configured to extract features of the pictures in the first picture training set and the second picture training set by using the preliminary image segmentation network to obtain a first picture feature set corresponding to the first picture training set and a second picture feature set corresponding to the second picture training set;
a second network training module, configured to train a preset second segmentation network according to the first picture feature set and the second picture feature set to obtain an image source identification network;
a third network training module, configured to perform iterative training on the preliminary image segmentation network according to the second picture training set and acquire a segmentation result map output by the preliminary image segmentation network;
a target network checking module, configured to evaluate whether training of the preliminary image segmentation network is complete according to the image source identification network and the segmentation result map;
and a target network acquisition module, configured to output a target image segmentation network when training of the preliminary image segmentation network is complete.
8. A computer device, comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the segmentation network training method according to any one of claims 1 to 5, or the steps of the segmentation network use method according to claim 6.
9. A computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and the computer program, when executed by a processor, implements the steps of the segmentation network training method according to any one of claims 1 to 5, or the steps of the segmentation network use method according to claim 6.
CN202110991125.5A 2021-08-26 2021-08-26 Split network training method, use method, device, equipment and storage medium Active CN113705666B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110991125.5A CN113705666B (en) 2021-08-26 2021-08-26 Split network training method, use method, device, equipment and storage medium
PCT/CN2022/072183 WO2023024424A1 (en) 2021-08-26 2022-01-14 Segmentation network training method, using method, apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110991125.5A CN113705666B (en) 2021-08-26 2021-08-26 Split network training method, use method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113705666A CN113705666A (en) 2021-11-26
CN113705666B true CN113705666B (en) 2023-10-27

Family ID: 78655485

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110991125.5A Active CN113705666B (en) 2021-08-26 2021-08-26 Split network training method, use method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN113705666B (en)
WO (1) WO2023024424A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705666B (en) * 2021-08-26 2023-10-27 平安科技(深圳)有限公司 Split network training method, use method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111797779A (en) * 2020-07-08 2020-10-20 兰州交通大学 Remote sensing image semantic segmentation method based on regional attention multi-scale feature fusion
CN112613515A (en) * 2020-11-23 2021-04-06 上海眼控科技股份有限公司 Semantic segmentation method and device, computer equipment and storage medium
CN113113119A (en) * 2021-03-23 2021-07-13 中国科学院深圳先进技术研究院 Training method of semantic segmentation network, image processing method and equipment thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210150281A1 (en) * 2019-11-14 2021-05-20 Nec Laboratories America, Inc. Domain adaptation for semantic segmentation via exploiting weak labels
CN113705666B (en) * 2021-08-26 2023-10-27 平安科技(深圳)有限公司 Split network training method, use method, device, equipment and storage medium


Also Published As

Publication number Publication date
CN113705666A (en) 2021-11-26
WO2023024424A1 (en) 2023-03-02

Similar Documents

Publication Publication Date Title
CN110543815B (en) Training method of face recognition model, face recognition method, device, equipment and storage medium
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
CN107871130B (en) Image processing
CN113255694B (en) Training image feature extraction model and method and device for extracting image features
CN108229419B (en) Method and apparatus for clustering images
CN110765860A (en) Tumble determination method, tumble determination device, computer apparatus, and storage medium
CN112381104A (en) Image identification method and device, computer equipment and storage medium
CN110941978B (en) Face clustering method and device for unidentified personnel and storage medium
CN108399401B (en) Method and device for detecting face image
CN112613508A (en) Object identification method, device and equipment
CN111104941B (en) Image direction correction method and device and electronic equipment
CN110135428B (en) Image segmentation processing method and device
CN113705666B (en) Split network training method, use method, device, equipment and storage medium
US11709914B2 (en) Face recognition method, terminal device using the same, and computer readable storage medium
CN111666976A (en) Feature fusion method and device based on attribute information and storage medium
CN113158773B (en) Training method and training device for living body detection model
WO2022001034A1 (en) Target re-identification method, network training method thereof, and related device
CN112884866A (en) Coloring method, device, equipment and storage medium for black and white video
CN111783677A (en) Face recognition method, face recognition device, server and computer readable medium
CN115937596A (en) Target detection method, training method and device of model thereof, and storage medium
CN112487943B (en) Key frame de-duplication method and device and electronic equipment
US11423248B2 (en) Hierarchical sampling for object identification
CN110781345B (en) Video description generation model obtaining method, video description generation method and device
CN115004245A (en) Target detection method, target detection device, electronic equipment and computer storage medium
US20210004568A1 (en) Attribute recognition system, learning server and non-transitory computer-readable recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant