CN111489365B - Training method of neural network, image processing method and device - Google Patents

Training method of neural network, image processing method and device

Info

Publication number
CN111489365B
CN111489365B (application CN202010278429.2A)
Authority
CN
China
Prior art keywords
image
semantic segmentation
network
information
parameter values
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010278429.2A
Other languages
Chinese (zh)
Other versions
CN111489365A (en)
Inventor
周千寓
程光亮
石建萍
马利庄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Lingang Intelligent Technology Co Ltd filed Critical Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority to CN202010278429.2A
Publication of CN111489365A
Application granted
Publication of CN111489365B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10004 Still image; Photographic image
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a training method for a neural network, an image processing method and corresponding devices. The training method includes: performing semantic segmentation processing on a first noise image of a target image by using a student network to obtain a first semantic segmentation image; performing semantic segmentation processing on a second noise image of the target image by using a teacher network to obtain a second semantic segmentation image; determining credibility information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image; updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image and the credibility information; and updating parameter values of the teacher network based on the updated parameter values of the student network. In the embodiments of the disclosure, the first semantic segmentation image, the second semantic segmentation image and the credibility information are used to control the student network and the teacher network to learn specific features of the target image, so that negative migration of the student network and the teacher network in migration learning is avoided.

Description

Training method of neural network, image processing method and device
Technical Field
The disclosure relates to the technical field of image processing, and in particular relates to a training method of a neural network, an image processing method and an image processing device.
Background
Image segmentation refers to the task of assigning a semantic label to each pixel of a given image. In the supervised or semi-supervised training of a semantic segmentation model, a large number of sample images first need to be labeled pixel by pixel, and the semantic segmentation model is then trained on the labeled samples. However, labeling a large number of sample images pixel by pixel takes a great deal of time and cost. To alleviate this, sample data sets are currently often constructed from simulated, synthesized sample images; however, because there is a certain difference between synthesized images and real images, the performance of a semantic segmentation network trained on synthesized images drops significantly when it performs semantic segmentation processing on real images.
Disclosure of Invention
The embodiment of the disclosure at least provides a training method, an image processing method and a device of a neural network.
In a first aspect, an embodiment of the present disclosure provides a training method of a neural network, including: carrying out semantic segmentation processing on a first noise image of the target image by utilizing a student network to obtain a first semantic segmentation image; performing semantic segmentation processing on a second noise image of the target image by using a teacher network to obtain a second semantic segmentation image; determining credibility information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image; updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, and the credibility information; updating the parameter values of the teacher network based on the updated parameter values of the student network.
The first semantic segmentation image, the second semantic segmentation image and the credibility information are used to drive the student network and the teacher network to produce consistent prediction results after the same target image is perturbed in different ways. The student network therefore learns specific features of the target image in the process of migrating based on the target image, that is, it performs migration learning in a specific direction; and because the parameter values of the teacher network are updated from the parameter values of the student network, the teacher network also performs migration learning in that specific direction, which avoids the problem of negative migration.
In a possible embodiment, the method further comprises: performing semantic segmentation processing on a style migration image of a source image by using a student network to obtain a third semantic segmentation image, wherein the style migration image of the source image is an image obtained by migrating the style of the source image to a target domain where the target image is located; the updating the parameter values of the student network based on the first semantically segmented image, the second semantically segmented image, and the credibility information includes: updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image and the labeling information of the source image.
In this way, semantic segmentation processing is performed on the style migration image of the source image by using the student network to obtain a third semantic segmentation image, and then the parameter value updating process of the student network is supervised based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image and the labeling information of the source image, so that the semantic segmentation precision of the student network and the teacher network can be further improved.
In a possible implementation manner, the updating the parameter value of the student network based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image, and the labeling information of the source image includes: determining a consistency loss based on the first semantically segmented image, the second semantically segmented image, and the confidence information; determining the weight of the consistency loss based on the current iteration times; determining semantic segmentation loss based on the third semantic segmentation image and annotation information of the source image; updating parameter values of the student network based on the consistency loss, the weight, and the semantic segmentation loss.
In this way, the weight of the consistency loss is determined according to the current iteration times, the adjustment process of the parameter values of the student network is supervised based on the consistency loss, the determined weight of the consistency loss and the semantic segmentation loss, and the influence of the consistency loss and the semantic segmentation loss on the parameter values of the student network and the teacher network is dynamically adjusted along with the increase of the iteration times of the student network and the teacher network, so that specific features in the target image are learned on the premise of guaranteeing the semantic segmentation precision of the student network and the teacher network.
In a possible implementation manner, the semantic segmentation processing is performed on the second noise image of the target image by using a teacher network to obtain a second semantic segmentation image, which includes: respectively carrying out semantic segmentation processing on a plurality of second noise images of the target image by using a teacher network to obtain a plurality of intermediate semantic segmentation images; the second semantically segmented image is generated based on the plurality of intermediate semantically segmented images.
In this way, semantic segmentation processing is performed on a plurality of second noise images by the teacher network to obtain a plurality of intermediate semantic segmentation images, and the second semantic segmentation image is generated from these intermediate semantic segmentation images. More uncertainty information can thus be extracted from the second noise images, so that the credibility information of each pixel point in the resulting second semantic segmentation image is more discriminative, which further improves the efficiency of optimizing the parameter values of the student network.
In a possible implementation manner, the generating the second semantic segmentation image based on the plurality of intermediate semantic segmentation images includes: sequentially calculating pixel value mean values of pixel points at corresponding positions in a plurality of intermediate semantic segmentation images; and determining the average value of the pixel points at any corresponding position as the pixel value of the pixel point at the corresponding position in the second semantic segmentation image.
In this way, more uncertainty information can be extracted by taking the mean of the pixel values of the pixel points at corresponding positions in the plurality of intermediate semantic segmentation images.
In a possible implementation manner, the determining, based on the second semantically segmented image, the credibility information of each pixel point in the second semantically segmented image includes: determining the information entropy of each pixel point in the second semantic segmentation image based on the pixel value of each pixel point in the second semantic segmentation image; and determining the credibility information of each pixel point in the second semantic segmentation image based on the information entropy of each pixel point in the second semantic segmentation image and a predetermined information entropy threshold.
In this way, the information entropy of each pixel point in the second semantic segmentation image is extracted through the pixel value of each pixel point in the second semantic segmentation image, and then the credibility information of each pixel point in the second semantic segmentation image is determined based on the information entropy.
In a possible implementation manner, the determining the credibility information of each pixel in the second semantically segmented image based on the information entropy of each pixel in the second semantically segmented image and a predetermined information entropy threshold value includes: comparing the information entropy of each pixel point in the second semantic segmentation image with the information entropy threshold; determining credibility information of each pixel point in the second semantic segmentation image based on the comparison result; and if the absolute value of the information entropy of any pixel point in the second semantic segmentation image is larger than the information entropy threshold, the credibility information corresponding to the any pixel point is set as a preset value representing the credibility of the pixel value of the any pixel point, wherein the preset value is larger than 0.
In this way, the consistency loss between the first semantic segmentation image and the second semantic segmentation image considers only the credible pixel points in the second semantic segmentation image, so that when the parameter values of the student network are updated based on the consistency loss, the results of the semantic segmentation processing performed by the student network and the teacher network on the target image under different disturbances are driven to be consistent. Updating the parameter values of the teacher network based on the updated parameter values of the student network keeps the parameter values of the teacher network consistent with those of the student network, so that both the teacher network and the student network learn the specific features of the target image.
In a possible implementation manner, the information entropy threshold value is generated by adopting the following way: and determining the information entropy threshold based on the semantic segmentation type of the teacher network.
In a possible implementation, updating the parameter values of the teacher network based on the updated parameter values of the student network includes: performing exponential moving average processing on parameter values of parameters in the student network to obtain target parameter values; and replacing the parameter value of the corresponding parameter in the teacher network by using the target parameter value.
Therefore, the parameter value of the teacher network is an exponential moving average value based on the parameter value of the student network, so that the teacher network and the student network can be converged more quickly, and the training efficiency of the neural network is improved.
In a second aspect, an embodiment of the present disclosure further provides a training apparatus for a neural network, including: the first processing module is used for carrying out semantic segmentation processing on the first noise image of the target image by utilizing the student network to obtain a first semantic segmentation image; the second processing module is used for carrying out semantic segmentation processing on a second noise image of the target image by utilizing a teacher network to obtain a second semantic segmentation image; determining credibility information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image; a first updating module, configured to update parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, and the credibility information; and the second updating module is used for updating the parameter value of the teacher network based on the updated parameter value of the student network.
In a possible embodiment, the apparatus further comprises: the third processing module is used for carrying out semantic segmentation processing on the style migration image of the source image by utilizing the student network to obtain a third semantic segmentation image, wherein the style migration image of the source image is an image obtained by migrating the style of the source image to a target domain where the target image is located; the first updating module is configured to, when updating the parameter values of the student network based on the first semantically segmented image, the second semantically segmented image, and the credibility information: updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image and the labeling information of the source image.
In a possible implementation manner, the first updating module is configured to, when updating the parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image, and the labeling information of the source image: determining a consistency loss based on the first semantically segmented image, the second semantically segmented image, and the confidence information; determining the weight of the consistency loss based on the current iteration times; determining semantic segmentation loss based on the third semantic segmentation image and annotation information of the source image; updating parameter values of the student network based on the consistency loss, the weight, and the semantic segmentation loss.
In a possible implementation manner, the second processing module is configured to, when performing semantic segmentation processing on the second noise image of the target image by using the teacher network, obtain a second semantic segmentation image: respectively carrying out semantic segmentation processing on a plurality of second noise images of the target image by using a teacher network to obtain a plurality of intermediate semantic segmentation images; the second semantically segmented image is generated based on the plurality of intermediate semantically segmented images.
In a possible implementation manner, the second processing module is configured, when generating the second semantic segmentation image based on the plurality of intermediate semantic segmentation images, to: sequentially calculating pixel value mean values of pixel points at corresponding positions in a plurality of intermediate semantic segmentation images; and determining the average value of the pixel points at any corresponding position as the pixel value of the pixel point at the corresponding position in the second semantic segmentation image.
In a possible implementation manner, the second processing module is configured to, when determining, based on the second semantically segmented image, reliability information of each pixel point in the second semantically segmented image: determining the information entropy of each pixel point in the second semantic segmentation image based on the pixel value of each pixel point in the second semantic segmentation image; and determining the credibility information of each pixel point in the second semantic segmentation image based on the information entropy of each pixel point in the second semantic segmentation image and a predetermined information entropy threshold.
In a possible implementation manner, the second processing module is configured to, when determining the reliability information of each pixel point in the second semantically segmented image based on the information entropy of each pixel point in the second semantically segmented image and a predetermined information entropy threshold value: comparing the information entropy of each pixel point in the second semantic segmentation image with the information entropy threshold; determining credibility information of each pixel point in the second semantic segmentation image based on the comparison result; and if the absolute value of the information entropy of any pixel point in the second semantic segmentation image is larger than the information entropy threshold, the credibility information corresponding to the any pixel point is set as a preset value representing the credibility of the pixel value of the any pixel point, wherein the preset value is larger than 0.
In a possible implementation manner, the second processing module is further configured to generate the information entropy threshold in the following manner: and determining the information entropy threshold based on the semantic segmentation type of the teacher network.
In a possible implementation manner, the second updating module is configured to, when updating the parameter value of the teacher network based on the updated parameter value of the student network: performing exponential moving average processing on parameter values of parameters in the student network to obtain target parameter values; and replacing the parameter value of the corresponding parameter in the teacher network by using the target parameter value.
In a third aspect, an embodiment of the present disclosure further provides an image processing method, including: acquiring an image to be processed; and carrying out semantic segmentation processing on the image to be processed by using the neural network trained by the training method based on the neural network in any one of the first aspect to obtain a semantic segmentation result of the image to be processed.
In a fourth aspect, an embodiment of the present disclosure further provides an image processing apparatus, including: the acquisition module is used for acquiring the image to be processed; the processing module is used for carrying out semantic segmentation processing on the image to be processed by utilizing the neural network trained by the training method based on the neural network in any one of the first aspect to obtain a semantic segmentation result of the image to be processed.
In a fifth aspect, an embodiment of the present disclosure further provides an intelligent travel control method, including: acquiring an image acquired by a running device in the running process; detecting a target object in the image using a neural network trained based on the training method of any one of the first aspects; the running apparatus is controlled based on the detected target object.
In a sixth aspect, an embodiment of the present disclosure further provides an intelligent travel control apparatus, including: the data acquisition module is used for acquiring images acquired by the driving device in the driving process; a detection module for detecting a target object in the image using a neural network trained based on the training method of the neural network of any one of the first aspects; and the control module is used for controlling the running device based on the detected target object.
In a seventh aspect, an optional implementation manner of the disclosure further provides an electronic device including a processor and a memory, where the memory stores machine-readable instructions executable by the processor and the processor is configured to execute the machine-readable instructions stored in the memory; when executed by the processor, the machine-readable instructions perform the steps in the first aspect or any possible implementation manner of the first aspect, or the steps in the possible implementation manner of the third aspect, or the steps in the possible implementation manner of the fifth aspect.
In an eighth aspect, an alternative implementation manner of the present disclosure further provides a computer readable storage medium, where a computer program is stored, the computer program when executed performs the steps in the first aspect, or any possible implementation manner of the first aspect, or performs the steps in the possible implementation manner of the third aspect, or performs the steps in the possible implementation manner of the fifth aspect.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for the embodiments are briefly described below; they are incorporated in and constitute a part of the specification, show embodiments consistent with the present disclosure, and together with the description serve to illustrate the technical solutions of the present disclosure. It is to be understood that the following drawings illustrate only certain embodiments of the present disclosure and are therefore not to be considered limiting of its scope; a person of ordinary skill in the art may derive other related drawings from them without inventive effort.
FIG. 1 illustrates a flow chart of a method of training a neural network provided by an embodiment of the present disclosure;
FIG. 2 illustrates a flowchart of a particular method for determining confidence information for each pixel in a second semantically segmented image provided by embodiments of the present disclosure;
FIG. 3 illustrates a flow chart of another neural network training method provided by embodiments of the present disclosure;
FIG. 4 is a schematic diagram showing a specific example of a training method of a neural network according to an embodiment of the present disclosure;
FIG. 5 shows a flowchart of an image processing method provided by an embodiment of the present disclosure;
FIG. 6 shows a flow chart of an intelligent travel control method provided by an embodiment of the present disclosure;
FIG. 7 illustrates a schematic diagram of a training apparatus for a neural network provided by an embodiment of the present disclosure;
fig. 8 shows a schematic diagram of an image processing apparatus provided by an embodiment of the present disclosure;
fig. 9 shows a schematic diagram of an intelligent travel control apparatus provided in an embodiment of the present disclosure;
fig. 10 shows a schematic diagram of an electronic device provided by an embodiment of the disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only some embodiments of the present disclosure, but not all embodiments. The components of the embodiments of the present disclosure, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the disclosure, as claimed, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be made by those skilled in the art based on the embodiments of this disclosure without making any inventive effort, are intended to be within the scope of this disclosure.
It has been found that labeling sample images to form a labeled data set before training a neural network usually takes a great deal of time and cost. To reduce the labeling time and cost, neural networks are in many cases trained on computer-simulated synthesized images. However, because of a certain domain difference between synthesized images and real images, a neural network trained on synthesized images suffers a performance drop when it performs image processing tasks on real images. To solve this problem, supervised training with additional supervision signals is currently usually performed on an adversarial framework; for example, supervision signals such as depth, style, category constraints and decision boundaries are adopted on the basis of a generative adversarial network to perform migration learning of the neural network. However, in the migration learning process of the neural network, the learned features have great uncertainty, which may lead to the problem of negative migration.
Based on the above study, the disclosure provides a training method and device for a neural network, which controls a teacher network and a student network to generate consistent prediction results for unlabeled target images under different disturbances so as to supervise the migration learning of the student network, and updates the teacher network based on the parameter values of the student network, so that the teacher network and the student network learn specific features of the target images in the migration learning process and the problem of negative migration is avoided.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
For the sake of understanding the present embodiment, first, a detailed description will be given of a neural network training method disclosed in the embodiments of the present disclosure, where an execution subject of the neural network training method provided in the embodiments of the present disclosure is generally a computer device having a certain computing capability, where the computer device includes, for example: a terminal device or server or other processing device; in some possible implementations, the training method of the neural network may be implemented by a processor invoking computer readable instructions stored in a memory.
The following describes a training method of a neural network provided in an embodiment of the present disclosure.
In the embodiment of the present disclosure, before updating the parameter values of the Student Network (Student Network) and the Teacher Network (Teacher Network) based on S101 to S104, the parameter values of the Student Network and the Teacher Network may be initialized first.
By way of example, the teacher network and the student network may be initialized, for example, with a pre-trained semantic segmentation network.
Here, the pre-trained semantic segmentation network is, for example, a neural network trained based on a source image; in the embodiment of the disclosure, the processes S101 to S104 are processes of controlling the pre-trained semantic segmentation network to perform transfer learning from the source domain to the target domain based on the target image, so that performance of the semantic segmentation network in performing semantic segmentation processing on the image of the target domain is not reduced after the transfer learning is performed.
The image of the source domain includes, for example: synthesizing the images; the image of the target domain includes, for example: a real image.
After initializing parameter values of the student network and the teacher network, performing multiple iterations on the student network and the teacher network based on S101-S104, and determining the teacher network or the student network after the multiple iterations as a trained neural network. Here, each time the processes of S101 to S104 are performed, a round of iteration is performed on the student network and the teacher network.
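As an illustration only, the initialization described above can be sketched as follows; PyTorch and the helper name build_student_and_teacher are assumptions made for this sketch and are not prescribed by the disclosure.

```python
# A minimal sketch: both networks start from the same pre-trained semantic
# segmentation weights, and the teacher is only ever updated from the student,
# never by gradient descent.
import copy
import torch.nn as nn

def build_student_and_teacher(pretrained_seg_net: nn.Module):
    student = copy.deepcopy(pretrained_seg_net)   # updated by back-propagation
    teacher = copy.deepcopy(pretrained_seg_net)   # updated by EMA of the student
    for p in teacher.parameters():
        p.requires_grad_(False)                   # the teacher receives no gradients
    return student, teacher
```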
Referring to fig. 1, a flowchart of a neural network training method according to an embodiment of the disclosure is shown, where the method includes:
S101: Carrying out semantic segmentation processing on the first noise image of the target image by using the student network to obtain a first semantic segmentation image.
In a specific implementation, the first noise image may be obtained by injecting random noise into the target image, for example.
Illustratively, the random noise includes, for example, any one of Gaussian noise, white noise and the like, and may be specifically determined according to actual needs.
After random noise is injected into a target image, a first noise image is generated, and then semantic segmentation processing is carried out on the first noise image by utilizing a student network; when the student network performs semantic segmentation processing on the first noise image, a semantic segmentation result of each pixel point in the first noise image can be obtained; then forming a first semantic segmentation image based on the semantic segmentation result of each pixel point in the first noise image; the size of the first semantically segmented image is the same as the size of the first noisy image.
The pixel value of any pixel point a′ in the first semantic segmentation image is the semantic segmentation result of the pixel point a corresponding to that pixel point a′ in the first noise image.
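For example, the noise injection and the student-network forward pass of S101 may be sketched as follows; PyTorch, the Gaussian noise and the noise_std value are illustrative assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

def student_forward_on_noisy_target(student, target_image, noise_std=0.1):
    """target_image: (B, 3, H, W). Returns the first noise image and the
    per-pixel class probabilities (B, C, H, W) forming the first segmentation image."""
    first_noise_image = target_image + noise_std * torch.randn_like(target_image)  # random noise injection
    logits = student(first_noise_image)                  # (B, C, H, W)
    first_seg_image = F.softmax(logits, dim=1)           # semantic segmentation result per pixel
    return first_noise_image, first_seg_image
```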
The training method of the neural network provided by the embodiment of the disclosure further comprises the following steps:
S102: Performing semantic segmentation processing on a second noise image of the target image by using a teacher network to obtain a second semantic segmentation image; and determining the credibility information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image.
In specific implementation, S102 and S101 have no sequential logic relationship; the execution may be synchronous or asynchronous.
The second noise image is generated in a similar manner to the first noise image, and may be obtained by injecting random noise into the target image, for example. Wherein different noise images of the target image are different in noise injected.
In one possible embodiment, there is a single second noise image; in this case, semantic segmentation processing is performed on the second noise image by the teacher network to obtain a semantic segmentation result of each pixel point in the second noise image, and the second semantic segmentation image is then formed based on the semantic segmentation result of each pixel point in the second noise image.
In another possible embodiment, there are multiple second noise images; in this case, semantic segmentation processing is performed on the plurality of second noise images of the target image by the teacher network to obtain an intermediate semantic segmentation image corresponding to each of the second noise images, and a second semantic segmentation image is then generated based on the plurality of intermediate semantic segmentation images.
Here, for example, the pixel value mean may be calculated in turn for the pixel points at corresponding positions in the plurality of intermediate semantic segmentation images, and the pixel value mean at any corresponding position may be determined as the pixel value of the pixel point at the corresponding position in the second semantic segmentation image.
For example, suppose the size of the target image x_t is h×w and there are N second noise images A1, A2, …, AN, each obtained by injecting random noise into the target image x_t. The teacher network f_T performs semantic segmentation processing on each of them, and the intermediate semantic segmentation image of the i-th second noise image can be written as ŷ_T^(i) = f_T(A_i) ∈ R^(h×w×c), where h denotes the height of the target image, w its width, and c the number of semantic segmentation classes of the teacher network.
The second semantic segmentation image ȳ_T then satisfies, for example, the following formula (1):
ȳ_T = (1/N) · Σ_{i=1}^{N} ŷ_T^(i)    (1)
Thus, random noise is injected into the target image multiple times to generate a plurality of second noise images, and the second semantic segmentation image is obtained from the intermediate semantic segmentation images corresponding to these second noise images. More uncertainty information can be extracted from the second noise images in this way, so that the credibility information of each pixel point in the second semantic segmentation image obtained from them is more discriminative, which further improves the efficiency of optimizing the parameter values of the student network.
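A minimal sketch of the teacher-side processing and the averaging of formula (1); PyTorch, the number of noisy copies n_noisy and the noise_std value are assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def teacher_mean_prediction(teacher, target_image, n_noisy=8, noise_std=0.1):
    """Formula (1): average the intermediate segmentation images obtained from
    N second noise images of the same target image."""
    preds = []
    for _ in range(n_noisy):
        second_noise_image = target_image + noise_std * torch.randn_like(target_image)
        preds.append(F.softmax(teacher(second_noise_image), dim=1))  # intermediate segmentation image
    second_seg_image = torch.stack(preds, dim=0).mean(dim=0)         # pixel-wise mean over the N images
    return second_seg_image
```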
After obtaining the second semantic segmentation image, referring to fig. 2, the embodiment of the disclosure further provides a specific method for determining reliability information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image, which includes:
S201: Determining the information entropy of each pixel point in the second semantic segmentation image based on the pixel value of each pixel point in the second semantic segmentation image.
Here, the information entropy ℋ(h, w) of any pixel point (h, w) satisfies, for example, the following formula (2):
ℋ(h, w) = Σ_{c} p̄_T(h, w, c) · log p̄_T(h, w, c)    (2)
where p̄_T(h, w, c) denotes the pixel value of the pixel point (h, w) in the second semantic segmentation image for class c.
S202: Determining the credibility information of each pixel point in the second semantic segmentation image based on the information entropy of each pixel point in the second semantic segmentation image and a predetermined information entropy threshold.
Here, the information entropy threshold may be determined, for example, based on the semantic segmentation class of the teacher network.
The information entropy threshold H satisfies formula (3): it is a function of the hyperparameters a, b and c, of K_max = log C (where C denotes the number of semantic segmentation classes of the teacher network), of the current iteration round number t, and of the maximum number of iteration rounds t_max, so that the threshold changes with the training progress t/t_max.
for example, the information entropy of each pixel point in the second semantic segmentation image may be compared with a predetermined information entropy threshold; and then determining the credibility information of each pixel point in the second semantic segmentation image based on the comparison result.
And if the absolute value of the information entropy of any pixel point in the second semantic segmentation image is larger than the information entropy threshold, the credibility information corresponding to the any pixel point is set as a preset value representing the credibility of the pixel value of the any pixel point, wherein the preset value is larger than 0.
In a specific implementation, as can be seen from the above formula (2), the value of the information entropy is a negative number. For a pixel point in the second semantic segmentation image, the smaller the value of its information entropy, the higher its credibility, that is, the more reliable the classification of the corresponding pixel point of the target image represented by the pixel value of that pixel point in the second semantic segmentation image. When determining the consistency loss between the first semantic segmentation image and the second semantic segmentation image, the pixel points with higher credibility in the second semantic segmentation image are taken into account and their influence on the consistency loss is increased; for pixel points with lower credibility, their influence on the consistency loss can be reduced or even removed.
Furthermore, for example, the preset value indicating that the pixel value is credible may be set to 1, and the preset value indicating that the pixel value is not credible may be set to 0.
For another example, the preset value for which the pixel value is trusted may be set to 1, the preset value for which the pixel value is not trusted may be set to 0.5, and so on.
Specific setting can be carried out according to actual needs.
Further, exemplarily, the credibility information M(h, w) of each pixel point in the second semantic segmentation image satisfies, for example, the following formula (4):
M(h, w) = I(|ℋ(h, w)| > H)    (4)
where H denotes the information entropy threshold and I(·) denotes a 0-1 indicator function: I(·) takes 1 when |ℋ(h, w)| > H, and takes 0 otherwise.
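The per-pixel information entropy of formula (2) and the credibility information of formula (4) may, for example, be computed as in the following sketch; PyTorch is assumed, and the small constant eps is added only for numerical stability.

```python
import torch

def reliability_mask(second_seg_image: torch.Tensor, entropy_threshold: float):
    """second_seg_image: (B, C, H, W) class probabilities from the teacher.
    Formula (2): per-pixel entropy  H = sum_c p * log(p)  (a negative value).
    Formula (4): mask = 1 where |H| exceeds the threshold, 0 otherwise,
    following the comparison described in the text."""
    eps = 1e-8
    entropy = (second_seg_image * (second_seg_image + eps).log()).sum(dim=1)  # (B, H, W), <= 0
    mask = (entropy.abs() > entropy_threshold).float()                         # credibility information
    return entropy, mask
```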
With the S101 and S102 described above in mind, the training method of the neural network provided in the embodiment of the present disclosure further includes:
S103: Updating parameter values of the student network based on the first semantically segmented image, the second semantically segmented image, and the credibility information.
S104: updating the parameter values of the teacher network based on the updated parameter values of the student network.
In a specific implementation, for example, a consistency loss between the first semantically segmented image and the second semantically segmented image may be determined based on the first semantically segmented image, the second semantically segmented image, and the confidence information, and then parameter values of the student network may be updated based on the consistency loss.
In a specific implementation, as can be seen from the above formula (3), H is a time-dependent function. The consistency loss may be, for example, a mean square error between the first semantic segmentation image extracted by the student network and the second semantic segmentation image extracted by the teacher network, restricted to the credible pixel points; the consistency loss L_con satisfies, for example, the following formula (5):
L_con = Σ_{h,w} M(h, w) · ‖σ(f_S(x_t1))(h, w) − σ(f_T(x_t2))(h, w)‖² / Σ_{h,w} M(h, w)    (5)
where f_S denotes the student network, f_T denotes the teacher network, x_t1 denotes the first noise image, x_t2 denotes the second noise image, and σ denotes an activation function, for example a softmax activation function.
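A minimal sketch of the consistency loss of formula (5) as reconstructed above; PyTorch is assumed, and normalizing by the number of credible pixel points is one possible choice made only for this sketch.

```python
import torch

def consistency_loss(first_seg_image: torch.Tensor,
                     second_seg_image: torch.Tensor,
                     mask: torch.Tensor):
    """Per-pixel mean squared error between the student and teacher probability
    maps, counted only at credible pixel points (mask == 1)."""
    sq_err = (first_seg_image - second_seg_image).pow(2).sum(dim=1)  # (B, H, W)
    masked = sq_err * mask
    return masked.sum() / mask.sum().clamp(min=1.0)                   # average over credible pixels
```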
When updating the parameter values of the student network based on the consistency loss, for example, the parameter values of the student network are adjusted in a direction to reduce the consistency loss.
When updating the parameter values of the teacher network based on the updated parameter values of the student network, for example, an exponential moving average process may be performed on the parameter values of the parameters in the student network to obtain target parameter values; and replacing the parameter value of the corresponding parameter in the teacher network by using the target parameter value.
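The exponential moving average update of the teacher network may, for example, look as follows; PyTorch is assumed, and the decay value 0.99 is an illustrative assumption, since the disclosure only specifies that an exponential moving average is used. Because the teacher receives no gradients, this EMA step is its only parameter update.

```python
import torch

@torch.no_grad()
def update_teacher_ema(student, teacher, decay=0.99):
    """Replace each teacher parameter with an exponential moving average of the
    corresponding (already updated) student parameter."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(decay).add_(p_s, alpha=1.0 - decay)
```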
In a specific implementation, as can be seen from the above formula (4) and formula (5), when the semantic segmentation result represented by any pixel point in the second semantic segmentation image is credible, the value of the credibility information corresponding to that pixel point is 1; when it is not credible, the credibility information corresponding to that pixel point is 0. The consistency loss is therefore determined based only on the pixel points of the second semantic segmentation image whose semantic segmentation results are credible, so the consistency loss between the first semantic segmentation image and the second semantic segmentation image considers only the credible pixel points. As a result, when the parameter values of the student network are updated based on the consistency loss, the results of the semantic segmentation processing performed by the student network and the teacher network on the target image under different disturbances tend to be consistent. The parameter values of the teacher network are then updated based on the updated parameter values of the student network, so that the parameter values of the teacher network and of the student network change in consistent directions and both networks learn the specific features of the target image.
In the embodiment of the disclosure, the first noise image and the second noise image are images obtained by different disturbance on the target image; performing semantic segmentation processing on the first noise image by using a student network to obtain a first semantic segmentation image, performing semantic segmentation processing on the second noise image by using a teacher network to obtain a second semantic segmentation image, determining reliability information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image, updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image and the reliability information, and updating parameter values of the teacher network based on the updated parameter values of the student network; according to the process, through the first semantic segmentation image, the second semantic segmentation image and the credibility information, the student network and the teacher network are controlled to generate consistent prediction results after the same target image is disturbed, so that the student network can learn specific characteristics in the target image in the process of migrating based on the target image, namely, the student network migrates and learns towards a specific direction, and because the parameter value of the teacher network is updated according to the parameter value of the student network, the teacher network migrates and learns towards the specific direction, and the problem of negative migration is avoided.
Referring to fig. 3, another method for training a neural network is provided in an embodiment of the present disclosure, including:
S301: Carrying out semantic segmentation processing on the first noise image of the target image by using the student network to obtain a first semantic segmentation image.
S302: performing semantic segmentation processing on a second noise image of the target image by using a teacher network to obtain a second semantic segmentation image; and determining the credibility information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image.
The specific implementation process of S301 to S302 is similar to that of S101 to S102, and will not be repeated here.
S303: and carrying out semantic segmentation processing on the style migration image of the source image by utilizing the student network to obtain a third semantic segmentation image, wherein the style migration image of the source image is an image obtained by migrating the style of the source image to a target domain where the target image is located.
In specific implementation, S303 has no sequential logic relationship with the above S301-S302; the execution may be synchronous or asynchronous.
Specifically, the style migration image of the source image can be obtained, for example, in the following manner:
performing style migration processing on the source image by utilizing a pre-trained style migration network to obtain a style migration image corresponding to the source image; the style migration network is trained by utilizing the source image and the target image.
In particular implementations, the style migration network is, for example, a generative adversarial network (Generative Adversarial Network, GAN), such as CycleGAN. The generative adversarial network can integrate the semantic information of the source domain carried in the source image with the semantic information of the target domain carried in the target image, so as to convert the source image into a style migration image that contains some features of the target image; the student network then performs semantic segmentation processing on the style migration image.
In addition, the style migration image can also be generated by using a style migration network with another architecture, for example a neural network based on VGG or GoogLeNet, which can be selected according to actual needs.
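As an illustration, applying a pre-trained style migration network to a source image can be sketched as follows; style_generator stands for any source-to-target image generator (for example the corresponding generator of a CycleGAN-style model) and is an assumption of this sketch.

```python
import torch

@torch.no_grad()
def stylize_source(style_generator, source_image):
    """Render the source image in the target-domain style; the generator is
    assumed to be trained separately using source and target images."""
    style_migration_image = style_generator(source_image)
    return style_migration_image
```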
With the S302 and S303 described above in mind, the training method of the neural network provided in the embodiment of the present disclosure further includes:
S304: Updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image and the labeling information of the source image.
In a specific embodiment, the parameter values of the student network may be updated, for example, in the following manner: generating a consistency loss of the first semantic segmentation image and the second semantic segmentation image based on the first semantic segmentation image, the second semantic segmentation image, and the confidence information; generating semantic segmentation loss based on the labeling information of the third semantic segmentation image and the source image; parameters of the student network are updated based on the consistency loss and the semantic segmentation loss.
Exemplarily, the semantic segmentation loss L_seg is, for example, a cross entropy loss for optimizing on the source images from the source domain, and satisfies the following formula (6):
L_seg = −(1/(H·W)) · Σ_{h=1}^{H} Σ_{w=1}^{W} Σ_{c=1}^{C} y_s(h, w, c) · log f_S(x̂_s)(h, w, c)    (6)
where H denotes the height of the style migration image, W denotes its width, C denotes the number of channels, y_s denotes the labeling information of the source image, x̂_s denotes the style migration image of the source image, f_S(·) denotes the student network, and f_S(x̂_s) is the third semantic segmentation image.
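A minimal sketch of the semantic segmentation loss of formula (6); PyTorch is assumed, and the source labels are taken as per-pixel class indices, which makes the library cross entropy equivalent to the one-hot form written above.

```python
import torch.nn.functional as F

def segmentation_loss(student, style_migration_image, source_labels):
    """Pixel-wise cross entropy between the student's prediction on the style
    migration image and the source annotation.
    source_labels: (B, H, W) long tensor of class indices."""
    logits = student(style_migration_image)        # (B, C, H, W); softmax gives the third segmentation image
    return F.cross_entropy(logits, source_labels)
```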
When updating the parameter values of the student network based on the semantic segmentation loss and the consistency loss, for example, the weight of the consistency loss may be determined according to the current iteration number, and then the parameter values of the student network may be updated according to the consistency loss, the weight of the consistency loss, and the semantic segmentation loss.
For example, the total loss of the student network is determined from the semantic segmentation loss and the consistency loss, where the total loss L_total satisfies, for example, the following formula (7):
L_total = L_seg + λ_con · L_con    (7)
where L_seg denotes the semantic segmentation loss, L_con denotes the consistency loss, and λ_con denotes the weight of the consistency loss. λ_con is, for example, a dynamic weight set as a rising function that increases with the number of iterations. The dynamic weight balances the semantic segmentation loss against the consistency loss: in the early stage of training the semantic segmentation loss dominates, and in the later stage the influence of the consistency loss gradually increases, so as to stably control the convergence of the parameter values of the neural network.
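The rising weight λ_con can, for example, be scheduled as in the following sketch; the exponential ramp-up used here is only one possible rising function and is an assumption, since the exact schedule is not reproduced in this text.

```python
import math

def consistency_weight(step, max_steps, max_weight=1.0):
    """A rising function of the iteration count for lambda_con: starts near 0
    and approaches max_weight as training progresses (illustrative schedule)."""
    progress = min(step / float(max_steps), 1.0)
    return max_weight * math.exp(-5.0 * (1.0 - progress) ** 2)

# Total loss per formula (7): L_total = L_seg + lambda_con * L_con
```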
In connection with S304, the training method of the neural network provided in the embodiment of the disclosure further includes:
S305: Updating the parameter values of the teacher network based on the updated parameter values of the student network.
Here, the specific implementation process of S305 is similar to S104 described above, and will not be described here again.
According to the embodiment of the disclosure, the semantic segmentation processing is carried out on the style migration image of the source image by utilizing the student network to obtain the third semantic segmentation image, and then the parameter value updating process of the student network is supervised based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image and the labeling information of the source image, so that the semantic segmentation precision of the student network and the teacher network can be further improved.
Referring to fig. 4, the embodiment of the disclosure further provides a specific example of a training method of a neural network, including:
step 1: image x of source s Inputting the source image x into a style migration network to obtain a source image x s Is a style migration image of (1)
Step 2: migrating styles to imagesAnd inputting the third semantic segmentation image into a student network to obtain a third semantic segmentation image.
Step 3: based on source image x s Is marked with information y s And a third semantic segmentation image to obtain a semantic segmentation penalty L seg
Step 4: for the target image x t Injecting random noise, generating a first noise image, and inputting the first noise image into a student network to obtain a first semantic segmentation image.
Step 5: for the target image x t Injecting random noise, generating N second noise images, and inputting the N second noise images into a teacher network to obtain a plurality of intermediate semantic segmentation images. And sequentially calculating pixel value mean values of pixel points at corresponding positions in the plurality of intermediate semantic segmentation images to obtain a second semantic segmentation image.
Step 7: and (3) calculating the information entropy of each pixel point in the second semantically segmented image according to the formula (2).
Step 8: and (3) performing reliability calculation according to the calculation of the formula (4) to obtain reliability information of each pixel point in the second semantic segmentation image.
Step 9: Obtain the consistency loss L_con between the first semantic segmentation image and the second semantic segmentation image according to the first semantic segmentation image, the second semantic segmentation image and the credibility information.
Step 10: Calculate the total loss L_total according to formula (7).
Step 11: Update the parameter values of the student network according to the total loss L_total.
Step 12: Perform exponential moving average processing on the updated parameter values of the student network, and update the parameter values of the teacher network based on the result of the exponential moving average processing.
Through the above process, one iteration round of the student network and the teacher network is completed.
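For reference, a compact PyTorch-style sketch of one such iteration is given below. All names (student, teacher, style_net, optimizer and so on) are placeholders, and the Gaussian noise, the mean-squared-error form of the consistency loss, the entropy threshold, the linear weight schedule and the EMA decay of 0.99 are assumptions made for illustration, since formulas (2), (4), (7) and the exact hyper-parameters are specified elsewhere in the disclosure.

```python
import torch
import torch.nn.functional as F

def train_one_iteration(student, teacher, style_net, optimizer,
                        source_image, source_label, target_image,
                        step, num_second=4, noise_std=0.05):
    # Steps 1-3: style migration of the source image and supervised semantic segmentation loss.
    with torch.no_grad():
        migrated = style_net(source_image)                       # style migration image of x_s
    seg_loss = F.cross_entropy(student(migrated), source_label)  # L_seg from the third segmentation image

    # Step 4: first noise image -> student network -> first semantic segmentation image.
    first_noise = target_image + noise_std * torch.randn_like(target_image)
    student_probs = torch.softmax(student(first_noise), dim=1)

    # Steps 5-6: N second noise images -> teacher network -> per-pixel mean -> second segmentation image.
    with torch.no_grad():
        teacher_probs = torch.stack(
            [torch.softmax(teacher(target_image + noise_std * torch.randn_like(target_image)), dim=1)
             for _ in range(num_second)], dim=0).mean(dim=0)

    # Steps 7-8: per-pixel information entropy and a hard credibility mask (low entropy taken as credible).
    num_classes = teacher_probs.shape[1]
    entropy = -(teacher_probs * (teacher_probs + 1e-8).log()).sum(dim=1, keepdim=True)
    credibility = (entropy < 0.5 * torch.log(torch.tensor(float(num_classes)))).float()

    # Step 9: credibility-weighted consistency loss between the two semantic segmentation images.
    con_loss = (credibility * (student_probs - teacher_probs) ** 2).mean()

    # Steps 10-11: total loss of formula (7) and update of the student parameters.
    lam = min(1.0, step / 10000.0)   # rising weight of the consistency loss
    loss = seg_loss + lam * con_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Step 12: exponential moving average update of the teacher parameters.
    with torch.no_grad():
        for t_p, s_p in zip(teacher.parameters(), student.parameters()):
            t_p.copy_(0.99 * t_p + 0.01 * s_p)
    return loss.item()
```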
It will be appreciated by those skilled in the art that, in the methods of the specific embodiments described above, the written order of the steps does not imply a strict order of execution; the actual execution order should be determined by the functions of the steps and their possible inherent logic.
Referring to fig. 5, an embodiment of the present disclosure further provides an image processing method, including:
S501: acquiring an image to be processed;
S502: performing semantic segmentation processing on the image to be processed by using the neural network trained by the neural network training method according to any embodiment of the present disclosure, to obtain a semantic segmentation result of the image to be processed.
When semantic segmentation processing is performed on the image to be processed, it is implemented by using the neural network trained by the neural network training method provided in the embodiments of the present disclosure. This neural network achieves higher semantic segmentation precision on the image to be processed, so that the obtained semantic segmentation result of the image to be processed is more accurate.
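A minimal usage sketch of this image processing method follows; segment and trained_net are placeholder names, and taking the per-pixel argmax of the class scores as the semantic segmentation result is an assumption used for illustration.

```python
import torch

@torch.no_grad()
def segment(trained_net, image_to_process):
    """image_to_process: a (1, 3, H, W) tensor. trained_net is a student or teacher network
    trained as described above; the result assigns one class index to each pixel."""
    logits = trained_net(image_to_process)   # (1, C, H, W) class scores
    return logits.argmax(dim=1)              # (1, H, W) semantic segmentation result
```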
Referring to fig. 6, an embodiment of the present disclosure further provides an intelligent driving control method, including:
S601: acquiring an image acquired by a running device in the running process;
S602: detecting a target object in the image by using a neural network trained based on the neural network training method described in any embodiment of the present disclosure;
S603: controlling the running device based on the detected target object.
In specific implementations, the running device is, for example, but not limited to, any of the following: an autonomous vehicle, a vehicle equipped with an advanced driving assistance system (Advanced Driving Assistance System, ADAS), a robot, or the like.
Controlling the running device may include, for example, controlling the running device to accelerate, decelerate, steer or brake, or playing a voice prompt to prompt the driver to control the running device to accelerate, decelerate, steer or brake.
The intelligent driving control method of the embodiment of the disclosure is implemented by using the neural network trained by the neural network training method of the embodiment of the disclosure. When this neural network performs semantic segmentation processing on images acquired during driving, a more accurate semantic segmentation result can be obtained, thereby ensuring higher safety during the execution of driving control.
Based on the same inventive concept, the embodiment of the present disclosure further provides a neural network training device corresponding to the neural network training method. Since the principle by which the device solves the problem is similar to that of the neural network training method of the embodiment of the disclosure, the implementation of the device may refer to the implementation of the method, and repeated descriptions are omitted.
Referring to fig. 7, a schematic diagram of a training device for a neural network according to an embodiment of the disclosure is shown, where the device includes: a first processing module 71, a second processing module 72, a first updating module 73, and a second updating module 74; wherein,
a first processing module 71, configured to perform semantic segmentation processing on a first noise image of the target image by using the student network, so as to obtain a first semantic segmentation image;
a second processing module 72, configured to perform semantic segmentation processing on a second noise image of the target image by using a teacher network, so as to obtain a second semantic segmentation image; determining credibility information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image;
a first updating module 73, configured to update parameter values of the student network based on the first semantically segmented image, the second semantically segmented image, and the credibility information;
A second updating module 74 for updating the parameter values of the teacher network based on the updated parameter values of the student network.
In a possible embodiment, the apparatus further comprises: a third processing module 75, configured to perform semantic segmentation processing on a style migration image of a source image by using a student network, to obtain a third semantic segmentation image, where the style migration image of the source image is an image obtained by migrating a style of the source image to a target domain where the target image is located;
the first updating module 73 is configured to, when updating the parameter values of the student network based on the first semantically segmented image, the second semantically segmented image, and the credibility information:
updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image and the labeling information of the source image.
In a possible implementation manner, the first updating module 73 is configured to, when updating the parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image, and the labeling information of the source image:
Determining a consistency loss based on the first semantically segmented image, the second semantically segmented image, and the confidence information; determining the weight of the consistency loss based on the current iteration times;
determining semantic segmentation loss based on the third semantic segmentation image and annotation information of the source image;
updating parameter values of the student network based on the consistency loss, the weight, and the semantic segmentation loss.
In a possible implementation manner, the second processing module 72 is configured to, when performing semantic segmentation processing on the second noise image of the target image by using the teacher network, obtain a second semantic segmented image:
respectively carrying out semantic segmentation processing on a plurality of second noise images of the target image by using a teacher network to obtain a plurality of intermediate semantic segmentation images;
the second semantically segmented image is generated based on the plurality of intermediate semantically segmented images.
In a possible implementation manner, the second processing module 72 is configured, when generating the second semantic segmentation image based on the plurality of intermediate semantic segmentation images, to:
sequentially calculating pixel value mean values of pixel points at corresponding positions in a plurality of intermediate semantic segmentation images;
And determining the average value of the pixel points at any corresponding position as the pixel value of the pixel point at the corresponding position in the second semantic segmentation image.
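A short sketch of this fusion step, assuming the teacher network's softmax outputs serve as the pixel values of the intermediate semantic segmentation images (the helper name second_segmentation_image is illustrative):

```python
import torch

@torch.no_grad()
def second_segmentation_image(teacher, second_noise_images):
    """second_noise_images: a list of (1, 3, H, W) second noise images of the same target image.
    Each is passed through the teacher network to obtain an intermediate semantic segmentation
    image; the pixel value mean over corresponding positions gives the second semantic
    segmentation image."""
    intermediates = [torch.softmax(teacher(x), dim=1) for x in second_noise_images]
    return torch.stack(intermediates, dim=0).mean(dim=0)   # (1, C, H, W)
```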
In a possible implementation manner, the second processing module 72 is configured to, when determining, based on the second semantically segmented image, reliability information of each pixel point in the second semantically segmented image:
determining the information entropy of each pixel point in the second semantic segmentation image based on the pixel value of each pixel point in the second semantic segmentation image;
and determining the credibility information of each pixel point in the second semantic segmentation image based on the information entropy of each pixel point in the second semantic segmentation image and a predetermined information entropy threshold.
In a possible implementation manner, the second processing module 72 is configured to, when determining the reliability information of each pixel point in the second semantically segmented image based on the information entropy of each pixel point in the second semantically segmented image and a predetermined information entropy threshold value:
comparing the information entropy of each pixel point in the second semantic segmentation image with the information entropy threshold;
determining credibility information of each pixel point in the second semantic segmentation image based on the comparison result;
And if the absolute value of the information entropy of any pixel point in the second semantic segmentation image is larger than the information entropy threshold, the credibility information corresponding to that pixel point is set to a preset value representing the credibility of the pixel value of that pixel point, wherein the preset value is larger than 0.
In a possible implementation manner, the second processing module 72 is further configured to generate the information entropy threshold in the following manner:
and determining the information entropy threshold based on the semantic segmentation type of the teacher network.
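The following sketch illustrates one possible reading of the entropy-based credibility computation together with a class-count-based threshold. The Shannon entropy, the hard 0/1 mask, the comparison direction (low entropy taken as credible) and the factor 0.5 are assumptions made for illustration; formulas (2) and (4) and the exact threshold rule are defined elsewhere in the disclosure.

```python
import math
import torch

def credibility_map(second_seg_image, num_classes, threshold_ratio=0.5):
    """second_seg_image: (1, C, H, W) per-pixel class probabilities of the second semantic
    segmentation image. Returns a (1, H, W) map that is 1.0 where the per-pixel information
    entropy is below the threshold and 0.0 elsewhere."""
    eps = 1e-8
    entropy = -(second_seg_image * (second_seg_image + eps).log()).sum(dim=1)  # (1, H, W)
    # log(num_classes) is the maximum possible entropy, so the threshold scales with the
    # number of semantic segmentation classes of the teacher network.
    threshold = threshold_ratio * math.log(num_classes)
    return (entropy < threshold).float()
```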
In a possible implementation manner, the second updating module 74 is configured to, when updating the parameter value of the teacher network based on the updated parameter value of the student network:
performing exponential moving average processing on parameter values of parameters in the student network to obtain target parameter values;
and replacing the parameter value of the corresponding parameter in the teacher network by using the target parameter value.
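A minimal sketch of this exponential moving average update; the decay value 0.99 and the name update_teacher are illustrative.

```python
import torch

@torch.no_grad()
def update_teacher(teacher, student, ema_decay=0.99):
    """Each parameter value of the teacher network is replaced by the exponential moving average
    of the corresponding student parameter:
    target = ema_decay * old_teacher_value + (1 - ema_decay) * updated_student_value."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.copy_(ema_decay * t_param + (1.0 - ema_decay) * s_param)
```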
In a possible embodiment, the method further comprises: a first generation module 76 is configured to generate the style migration image in the following manner:
performing style migration processing on the source image by utilizing a pre-trained style migration network to obtain a style migration image corresponding to the source image; the style migration network is trained by utilizing the source image and the target image.
In a possible embodiment, the method further comprises: an initialization module 77, configured to perform an initialization process on the teacher network and the student network using a pre-trained semantic segmentation network.
In a possible embodiment, the method further comprises: a second generating module 78, configured to generate the first noise image and the second noise image in the following manner:
injecting random noise into the target image to obtain the first noise image and the second noise image; wherein the noise corresponding to different noise images is different.
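A small sketch of this noise injection step; Gaussian noise with a fixed standard deviation is an assumption, as the disclosure only requires that random noise be injected and that different noise images receive different noise.

```python
import torch

def make_noise_images(target_image, num_second=4, noise_std=0.05):
    """target_image: (1, 3, H, W). Returns one first noise image (fed to the student network) and
    num_second second noise images (fed to the teacher network); each image receives an independent
    noise draw, so the noise corresponding to different noise images is different."""
    first_noise_image = target_image + noise_std * torch.randn_like(target_image)
    second_noise_images = [target_image + noise_std * torch.randn_like(target_image)
                           for _ in range(num_second)]
    return first_noise_image, second_noise_images
```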
The process flow of each module in the apparatus and the interaction flow between the modules may be described with reference to the related descriptions in the above method embodiments, which are not described in detail herein.
Referring to fig. 8, an embodiment of the present disclosure further provides an image processing apparatus including:
an acquiring module 81, configured to acquire an image to be processed;
the processing module 82 is configured to perform semantic segmentation processing on the image to be processed by using the neural network trained by the neural network training method according to any embodiment of the present disclosure, so as to obtain a semantic segmentation result of the image to be processed.
Referring to fig. 9, an embodiment of the present disclosure further provides an intelligent driving control device, which is characterized by including:
A data acquisition module 91, configured to acquire an image acquired by the driving device during driving;
a detection module 92 for detecting a target object in the image using a neural network trained based on the neural network training method according to any of the embodiments of the present disclosure;
a control module 93 for controlling the running apparatus based on the detected target object.
The embodiment of the present disclosure further provides an electronic device 10, as shown in fig. 10, which is a schematic structural diagram of the electronic device 10 provided in the embodiment of the present disclosure, including:
a processor 11 and a memory 12; the memory 12 stores machine readable instructions executable by the processor 11 which, when the electronic device is running, are executed by the processor to perform the steps of:
carrying out semantic segmentation processing on a first noise image of the target image by utilizing a student network to obtain a first semantic segmentation image; performing semantic segmentation processing on a second noise image of the target image by using a teacher network to obtain a second semantic segmentation image; determining credibility information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image; updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, and the credibility information; updating the parameter values of the teacher network based on the updated parameter values of the student network.
Or the following steps are realized: acquiring an image to be processed; performing semantic segmentation processing on the image to be processed by using a neural network trained based on the training method of the neural network according to any embodiment of the disclosure to obtain a semantic segmentation result of the image to be processed;
or the following steps are realized: acquiring an image acquired by a running device in the running process; detecting a target object in the image using a neural network trained based on the training method of the neural network described in any of the embodiments of the present disclosure; the running apparatus is controlled based on the detected target object.
The specific execution process of the above instruction may refer to the steps of the neural network training method described in the embodiments of the present disclosure, or the image processing steps, which are not described herein.
The disclosed embodiments also provide a computer-readable storage medium having a computer program stored thereon, which when executed by a processor performs the steps of the neural network training method described in the method embodiment, or performs the steps of the image processing method described in the method embodiment, or performs the steps of the intelligent travel control method described in the method embodiment. Wherein the storage medium may be a volatile or nonvolatile computer readable storage medium.
The computer program product of the training method and the image processing method of the neural network provided in the embodiments of the present disclosure includes a computer readable storage medium storing a program code, where the program code includes instructions for executing the training method, the steps of the image processing method, or the steps of the intelligent driving control method of the neural network described in the embodiments of the methods, and specifically, reference may be made to the embodiments of the methods described above, and details thereof will not be repeated herein.
The disclosed embodiments also provide a computer program which, when executed by a processor, implements any of the methods of the previous embodiments. The computer program product may be realized in particular by means of hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working procedures of the system and apparatus described above may refer to the corresponding procedures in the foregoing method embodiments, and are not described here again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices and methods may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division of the units is merely a logical functional division, and there may be other manners of division in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, devices or units, and may be in electrical, mechanical or other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present disclosure may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing an electronic device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
Finally, it should be noted that the foregoing examples are merely specific embodiments of the present disclosure, used to illustrate the technical solutions of the present disclosure rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person familiar with the technical field may, within the technical scope disclosed by the present disclosure, modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of the technical features thereof; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the disclosure and are intended to be included within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (14)

1. A method of training a neural network, comprising:
carrying out semantic segmentation processing on a first noise image of the target image by utilizing a student network to obtain a first semantic segmentation image;
performing semantic segmentation processing on a second noise image of the target image by using a teacher network to obtain a second semantic segmentation image; determining credibility information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image;
Updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, and the credibility information;
updating the parameter values of the teacher network based on the updated parameter values of the student network;
the method further comprises the steps of:
performing semantic segmentation processing on a style migration image of a source image by using a student network to obtain a third semantic segmentation image, wherein the style migration image of the source image is an image obtained by migrating the style of the source image to a target domain where the target image is located;
the updating the parameter values of the student network based on the first semantically segmented image, the second semantically segmented image, and the credibility information includes:
updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image and the labeling information of the source image;
the updating the parameter value of the student network based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image and the labeling information of the source image comprises the following steps:
Determining a consistency loss based on the first semantically segmented image, the second semantically segmented image, and the confidence information; determining the weight of the consistency loss based on the current iteration times;
determining semantic segmentation loss based on the third semantic segmentation image and annotation information of the source image;
updating parameter values of the student network based on the consistency loss, the weight, and the semantic segmentation loss.
2. The training method according to claim 1, wherein performing semantic segmentation processing on the second noise image of the target image by using a teacher network to obtain a second semantic segmented image comprises:
respectively carrying out semantic segmentation processing on a plurality of second noise images of the target image by using a teacher network to obtain a plurality of intermediate semantic segmentation images;
the second semantically segmented image is generated based on the plurality of intermediate semantically segmented images.
3. The training method of claim 2, wherein the generating the second semantically segmented image based on the plurality of intermediate semantically segmented images comprises:
sequentially calculating pixel value mean values of pixel points at corresponding positions in a plurality of intermediate semantic segmentation images;
And determining the average value of the pixel points at any corresponding position as the pixel value of the pixel point at the corresponding position in the second semantic segmentation image.
4. A training method according to any of claims 1-3, wherein said determining the confidence information of each pixel point in the second semantically segmented image based on the second semantically segmented image comprises:
determining the information entropy of each pixel point in the second semantic segmentation image based on the pixel value of each pixel point in the second semantic segmentation image;
and determining the credibility information of each pixel point in the second semantic segmentation image based on the information entropy of each pixel point in the second semantic segmentation image and a predetermined information entropy threshold.
5. The training method of claim 4, wherein determining the confidence information for each pixel in the second semantically segmented image based on the information entropy of each pixel in the second semantically segmented image and a predetermined information entropy threshold comprises:
comparing the information entropy of each pixel point in the second semantic segmentation image with the information entropy threshold;
determining credibility information of each pixel point in the second semantic segmentation image based on the comparison result;
And if the absolute value of the information entropy of any pixel point in the second semantic segmentation image is larger than the information entropy threshold, the credibility information corresponding to that pixel point is set to a preset value representing the credibility of the pixel value of that pixel point, wherein the preset value is larger than 0.
6. The training method of claim 4, wherein the information entropy threshold is generated by:
and determining the information entropy threshold based on the semantic segmentation type of the teacher network.
7. The training method of any of claims 1-3, 5, 6, wherein updating the parameter values of the teacher network based on the updated parameter values of the student network comprises:
performing exponential moving average processing on parameter values of parameters in the student network to obtain target parameter values;
and replacing the parameter value of the corresponding parameter in the teacher network by using the target parameter value.
8. An image processing method, comprising:
acquiring an image to be processed;
performing semantic segmentation processing on the image to be processed by using the neural network trained by the training method based on the neural network according to any one of claims 1-7 to obtain a semantic segmentation result of the image to be processed.
9. An intelligent travel control method is characterized by comprising the following steps:
acquiring an image acquired by a running device in the running process;
detecting a target object in the image using a neural network trained based on the training method of any one of claims 1-7;
the running apparatus is controlled based on the detected target object.
10. A neural network training device, comprising:
the first processing module is used for carrying out semantic segmentation processing on the first noise image of the target image by utilizing the student network to obtain a first semantic segmentation image;
the second processing module is used for carrying out semantic segmentation processing on a second noise image of the target image by utilizing a teacher network to obtain a second semantic segmentation image; determining credibility information of each pixel point in the second semantic segmentation image based on the second semantic segmentation image;
a first updating module, configured to update parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, and the credibility information;
a second updating module, configured to update a parameter value of the teacher network based on the updated parameter value of the student network;
The apparatus further comprises: the third processing module is used for carrying out semantic segmentation processing on the style migration image of the source image by utilizing the student network to obtain a third semantic segmentation image, wherein the style migration image of the source image is an image obtained by migrating the style of the source image to a target domain where the target image is located;
the first updating module is configured to, when updating the parameter values of the student network based on the first semantically segmented image, the second semantically segmented image, and the credibility information:
updating parameter values of the student network based on the first semantic segmentation image, the second semantic segmentation image, the credibility information, the third semantic segmentation image and the labeling information of the source image;
the first updating module is configured to, when updating parameter values of the student network based on the first semantically-segmented image, the second semantically-segmented image, the credibility information, the third semantically-segmented image, and the labeling information of the source image:
determining a consistency loss based on the first semantically segmented image, the second semantically segmented image, and the confidence information; determining the weight of the consistency loss based on the current iteration times;
Determining semantic segmentation loss based on the third semantic segmentation image and annotation information of the source image;
updating parameter values of the student network based on the consistency loss, the weight, and the semantic segmentation loss.
11. An image processing apparatus, comprising:
the acquisition module is used for acquiring the image to be processed;
the processing module is used for carrying out semantic segmentation processing on the image to be processed by utilizing the neural network trained by the training method based on the neural network according to any one of claims 1-7 to obtain a semantic segmentation result of the image to be processed.
12. An intelligent travel control device, comprising:
the data acquisition module is used for acquiring images acquired by the driving device in the driving process;
a detection module for detecting a target object in the image using a neural network trained based on the training method of any one of claims 1-7;
and the control module is used for controlling the running device based on the detected target object.
13. An electronic device, comprising: a processor, a memory storing machine-readable instructions executable by the processor for executing the machine-readable instructions stored in the memory, which when executed by the processor, perform the steps of the method of any one of claims 1 to 9.
14. A computer-readable storage medium, on which a computer program is stored which, when being run by an electronic device, performs the steps of the method according to any one of claims 1 to 9.
CN202010278429.2A 2020-04-10 2020-04-10 Training method of neural network, image processing method and device Active CN111489365B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010278429.2A CN111489365B (en) 2020-04-10 2020-04-10 Training method of neural network, image processing method and device

Publications (2)

Publication Number Publication Date
CN111489365A CN111489365A (en) 2020-08-04
CN111489365B true CN111489365B (en) 2023-12-22

Family

ID=71794812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010278429.2A Active CN111489365B (en) 2020-04-10 2020-04-10 Training method of neural network, image processing method and device

Country Status (1)

Country Link
CN (1) CN111489365B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111967597A (en) * 2020-08-18 2020-11-20 上海商汤临港智能科技有限公司 Neural network training and image classification method, device, storage medium and equipment
CN112150478B (en) * 2020-08-31 2021-06-22 温州医科大学 Method and system for constructing semi-supervised image segmentation framework
CN112070163B (en) * 2020-09-09 2023-11-24 抖音视界有限公司 Image segmentation model training and image segmentation method, device and equipment
CN112419326B (en) * 2020-12-02 2023-05-23 腾讯科技(深圳)有限公司 Image segmentation data processing method, device, equipment and storage medium
CN112633285B (en) * 2020-12-23 2024-07-23 平安科技(深圳)有限公司 Domain adaptation method, domain adaptation device, electronic equipment and storage medium
WO2023019444A1 (en) * 2021-08-17 2023-02-23 华为技术有限公司 Optimization method and apparatus for semantic segmentation model
CN114399640B (en) * 2022-03-24 2022-07-15 之江实验室 Road segmentation method and device for uncertain region discovery and model improvement
CN114708436B (en) * 2022-06-02 2022-09-02 深圳比特微电子科技有限公司 Training method of semantic segmentation model, semantic segmentation method, semantic segmentation device and semantic segmentation medium
CN114842457B (en) * 2022-06-29 2023-09-26 小米汽车科技有限公司 Model training and feature extraction method and device, electronic equipment and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1969297A (en) * 2001-06-15 2007-05-23 索尼公司 Image processing apparatus and method and image pickup apparatus
CN106127810A (en) * 2016-06-24 2016-11-16 惠州紫旭科技有限公司 The recording and broadcasting system image tracking method of a kind of video macro block angle point light stream and device
CN106709918A (en) * 2017-01-20 2017-05-24 成都信息工程大学 Method for segmenting images of multi-element student t distribution mixed model based on spatial smoothing
CN110414526A (en) * 2019-07-31 2019-11-05 达闼科技(北京)有限公司 Training method, training device, server and the storage medium of semantic segmentation network
CN110458844A (en) * 2019-07-22 2019-11-15 大连理工大学 A kind of semantic segmentation method of low illumination scene
CN110827963A (en) * 2019-11-06 2020-02-21 杭州迪英加科技有限公司 Semantic segmentation method for pathological image and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10147185B2 (en) * 2014-09-11 2018-12-04 B.G. Negev Technologies And Applications Ltd., At Ben-Gurion University Interactive segmentation
US10643320B2 (en) * 2017-11-15 2020-05-05 Toyota Research Institute, Inc. Adversarial learning of photorealistic post-processing of simulation with privileged information
US10769771B2 (en) * 2018-06-22 2020-09-08 Cnh Industrial Canada, Ltd. Measuring crop residue from imagery using a machine-learned semantic segmentation model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Lequan Yu 等.Uncertainty-aware Self-ensembling Model for Semi-supervised 3D Left Atrium Segmentation.arXiv.2019,第1-9页. *
华敏杰.基于深度学习的图像语义分割算法概述.中国战略新兴产业.2018,第130页. *
郑宝玉 等.基于深度卷积神经网络的弱监督图像语义分割.南京邮电大学学报(自然科学版).2018,第5-16页. *

Also Published As

Publication number Publication date
CN111489365A (en) 2020-08-04

Similar Documents

Publication Publication Date Title
CN111489365B (en) Training method of neural network, image processing method and device
US20200327409A1 (en) Method and device for hierarchical learning of neural network, based on weakly supervised learning
US20220363259A1 (en) Method for generating lane changing decision-making model, method for lane changing decision-making of unmanned vehicle and electronic device
CN110349190B (en) Adaptive learning target tracking method, device, equipment and readable storage medium
US11741356B2 (en) Data processing apparatus by learning of neural network, data processing method by learning of neural network, and recording medium recording the data processing method
CN111742333A (en) Method and apparatus for performing deep neural network learning
JP7295282B2 (en) Method for on-device learning of machine learning network of autonomous driving car through multi-stage learning using adaptive hyperparameter set and on-device learning device using the same
CN111914878B (en) Feature point tracking training method and device, electronic equipment and storage medium
CN111401557B (en) Agent decision making method, AI model training method, server and medium
Hornauer et al. Gradient-based uncertainty for monocular depth estimation
CN112200889A (en) Sample image generation method, sample image processing method, intelligent driving control method and device
CN116097277A (en) Method and system for training neural network models using progressive knowledge distillation
CN114139637A (en) Multi-agent information fusion method and device, electronic equipment and readable storage medium
KR102234917B1 (en) Data processing apparatus through neural network learning, data processing method through the neural network learning, and recording medium recording the method
CN113705402A (en) Video behavior prediction method, system, electronic device and storage medium
CN113625753A (en) Method for guiding neural network to learn maneuvering flight of unmanned aerial vehicle by expert rules
CN117372536A (en) Laser radar and camera calibration method, system, equipment and storage medium
US20230281981A1 (en) Methods, devices, and computer readable media for training a keypoint estimation network using cgan-based data augmentation
CN115879536A (en) Learning cognition analysis model robustness optimization method based on causal effect
Ertenli et al. Streaming multiscale deep equilibrium models
KR102157441B1 (en) Learning method for neural network using relevance propagation and service providing apparatus
KR20230038136A (en) Knowledge distillation method and system specialized for lightweight pruning-based deep neural networks
CN113689437A (en) Interactive image segmentation method based on iterative selection-correction network
CN106446524A (en) Intelligent hardware multimodal cascade modeling method and apparatus
KR102608304B1 (en) Task-based deep learning system and method for intelligence augmented of computer vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant