CN112633285A - Domain adaptation method, domain adaptation device, electronic equipment and storage medium - Google Patents

Domain adaptation method, domain adaptation device, electronic equipment and storage medium

Info

Publication number
CN112633285A
CN112633285A
Authority
CN
China
Prior art keywords
image
network
pixel point
recognized
semantic segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011543313.3A
Other languages
Chinese (zh)
Inventor
刘杰 (Liu Jie)
王健宗 (Wang Jianzong)
瞿晓阳 (Qu Xiaoyang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011543313.3A priority Critical patent/CN112633285A/en
Priority to PCT/CN2021/082603 priority patent/WO2022134338A1/en
Publication of CN112633285A publication Critical patent/CN112633285A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application relate to the technical field of artificial intelligence, and in particular to a domain adaptation method, a domain adaptation device, an electronic device and a storage medium. The method comprises the following steps: acquiring an image to be recognized from a target domain; inputting the image to be recognized into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training on images of a source domain; inputting the image to be recognized into a second segmentation network to obtain a second class proportion and an entropy map, wherein the entropy map is a matrix formed by the information entropy of each pixel point in the image to be recognized; and performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy map. The present application helps improve the efficiency of domain adaptation.

Description

Domain adaptation method, domain adaptation device, electronic equipment and storage medium
Technical Field
The present application relates to the field of image recognition technologies, and in particular to a domain adaptation method and apparatus, an electronic device, and a storage medium.
Background
Semantic segmentation has become a key step in many modern technological applications. Since the advent of deep learning, automatic semantic segmentation methods have advanced considerably across a wide range of problems. However, a semantic segmentation network suffers significantly degraded performance when applied to a different domain whose samples follow a different distribution. Consequently, such networks require pixel-wise labeled images from each domain as training samples, and labeling samples costs both time and money.
To overcome this problem, domain adaptation methods have emerged; domain adaptation refers to the process of migrating a model trained on a labeled source domain to a target domain with no or few labels. Adversarial learning strategies are popular techniques among domain adaptation methods, but a major limitation of adversarial learning is that it requires simultaneous access to image data from both the source domain and the target domain during the adaptation phase. In some cases, the source-domain image data cannot be obtained, for privacy reasons or because the data has been lost. This restriction on acquiring source-domain image data makes domain adaptation inefficient, so an efficient domain adaptation method is highly desirable.
Disclosure of Invention
The embodiments of the present application provide a domain adaptation method that completes domain adaptation without using image data of the source domain, so that a network trained on the target domain acquires the characteristics of the source domain, thereby improving the efficiency of domain adaptation.
In a first aspect, an embodiment of the present application provides a domain adaptation method, including:
acquiring an image to be recognized from a target domain;
inputting the image to be recognized into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training on images of a source domain;
inputting the image to be recognized into a second segmentation network to obtain a second class proportion and an entropy map, wherein the entropy map is a matrix formed by the information entropy of each pixel point in the image to be recognized; and
performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy map.
In some possible embodiments, the inputting the image to be recognized into the first segmentation network to obtain the first class proportion includes:
inputting the image to be recognized into the first segmentation network, and performing semantic segmentation on each pixel point in the image to be recognized to obtain a first semantic segmentation result of each pixel point, wherein the first semantic segmentation result of each pixel point represents the probability that each pixel point belongs to k categories, the value of k is an integer from 1 to N, and N is an integer greater than 1;
averaging the first semantic segmentation result of each pixel point to obtain a first semantic segmentation result of the image to be recognized;
and obtaining the first class proportion according to the first semantic segmentation result of the image to be recognized.
In some possible embodiments, the inputting the image to be recognized into the second segmentation network to obtain a second class proportion and an entropy map includes:
inputting the image to be recognized into a second segmentation network, and performing semantic segmentation on each pixel point in the image to be recognized to obtain a second semantic segmentation result of each pixel point, wherein the second semantic segmentation result of each pixel point is used for expressing the probability that each pixel point belongs to k categories, the value of k is an integer from 1 to N, and N is an integer greater than 1;
averaging the second semantic segmentation result of each pixel point to obtain a second semantic segmentation result of the image to be recognized;
and determining the information entropy of each pixel point according to the second semantic segmentation result of the pixel point and an information entropy calculation formula, and forming the entropy map from the information entropies of all pixel points.
In some possible embodiments, the performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion, and the entropy map includes:
determining a first KL divergence between the first class proportion and the second class proportion;
determining the sum of the information entropies of all pixel points in the entropy map;
determining a target loss according to the first KL divergence, the sum of the information entropies of all pixel points, and a preset parameter;
and adjusting the network parameters of the second segmentation network according to the target loss so as to perform domain adaptation on the second segmentation network.
In some possible embodiments, the second segmentation network further comprises a first convolution layer, the first convolution layer being connected to the encoding network; before the first feature map is up-sampled by the decoding network to obtain a second feature map, the method further includes:
carrying out bilinear interpolation on the first feature map to obtain a third feature map, wherein the dimension of the third feature map is the same as that of the second feature map;
performing semantic segmentation on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point;
determining a second KL divergence between the second semantic segmentation result and the third semantic segmentation result of each pixel point, and averaging the second KL divergences of all pixel points in the image to be recognized to obtain a third KL divergence;
determining the target loss according to the first KL divergence, the sum of the information entropies of the pixel points and preset parameters, including:
and determining target loss according to the first KL divergence, the third KL divergence, the sum of the information entropies of all the pixel points and preset parameters.
In some possible embodiments, the method further comprises: after completing the domain adaptation of the second segmentation network, deleting the decoding network and the second convolution layer to obtain a third segmentation network; and performing semantic segmentation on images using the third segmentation network.
In a second aspect, an embodiment of the present application provides a domain adaptation apparatus, including:
the acquisition unit is used for acquiring an image to be recognized from a target domain;
the processing unit is used for inputting the image to be recognized into a first segmentation network to obtain a first class proportion, the first segmentation network being obtained by training on images of a source domain;
inputting the image to be recognized into a second segmentation network to obtain a second class proportion and an entropy map, wherein the entropy map is a matrix formed by the information entropy of each pixel point in the image to be recognized; and
performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy map.
In some possible embodiments, in terms of inputting the image to be recognized into the first segmentation network to obtain the first class proportion, the processing unit is specifically configured to:
inputting the image to be recognized into the first segmentation network, and performing semantic segmentation on each pixel point in the image to be recognized to obtain a first semantic segmentation result of each pixel point, wherein the first semantic segmentation result of each pixel point represents the probability that each pixel point belongs to k categories, the value of k is an integer from 1 to N, and N is an integer greater than 1;
averaging the first semantic segmentation result of each pixel point to obtain a first semantic segmentation result of the image to be recognized;
and obtaining the first class proportion according to the first semantic segmentation result of the image to be recognized.
In some possible embodiments, in terms of inputting the image to be recognized into the second segmentation network to obtain the second class proportion and the entropy map, the processing unit is specifically configured to:
inputting the image to be recognized into a second segmentation network, and performing semantic segmentation on each pixel point in the image to be recognized to obtain a second semantic segmentation result of each pixel point, wherein the second semantic segmentation result of each pixel point is used for expressing the probability that each pixel point belongs to k categories, the value of k is an integer from 1 to N, and N is an integer greater than 1;
averaging the second semantic segmentation result of each pixel point to obtain a second semantic segmentation result of the image to be recognized;
and determining the information entropy of each pixel point according to the second semantic segmentation result of the pixel point and an information entropy calculation formula, and forming the entropy map from the information entropies of all pixel points.
In some possible embodiments, in terms of performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion, and the entropy map, the processing unit is specifically configured to:
determining a first KL divergence between the first class proportion and the second class proportion;
determining the sum of the information entropies of all pixel points in the entropy map;
determining a target loss according to the first KL divergence, the sum of the information entropies of all pixel points, and a preset parameter;
and adjusting the network parameters of the second segmentation network according to the target loss so as to perform domain adaptation on the second segmentation network.
In some possible embodiments, the second segmentation network further comprises a first convolution layer, the first convolution layer being connected to the encoding network; before the decoding network performs up-sampling on the first feature map to obtain a second feature map, the processing unit is further configured to:
carrying out bilinear interpolation on the first feature map to obtain a third feature map, wherein the dimension of the third feature map is the same as that of the second feature map;
performing semantic segmentation on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point;
determining a second KL divergence between the second semantic segmentation result and the third semantic segmentation result of each pixel point, and averaging the second KL divergences of all pixel points in the image to be recognized to obtain a third KL divergence;
in terms of determining a target loss according to the first KL divergence, the sum of the information entropies of the pixel points, and a preset parameter, the processing unit is specifically configured to:
and determining target loss according to the first KL divergence, the third KL divergence, the sum of the information entropies of all the pixel points and preset parameters.
In some possible embodiments, the processing unit is further configured to delete the decoding network and the second convolution layer after completing the domain adaptation of the second segmentation network, so as to obtain a third segmentation network; and to perform semantic segmentation on images using the third segmentation network.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor coupled to a memory, the memory configured to store a computer program, the processor configured to execute the computer program stored in the memory to cause the electronic device to perform the method of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, which stores a computer program, where the computer program makes a computer execute the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the method according to the first aspect.
The embodiment of the application has the following beneficial effects:
It can be seen that, in the embodiments of the present application, during domain adaptation, the image to be recognized from the target domain can be used directly to perform domain adaptation on the second segmentation network of the target domain, without using any images of the source domain; this overcomes the difficulty of obtaining source-domain images and improves the efficiency of domain adaptation. In addition, during adaptation, the information entropy of each pixel point is taken into account, so that the adapted second segmentation network can classify each pixel point accurately, improving the precision of semantic segmentation.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present application, and that those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a domain adaptation method according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a second segmentation network according to an embodiment of the present application;
fig. 3 is a schematic diagram illustrating a training process of a first segmentation network according to an embodiment of the present disclosure;
fig. 4 is a block diagram illustrating functional units of a domain adaptation apparatus according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a domain adaptation apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Referring to fig. 1, fig. 1 is a schematic flow chart of a domain adaptation method according to an embodiment of the present application. The method is applied to a domain adaptation device. The method comprises the following steps:
101: the domain adaptation device acquires an image to be recognized from a target domain.
The image to be recognized may be any image in the target domain. Most images in the target domain carry no labels; as an example, the image to be recognized in this application is assumed to be unlabeled.
102: the domain adaptation device inputs the image to be recognized into a first segmentation network to obtain a first class proportion, and the first segmentation network is obtained by training the image of the source domain.
Illustratively, the first segmentation network is obtained by training with images of the source domain; its training process is described later and is not detailed here.
Exemplarily, the image to be recognized is input into the first segmentation network, feature extraction is performed on the image to be recognized to obtain a feature map, and semantic segmentation is performed on each pixel point in the image to be recognized according to the feature map, yielding a first semantic segmentation result for each pixel point. The semantic segmentation result of a pixel point represents the probability that the pixel point belongs to class k, where k takes integer values from 1 to N and N is an integer greater than 1. That is to say, semantic segmentation of each pixel point in the image to be recognized yields the probability that the pixel point belongs to class 1, class 2, …, class N.
Then, the first semantic segmentation results of all pixel points are averaged to obtain the first semantic segmentation result of the image to be recognized, from which the first class proportion is obtained. Illustratively, the first semantic segmentation result of the image to be recognized may be expressed by formula (1):

$\tau(s, k) = \frac{1}{|\Omega_s|} \sum_{i \in \Omega_s} p_i^{(k)}$   (1)

wherein s is the image to be recognized, k indexes the classes, $\tau(s, k)$ is the first semantic segmentation result of the image to be recognized for class k, $|\Omega_s|$ is the number of pixel points in the image to be recognized, i indexes the i-th pixel point, and $p_i^{(k)}$ is the first semantic segmentation result of the i-th pixel point, i.e., the probability that it belongs to class k.
Further, the first semantic segmentation result of the image to be recognized represents the probability that the image as a whole belongs to each of the N classes; this per-class probability is taken as the class proportion of the image to be recognized.
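As a rough sketch of the class-proportion computation of formula (1) — not code from the patent itself; the function name and the (H, W, N) array layout are illustrative assumptions — the averaging over pixels can be written with numpy:

```python
import numpy as np

def class_proportion(pixel_probs):
    """tau(s, k): average the per-pixel class probabilities over all
    pixels of the image.  pixel_probs has shape (H, W, N), where N is
    the number of classes; the result is a length-N proportion vector."""
    return pixel_probs.mean(axis=(0, 1))
```

Because each per-pixel distribution sums to 1, the resulting proportion vector also sums to 1.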
103: the field adaptation device inputs the image to be recognized into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by information entropies of all pixel points in the image to be recognized.
Exemplarily, the image to be recognized is input into the second segmentation network, and semantic segmentation is performed on each pixel point in the image to be recognized to obtain a second semantic segmentation result for each pixel point; similarly, the second semantic segmentation result of a pixel point represents the probability that the pixel point belongs to each of the N classes. The second semantic segmentation results of all pixel points are then averaged to obtain the second class proportion. Furthermore, the information entropy of each pixel point can be determined from its second semantic segmentation result, and the information entropies of all pixel points form the entropy map, i.e., a matrix of information entropies. Illustratively, the information entropy of each pixel point may be expressed by formula (2):

$H(i) = -\sum_{j=1}^{N} q_i^{(j)} \log q_i^{(j)}$   (2)

wherein $H(i)$ is the information entropy of the i-th pixel point and $q_i^{(j)}$ is the probability that the i-th pixel point belongs to class j, with j taking integer values from 1 to N.
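A minimal numpy sketch of the per-pixel entropy of formula (2) — again an illustration under assumed shapes, not the patented implementation; the small `eps` clip is added only to avoid `log(0)`:

```python
import numpy as np

def entropy_map(pixel_probs, eps=1e-12):
    """H(i) = -sum_j q_i(j) * log q_i(j) for every pixel i.
    pixel_probs has shape (H, W, N); the result is an (H, W) entropy map."""
    q = np.clip(pixel_probs, eps, 1.0)
    return -(q * np.log(q)).sum(axis=-1)
```

A uniform two-class pixel attains the maximum entropy log 2, while a confident (one-hot) pixel has entropy close to zero.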
104: and the domain adaptation device carries out domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
Illustratively, a first KL divergence between the first class proportion and the second class proportion is determined, and the information entropies of all pixel points in the entropy map are summed; the target loss is then determined from the first KL divergence, the sum of the information entropies of all pixel points, and a preset parameter. Finally, the network parameters of the second segmentation network are adjusted according to the target loss so as to perform domain adaptation on the second segmentation network. Illustratively, the target loss may be expressed by formula (3):

$\mathrm{Loss} = \mathrm{KL}\big(\tau_1(s) \,\|\, \tau_2(s)\big) + \lambda \sum_{i \in \Omega_s} \mathrm{Ent}(q_i)$   (3)

wherein Loss is the target loss, λ is the preset parameter, $\mathrm{KL}(\cdot\|\cdot)$ is the KL-divergence operation, $\tau_1(s)$ and $\tau_2(s)$ are the first and second class proportions of the image to be recognized s, Ent is the entropy operation, and $q_i$ is the second semantic segmentation result of the i-th pixel point.
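The target loss of formula (3) can be sketched as follows — a hedged illustration, not the patent's code: the function names are assumptions, and the exact value of the preset parameter λ (here `lam`) is not specified by the text, so a placeholder default is used:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete class-proportion vectors."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float((p * (np.log(p) - np.log(q))).sum())

def target_loss(tau1, tau2, ent_map, lam=0.1):
    """Loss = KL(tau1 || tau2) + lam * (sum of per-pixel entropies),
    following formula (3); lam stands in for the preset parameter."""
    return kl_divergence(tau1, tau2) + lam * float(ent_map.sum())
```

Minimizing the entropy term pushes the second network toward confident per-pixel predictions, while the KL term keeps its class proportions close to those of the frozen source-trained network.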
It can be seen that, in the embodiments of the present application, during domain adaptation, the image to be recognized from the target domain can be used directly to perform domain adaptation on the second segmentation network of the target domain, without using any images of the source domain; this overcomes the difficulty of obtaining source-domain images and improves the efficiency of domain adaptation. In addition, during adaptation, the information entropy of each pixel point is taken into account, so that the adapted second segmentation network can classify each pixel point accurately, improving the precision of semantic segmentation.
The process of semantically segmenting the pixel points in the image to be recognized is described below in combination with the network structure of the second segmentation network. The first segmentation network has a network structure similar to that of the second segmentation network and segments the image to be recognized in a similar way, so its description is not repeated.
As shown in fig. 2, the second segmentation network includes an encoding network, a first convolution layer, a decoding network, and a second convolution layer. The image to be recognized is down-sampled by the encoding network to obtain a first feature map, and the first feature map is up-sampled by the decoding network to obtain a second feature map; the second feature map is then segmented by the second convolution layer to obtain the second semantic segmentation result of each pixel point. Illustratively, if the convolution kernel of the second convolution layer has dimension 1 × 1, the pixel value of each pixel point in the second feature map is convolved with this kernel, and softmax normalization is applied over the channels of the convolved second feature map to obtain the second semantic segmentation result of each pixel point. It should be understood that more convolution layers may be designed for semantic segmentation; only one convolution layer is used here as an illustration.
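The 1 × 1 convolution plus per-pixel softmax described above can be sketched in numpy — an illustration only (a 1 × 1 convolution over an (H, W, C) feature map is just a per-pixel matrix product; the function and parameter names are assumptions, not from the patent):

```python
import numpy as np

def seg_head_1x1(feature_map, kernel):
    """Apply a 1x1 convolution (a per-pixel linear map, implemented as a
    matrix product) to an (H, W, C) feature map, then softmax-normalize
    over the N output class channels.  kernel has shape (C, N)."""
    logits = feature_map @ kernel                         # (H, W, N)
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)
```

Each pixel's output is a valid probability distribution over the N classes, which is exactly what the class-proportion and entropy computations above consume.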
In addition, before the first feature map is up-sampled by the decoding network, bilinear interpolation is performed on the first feature map to restore its scale, yielding a third feature map whose dimensions are the same as those of the second feature map. Semantic segmentation is then performed on the third feature map by the first convolution layer to obtain a third semantic segmentation result for each pixel point; this is similar to the semantic segmentation of the second feature map by the second convolution layer and is not described again. Then, a second KL divergence is determined between the second and third semantic segmentation results of each pixel point, and the second KL divergences of all pixel points in the image to be recognized are averaged to obtain a third KL divergence. Illustratively, the third KL divergence may be expressed by formula (4):

$\mathrm{KL}_3 = \frac{1}{|\Omega_s|} \sum_{i \in \Omega_s} \mathrm{KL}\big(q_i \,\|\, r_i\big)$   (4)

wherein $\mathrm{KL}_3$ is the third KL divergence, $q_i$ is the second semantic segmentation result of the i-th pixel point, and $r_i$ is its third semantic segmentation result.
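The third KL divergence of formula (4) can be sketched as follows — a hedged numpy illustration under the same assumed (H, W, N) layout; the direction KL(second ‖ third) follows the text's ordering, which is an interpretation rather than something the patent states explicitly:

```python
import numpy as np

def third_kl_divergence(second_probs, third_probs, eps=1e-12):
    """KL3: per-pixel KL(q_i || r_i) between the decoder-branch ("second")
    and encoder-branch ("third") class distributions, averaged over all
    pixels, following formula (4).  Both inputs have shape (H, W, N)."""
    q = np.clip(second_probs, eps, 1.0)
    r = np.clip(third_probs, eps, 1.0)
    per_pixel = (q * (np.log(q) - np.log(r))).sum(axis=-1)
    return float(per_pixel.mean())
```

The divergence vanishes when the encoder-branch predictions match the decoder-branch predictions, which is the condition that later allows the decoding network to be deleted.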
Further, after the third KL divergence is determined, the target loss may be determined from the first KL divergence, the third KL divergence, the sum of the information entropies of all pixel points, and the preset parameter, and domain adaptation is then performed on the second segmentation network based on this target loss. After the domain adaptation of the second segmentation network is completed, the decoding network and the second convolution layer are deleted to obtain a third segmentation network, and images are semantically segmented using the second segmentation network with the decoding network and the second convolution layer removed (i.e., the third segmentation network).
It can be seen that, when domain adaptation is performed on the second segmentation network, the loss between the encoding network and the decoding network (the third KL divergence) is determined; that is, the encoding network and the decoding network undergo adversarial training, so that the encoding network acquires the function of the decoding network. The decoding network can then be deleted without reducing the semantic segmentation precision, which reduces the model size of the second segmentation network, facilitates its migration, and improves the efficiency of semantic segmentation with it.
It should be understood that the first segmentation network may serve as a supervision network for the second segmentation network. Therefore, to ensure the precision of semantic segmentation for each pixel point, the decoding network in the first segmentation network and the convolution layer connected to it are not deleted after training of the first segmentation network is completed.
In some possible embodiments, the domain adaptation method of the present application may be applied to the medical field. That is, the first segmentation network and the second segmentation network are networks for lesion segmentation, and the probability that each pixel point belongs to k categories is then the probability that each pixel point belongs to k lesion types. Because the labeling cost of medical images is relatively high, the first segmentation network may be trained using labeled image data of an existing source domain (for example, tumor-related medical images in a labeled source database), and the second segmentation network may then be adapted based on the trained first segmentation network and unlabeled images of the target domain, so that the second segmentation network attains the segmentation performance of the first segmentation network. In this way, the image knowledge of the source domain is transferred to the target domain and the segmentation precision of the second segmentation network is improved, thereby improving the precision of lesion segmentation, providing a data reference for diagnosis by doctors, and promoting the development of medical science and technology.
In some possible embodiments, the domain adaptation method of the present application may also be applied to the field of blockchain: for example, the images of the source domain and/or the target domain may be stored in a blockchain, thereby ensuring security when these images are accessed.
Referring to fig. 3, fig. 3 is a schematic flowchart of a process for training a first segmentation network according to an embodiment of the present disclosure. The method comprises the following steps:
301: training images are acquired from the source domain.
302: and inputting the training image into a first segmentation network, predicting a fourth semantic segmentation result of each pixel point in the training image, wherein the fourth semantic segmentation result of each pixel point is used for expressing the probability that the pixel point belongs to k categories.
303: and determining a fourth KL divergence according to a fourth semantic segmentation result of each pixel point and the label of each pixel point, wherein the label of each pixel point is used for representing the real probability that the pixel point belongs to k categories.
304: and adjusting the network parameters of the first segmentation network according to the fourth KL divergence.
Illustratively, the fourth KL divergence is taken as the loss of the first segmentation network, and the network parameters of the first segmentation network are adjusted according to this loss until the first segmentation network converges, thereby completing the training of the first segmentation network.
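A toy sketch of steps 301 to 304: a single linear map (playing the role of a 1×1 convolution) classifies per-pixel features, and its parameters are adjusted so that the KL divergence between the softmax output and one-hot labels, which for one-hot labels reduces to cross-entropy, decreases. The feature dimension, class count, pixel count, and learning rate below are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl_loss(p, y, eps=1e-8):
    # KL(y || p); for one-hot labels y this reduces to cross-entropy.
    return float(np.sum(y * (np.log(y + eps) - np.log(p + eps)), axis=-1).mean())

# Toy "first segmentation network": a linear map from d-dimensional
# per-pixel features to k class logits.
d, k, n_pix = 4, 3, 256
W = rng.normal(scale=0.1, size=(d, k))
X = rng.normal(size=(n_pix, d))            # 301: pixel features of a training image
Y = np.eye(k)[rng.integers(0, k, n_pix)]   # one-hot labels of each pixel point

initial = kl_loss(softmax(X @ W), Y)       # 302-303: predict, measure fourth KL
for _ in range(200):                       # 304: adjust the network parameters
    P = softmax(X @ W)
    W -= 0.5 * X.T @ (P - Y) / n_pix       # gradient of the KL/cross-entropy loss
final = kl_loss(softmax(X @ W), Y)
```

Because the loss is convex in W for this linear model, gradient descent steadily drives `final` below `initial`, mirroring the adjustment of network parameters until convergence.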
Referring to fig. 4, fig. 4 is a block diagram illustrating functional units of a domain adaptive device according to an embodiment of the present disclosure. The domain adaptation device 400 includes: an acquisition unit 401 and a processing unit 402, wherein:
an acquiring unit 401, configured to acquire an image to be recognized from a target domain;
a processing unit 402, configured to input the image to be identified into a first segmentation network to obtain a first class ratio, where the first segmentation network is obtained by using an image of a source domain for training;
inputting the image to be recognized into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by information entropy of each pixel point in the image to be recognized;
performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
In some possible embodiments, in terms of inputting the image to be recognized to the first segmentation network to obtain the first class ratio, the processing unit 402 is specifically configured to:
inputting the image to be recognized into the first segmentation network, and performing semantic segmentation on each pixel point in the image to be recognized to obtain a first semantic segmentation result of each pixel point, wherein the first semantic segmentation result of each pixel point represents the probability that each pixel point belongs to k categories, the value of k is an integer from 1 to N, and N is an integer greater than 1;
averaging the first semantic segmentation result of each pixel point to obtain a first semantic segmentation result of the image to be recognized;
and obtaining the first category proportion according to the first semantic segmentation result of the image to be recognized.
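A minimal sketch of the class-proportion computation described above (the array shape is an assumption): averaging the per-pixel probability vectors over the whole image yields a k-dimensional vector whose entries estimate the fraction of the image occupied by each category.

```python
import numpy as np

def class_ratio(probs):
    """probs: (H, W, k) per-pixel class probabilities (softmax output
    of a segmentation network). Returns the k-dim class proportion:
    the probability vectors averaged over all pixel points; since each
    per-pixel vector sums to 1, the result also sums to 1."""
    return probs.mean(axis=(0, 1))
```

The same averaging produces both the first class proportion (from the first segmentation network) and the second class proportion (from the second segmentation network).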
In some possible embodiments, in terms of inputting the image to be recognized to the second segmentation network to obtain the second class ratio and the entropy map, the processing unit 402 is specifically configured to:
inputting the image to be recognized into a second segmentation network, and performing semantic segmentation on each pixel point in the image to be recognized to obtain a second semantic segmentation result of each pixel point, wherein the second semantic segmentation result of each pixel point is used for expressing the probability that each pixel point belongs to k categories, the value of k is an integer from 1 to N, and N is an integer greater than 1;
averaging the second semantic segmentation result of each pixel point to obtain a second semantic segmentation result of the image to be recognized;
and determining the information entropy of each pixel point according to the second semantic segmentation result of each pixel point and an information entropy calculation formula, and forming the information entropy of each pixel point into the entropy diagram.
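The information entropy calculation formula referred to above is the standard Shannon entropy H = -Σ_k p_k·log(p_k) applied to each pixel point's probability vector; a minimal sketch (shapes and the eps constant are assumptions):

```python
import numpy as np

def entropy_map(probs, eps=1e-8):
    """probs: (H, W, k) second semantic segmentation results of each
    pixel point. Returns an (H, W) matrix of per-pixel information
    entropies (the entropy map): H = -sum_k p_k * log(p_k)."""
    return -np.sum(probs * np.log(probs + eps), axis=-1)
```

A confidently classified pixel (near one-hot probabilities) has entropy near 0, while a maximally uncertain pixel has entropy near log k, so the entropy map highlights where the network is unsure.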
In some possible embodiments, in terms of performing domain adaptation on the second segmentation network according to the first class ratio, the second class ratio, and the entropy map, the processing unit 402 is specifically configured to:
determining a first KL divergence between the first class ratio and the second class ratio;
determining the sum of information entropies of all pixel points in the entropy diagram;
determining target loss according to the first KL divergence, the sum of the information entropies of all the pixel points and preset parameters;
and adjusting the network parameters of the second segmentation network according to the target loss so as to carry out field adaptation on the second segmentation network.
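The text states only that the target loss is determined from the first KL divergence, the sum of the information entropies, and a preset parameter; one plausible form, a weighted sum with the preset parameter as the weight, is sketched below. The combination rule, the function names, and the default weight are assumptions:

```python
import numpy as np

def first_kl_divergence(r1, r2, eps=1e-8):
    # KL divergence between the first and second class-ratio
    # distributions (both k-dimensional, entries summing to 1).
    return float(np.sum(r1 * (np.log(r1 + eps) - np.log(r2 + eps))))

def target_loss(r1, r2, ent_map, lam=0.01):
    # Hypothetical combination: first KL divergence plus the sum of the
    # per-pixel information entropies, weighted by the preset parameter lam.
    return first_kl_divergence(r1, r2) + lam * float(ent_map.sum())
```

Under this form, minimizing the target loss simultaneously pulls the target-domain class proportion toward that of the source-trained network and pushes each pixel toward a confident, low-entropy prediction.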
In some possible embodiments, the second segmentation network further comprises a first convolution layer, the first convolution layer being connected to the coding network; before performing upsampling processing on the first feature map through the decoding network to obtain a second feature map, the processing unit 402 is further configured to:
carrying out bilinear interpolation on the first feature map to obtain a third feature map, wherein the dimension of the third feature map is the same as that of the second feature map;
performing semantic segmentation on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point;
determining a second KL divergence between the second semantic segmentation result of each pixel point and the third semantic segmentation result of each pixel point, and obtaining an average value of the second KL divergences of the pixel points in the image to be recognized to obtain a third KL divergence;
in terms of determining a target loss according to the first KL divergence, the sum of the information entropies of the respective pixel points, and a preset parameter, the processing unit 402 is specifically configured to:
and determining target loss according to the first KL divergence, the third KL divergence, the sum of the information entropies of all the pixel points and preset parameters.
In some possible embodiments, the processing unit 402 is further configured to delete the decoding network and the second convolutional layer after completing the domain adaptation for the second split network, so as to obtain a third split network; and performing semantic segmentation on the image by using the third segmentation network.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 5, the electronic device 500 includes a transceiver 501, a processor 502, and a memory 503, which are connected to each other by a bus 504. The memory 503 is used to store computer programs and data, and can transmit the stored data to the processor 502.
The processor 502 is configured to read the computer program in the memory 503 to perform the following operations:
acquiring an image to be identified from a target domain;
inputting the image to be recognized into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training images of a source domain;
inputting the image to be recognized into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by information entropy of each pixel point in the image to be recognized;
performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
In some possible embodiments, in inputting the image to be recognized to the first segmentation network to obtain the first class ratio, the processor 502 is configured to perform the following steps:
inputting the image to be recognized into the first segmentation network, and performing semantic segmentation on each pixel point in the image to be recognized to obtain a first semantic segmentation result of each pixel point, wherein the first semantic segmentation result of each pixel point represents the probability that each pixel point belongs to k categories, the value of k is an integer from 1 to N, and N is an integer greater than 1;
averaging the first semantic segmentation result of each pixel point to obtain a first semantic segmentation result of the image to be recognized;
and obtaining the first category proportion according to the first semantic segmentation result of the image to be recognized.
In some possible embodiments, in inputting the image to be recognized into the second segmentation network, obtaining a second class ratio and an entropy map, the processor 502 is configured to perform the following steps:
inputting the image to be recognized into a second segmentation network, and performing semantic segmentation on each pixel point in the image to be recognized to obtain a second semantic segmentation result of each pixel point, wherein the second semantic segmentation result of each pixel point is used for expressing the probability that each pixel point belongs to k categories, the value of k is an integer from 1 to N, and N is an integer greater than 1;
averaging the second semantic segmentation result of each pixel point to obtain a second semantic segmentation result of the image to be recognized;
and determining the information entropy of each pixel point according to the second semantic segmentation result of each pixel point and an information entropy calculation formula, and forming the information entropy of each pixel point into the entropy diagram.
In some possible embodiments, the processor 502 is configured to perform the following steps in terms of performing a domain adaptation on the second segmentation network according to the first class ratio, the second class ratio and the entropy map:
determining a first KL divergence between the first class ratio and the second class ratio;
determining the sum of information entropies of all pixel points in the entropy diagram;
determining target loss according to the first KL divergence, the sum of the information entropies of all the pixel points and preset parameters;
and adjusting the network parameters of the second segmentation network according to the target loss so as to carry out field adaptation on the second segmentation network.
In some possible embodiments, the second segmentation network further comprises a first convolution layer, the first convolution layer being connected to the coding network; before the first feature map is upsampled by the decoding network to obtain a second feature map, the processor 502 is further configured to perform the following steps:
carrying out bilinear interpolation on the first feature map to obtain a third feature map, wherein the dimension of the third feature map is the same as that of the second feature map;
performing semantic segmentation on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point;
determining a second KL divergence between the second semantic segmentation result of each pixel point and the third semantic segmentation result of each pixel point, and obtaining an average value of the second KL divergences of the pixel points in the image to be recognized to obtain a third KL divergence;
in terms of determining the target loss according to the first KL divergence, the sum of the information entropies of the respective pixel points, and the preset parameter, the processor 502 is configured to execute the following steps:
and determining target loss according to the first KL divergence, the third KL divergence, the sum of the information entropies of all the pixel points and preset parameters.
In some possible embodiments, the processor 502 is further configured to perform the following steps:
after completing the domain adaptation to the second split network, deleting the decoding network and the second convolution layer to obtain a third split network; and performing semantic segmentation on the image by using the third segmentation network.
Specifically, the transceiver 501 may correspond to the acquiring unit 401 of the domain adaptation device 400 in the embodiment shown in fig. 4, and the processor 502 may correspond to the processing unit 402 of the domain adaptation device 400 in the embodiment shown in fig. 4.
It should be understood that the domain adaptation device in the present application may include a smart phone (e.g., an Android phone, an iOS phone, a Windows phone, etc.), a tablet computer, a palmtop computer, a notebook computer, a Mobile Internet Device (MID), a wearable device, and the like. The devices listed above are merely examples and are not exhaustive. In practical applications, the domain adaptation device may further include an intelligent vehicle-mounted terminal, computer equipment, and the like.
Embodiments of the present application also provide a computer-readable storage medium, which stores a computer program, where the computer program is executed by a processor to implement part or all of the steps of any one of the domain adaptation methods described in the above method embodiments.
In one embodiment of the present application, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
A blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, each data block containing information on a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the domain adaptation methods as set out in the above method embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of the units is only a division of logical functions, and other divisions are possible in actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software program module.
If the integrated units are implemented in the form of software program modules and sold or used as stand-alone products, they may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable memory, which may include: flash Memory disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The embodiments of the present application have been described in detail above, and specific examples have been used herein to illustrate the principles and implementations of the present application; the description of the above embodiments is only intended to help understand the method and core concept of the present application. Meanwhile, a person skilled in the art may, according to the idea of the present application, make changes to the specific implementation and the application scope. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A domain adaptation method, comprising:
acquiring an image to be identified from a target domain;
inputting the image to be recognized into a first segmentation network to obtain a first class proportion, wherein the first segmentation network is obtained by training images of a source domain;
inputting the image to be recognized into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by information entropy of each pixel point in the image to be recognized;
performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
2. The method of claim 1, wherein inputting the image to be recognized into a first segmentation network to obtain a first class ratio comprises:
inputting the image to be recognized into the first segmentation network, and performing semantic segmentation on each pixel point in the image to be recognized to obtain a first semantic segmentation result of each pixel point, wherein the first semantic segmentation result of each pixel point represents the probability that each pixel point belongs to k categories, the value of k is an integer from 1 to N, and N is an integer greater than 1;
averaging the first semantic segmentation result of each pixel point to obtain a first semantic segmentation result of the image to be recognized;
and obtaining the first category proportion according to the first semantic segmentation result of the image to be recognized.
3. The method according to claim 1 or 2, wherein the inputting the image to be recognized into a second segmentation network to obtain a second class proportion and an entropy map comprises:
inputting the image to be recognized into a second segmentation network, and performing semantic segmentation on each pixel point in the image to be recognized to obtain a second semantic segmentation result of each pixel point, wherein the second semantic segmentation result of each pixel point is used for expressing the probability that each pixel point belongs to k categories, the value of k is an integer from 1 to N, and N is an integer greater than 1;
averaging the second semantic segmentation result of each pixel point to obtain a second semantic segmentation result of the image to be recognized;
and determining the information entropy of each pixel point according to the second semantic segmentation result of each pixel point and an information entropy calculation formula, and forming the information entropy of each pixel point into the entropy diagram.
4. The method of claim 3, wherein said performing domain adaptation on said second partitioned network based on said first class ratio, said second class ratio, and said entropy map comprises:
determining a first KL divergence between the first class ratio and the second class ratio;
determining the sum of information entropies of all pixel points in the entropy diagram;
determining target loss according to the first KL divergence, the sum of the information entropies of all the pixel points and preset parameters;
and adjusting the network parameters of the second segmentation network according to the target loss so as to carry out field adaptation on the second segmentation network.
5. The method according to claim 4, wherein the second segmentation network includes an encoding network, a decoding network, and a second convolution layer, wherein the second convolution layer is connected to the decoding network, the inputting the image to be recognized into the second segmentation network, segmenting each pixel point in the image to be recognized, and obtaining a second semantic segmentation result of each pixel point, includes:
carrying out downsampling processing on the image to be identified through the coding network to obtain a first feature map;
performing upsampling processing on the first feature map through the decoding network to obtain a second feature map;
and performing semantic segmentation on the second feature map through the second convolution layer to obtain a second semantic segmentation result of each pixel point.
6. The method of claim 5, wherein the second split network further comprises a first convolutional layer, the first convolutional layer being connected to the coding network; before the first feature map is up-sampled by the decoding network to obtain a second feature map, the method further includes:
carrying out bilinear interpolation on the first feature map to obtain a third feature map, wherein the dimension of the third feature map is the same as that of the second feature map;
performing semantic segmentation on the third feature map through the first convolution layer to obtain a third semantic segmentation result of each pixel point;
determining a second KL divergence between the second semantic segmentation result of each pixel point and the third semantic segmentation result of each pixel point, and obtaining an average value of the second KL divergences of the pixel points in the image to be recognized to obtain a third KL divergence;
determining the target loss according to the first KL divergence, the sum of the information entropies of the pixel points and preset parameters, including:
and determining target loss according to the first KL divergence, the third KL divergence, the sum of the information entropies of all the pixel points and preset parameters.
7. The method of claim 5 or 6, further comprising:
after completing the domain adaptation to the second split network, deleting the decoding network and the second convolution layer to obtain a third split network;
and performing semantic segmentation on the image by using the third segmentation network.
8. A domain adaptation device, comprising:
the acquisition unit is used for acquiring an image to be identified from a target domain;
the processing unit is used for inputting the image to be recognized into a first segmentation network to obtain a first class proportion, and the first segmentation network is obtained by training images of a source domain;
inputting the image to be recognized into a second segmentation network to obtain a second class proportion and an entropy diagram, wherein the entropy diagram is a matrix formed by information entropy of each pixel point in the image to be recognized;
performing domain adaptation on the second segmentation network according to the first class proportion, the second class proportion and the entropy diagram.
9. An electronic device, comprising: a processor coupled to the memory, and a memory for storing a computer program, the processor being configured to execute the computer program stored in the memory to cause the electronic device to perform the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method according to any one of claims 1-7.
CN202011543313.3A 2020-12-23 2020-12-23 Domain adaptation method, domain adaptation device, electronic equipment and storage medium Pending CN112633285A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011543313.3A CN112633285A (en) 2020-12-23 2020-12-23 Domain adaptation method, domain adaptation device, electronic equipment and storage medium
PCT/CN2021/082603 WO2022134338A1 (en) 2020-12-23 2021-03-24 Domain adaptation method and apparatus, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011543313.3A CN112633285A (en) 2020-12-23 2020-12-23 Domain adaptation method, domain adaptation device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112633285A true CN112633285A (en) 2021-04-09

Family

ID=75322072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011543313.3A Pending CN112633285A (en) 2020-12-23 2020-12-23 Domain adaptation method, domain adaptation device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112633285A (en)
WO (1) WO2022134338A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114024726A (en) * 2021-10-26 2022-02-08 清华大学 Method and system for detecting network flow online

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190130220A1 (en) * 2017-10-27 2019-05-02 GM Global Technology Operations LLC Domain adaptation via class-balanced self-training with spatial priors
CN110135510A (en) * 2019-05-22 2019-08-16 电子科技大学中山学院 Dynamic domain self-adaptive method, equipment and computer readable storage medium
CN110750665A (en) * 2019-10-12 2020-02-04 南京邮电大学 Open set domain adaptation method and system based on entropy minimization
CN111199550A (en) * 2020-04-09 2020-05-26 腾讯科技(深圳)有限公司 Training method, segmentation method, device and storage medium of image segmentation network

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018126213A1 (en) * 2016-12-30 2018-07-05 Google Llc Multi-task learning using knowledge distillation
CN111062951B (en) * 2019-12-11 2022-03-25 华中科技大学 Knowledge distillation method based on semantic segmentation intra-class feature difference
CN111401406B (en) * 2020-02-21 2023-07-18 华为技术有限公司 Neural network training method, video frame processing method and related equipment
CN111489365B (en) * 2020-04-10 2023-12-22 上海商汤临港智能科技有限公司 Training method of neural network, image processing method and device
CN112200889A (en) * 2020-10-30 2021-01-08 上海商汤智能科技有限公司 Sample image generation method, sample image processing method, intelligent driving control method and device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190130220A1 (en) * 2017-10-27 2019-05-02 GM Global Technology Operations LLC Domain adaptation via class-balanced self-training with spatial priors
CN110135510A (en) * 2019-05-22 2019-08-16 电子科技大学中山学院 Dynamic domain self-adaptive method, equipment and computer readable storage medium
CN110750665A (en) * 2019-10-12 2020-02-04 南京邮电大学 Open set domain adaptation method and system based on entropy minimization
CN111199550A (en) * 2020-04-09 2020-05-26 腾讯科技(深圳)有限公司 Training method, segmentation method, device and storage medium of image segmentation network

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114024726A (en) * 2021-10-26 2022-02-08 清华大学 Online network traffic detection method and system

Also Published As

Publication number Publication date
WO2022134338A1 (en) 2022-06-30

Similar Documents

Publication Publication Date Title
WO2022105125A1 (en) Image segmentation method and apparatus, computer device, and storage medium
EP3869385B1 (en) Method for extracting structural data from image, apparatus and device
CN114066902A (en) Medical image segmentation method, system and device based on convolution and transformer fusion
CN111275107A (en) Multi-label scene image classification method and device based on transfer learning
CN110929806B (en) Picture processing method and device based on artificial intelligence and electronic equipment
CN110659667A (en) Picture classification model training method and system and computer equipment
CN112163637B (en) Image classification model training method and device based on unbalanced data
CN112231416B (en) Knowledge graph body updating method and device, computer equipment and storage medium
CN113221983B (en) Training method and device for transfer learning model, image processing method and device
CN112995414B (en) Behavior quality inspection method, device, equipment and storage medium based on voice call
CN117095019B (en) Image segmentation method and related device
CN112287069A (en) Information retrieval method and device based on voice semantics and computer equipment
CN112150470B (en) Image segmentation method, device, medium and electronic equipment
CN116978011B (en) Image semantic communication method and system for intelligent target recognition
CN114780701A (en) Automatic question-answer matching method, device, computer equipment and storage medium
CN112966687B (en) Image segmentation model training method and device and communication equipment
CN112633285A (en) Domain adaptation method, domain adaptation device, electronic equipment and storage medium
CN111582284B (en) Privacy protection method and device for image recognition and electronic equipment
CN110490876B (en) Image segmentation method based on lightweight neural network
KR20210038027A (en) Method for Training to Compress Neural Network and Method for Using Compressed Neural Network
CN114241411B (en) Counting model processing method and device based on target detection and computer equipment
CN113139581B (en) Image classification method and system based on multi-image fusion
CN113780148A (en) Traffic sign image recognition model training method and traffic sign image recognition method
CN111552827A (en) Labeling method and device, and behavior willingness prediction model training method and device
CN117540306B (en) Label classification method, device, equipment and medium for multimedia data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination