CN116187206B - COD spectrum data migration method based on generation countermeasure network - Google Patents


Info

Publication number
CN116187206B
Authority
CN
China
Prior art keywords
data
spectrum data
network
cod
generator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310450642.0A
Other languages
Chinese (zh)
Other versions
CN116187206A (en)
Inventor
张颖颖
侯士伟
袁达
吴丙伟
冯现东
曹璐
程岩
王茜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Oceanographic Instrumentation Shandong Academy of Sciences
Original Assignee
Institute of Oceanographic Instrumentation Shandong Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Oceanographic Instrumentation Shandong Academy of Sciences filed Critical Institute of Oceanographic Instrumentation Shandong Academy of Sciences
Priority to CN202310450642.0A priority Critical patent/CN116187206B/en
Publication of CN116187206A publication Critical patent/CN116187206A/en
Application granted granted Critical
Publication of CN116187206B publication Critical patent/CN116187206B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02 Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)

Abstract

The invention discloses a COD spectral data migration method based on a generative adversarial network (GAN), in the technical field of seawater detection, comprising the following steps. Data acquisition: seawater-sample COD spectral data are collected in the field as source domain data, and target domain data are determined. Network definition: the network structures of the generator G and the discriminator D of a WGAN-GP network are defined, with the generator G built on LSTM and the discriminator D built on fully connected layers. Training: the discriminator D and the generator G are trained on the source domain data and the target domain data. Migration: the source domain data are input into the trained WGAN-GP network to obtain simulated spectral data, and the similarity of the simulated spectral data and the target domain spectral data is compared. The invention can effectively address the differing spectral characteristics and insufficient samples caused by region-dependent COD composition, and improves the accuracy and reliability of seawater COD spectral data identification.

Description

COD spectral data migration method based on a generative adversarial network
Technical Field
The invention relates to the technical field of seawater detection, and in particular to a COD spectral data migration method based on a generative adversarial network (GAN).
Background
Chemical oxygen demand (COD) is a comprehensive indicator of organic pollution in the marine environment, and the sensitive band of its spectrum depends on the type and concentration of organic matter in the water. Because COD composition differs between regions, spectral characteristics differ from region to region, and the number of samples available from any single region is often insufficient. Both problems make it difficult to process the spectral data with deep learning.
Disclosure of Invention
To overcome the problems in the prior art, the invention provides a COD spectral data migration method based on a generative adversarial network.
The technical solution adopted by the invention is a COD spectral data migration method based on a generative adversarial network, comprising the following steps:
Step 1, data acquisition: collect seawater-sample COD spectral data in the field as source domain data, and determine the target domain data;
Step 2, define the network structures of the generator G and the discriminator D in the WGAN-GP network: the generator G uses LSTM as its network structure, and the discriminator D uses fully connected layers as its network structure;
Step 3, train the WGAN-GP network: train the discriminator D and the generator G using the source domain data and the target domain data obtained in step 1;
Step 4, input the source domain data from step 1 into the WGAN-GP network trained in step 3 to obtain simulated spectral data, compare the similarity of the simulated spectral data and the target domain spectral data by comparing the overlap of the distributions of the two data sets, and thereby verify the effectiveness of the model.
In the above method, in step 2 the generator G comprises two LSTM layers and one fully connected network, and the discriminator D comprises three fully connected layers.
In the above method, step 3 specifically comprises:
Step 3.1, select m samples from the target domain data, input them into the discriminator D, and label them as real;
Step 3.2, select m samples from the source domain data and input them into the generator G to obtain simulated spectral data;
Step 3.3, input the simulated spectral data obtained in step 3.2 into the discriminator D, which calculates the Wasserstein distance between the data from step 3.1 and the simulated spectral data;
Step 3.4, determine whether the Wasserstein distance obtained in step 3.3 meets the set threshold. If it does, output the simulated spectral data; if it does not, repeat steps 3.1 to 3.3 until it does, and then output the simulated spectral data.
The Wasserstein distance in step 3.3 is calculated as:
$$L = \mathbb{E}_{t \sim P_t}[D(G(z))] - \mathbb{E}_{x \sim P_r}[D(x)] + \lambda\, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right],$$
where $P_r$ is the probability distribution of the target data and $P_t$ is the probability distribution of the source data; $P_{\hat{x}}$ is the distribution obtained by sampling in the space between $P_r$ and $P_t$; $D(x)$ and $D(G(z))$ are the discriminator outputs from which the Wasserstein distance is estimated for the real target domain data and the simulated spectral data, respectively, with $z$ denoting the generator input (a source domain sample $t$); $\mathbb{E}_{x \sim P_r}[D(x)]$ and $\mathbb{E}_{t \sim P_t}[D(G(z))]$ are the corresponding expected values; $\lambda$ is a hyperparameter; and $\mathbb{E}_{\hat{x} \sim P_{\hat{x}}}[(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1)^2]$ is the gradient penalty term.
In the above method, the gradient penalty term is computed as follows:
For each target domain sample $x$ and simulated spectral data sample $t$, an interpolated sample $\hat{x}$ between them is calculated:
$$\hat{x} = \epsilon x + (1 - \epsilon)\, t,$$
where $\epsilon$ is a random number in the range 0 to 1;
the gradient norm at the interpolated sample $\hat{x}$ is calculated:
$$\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2,$$
where $\lVert \cdot \rVert_2$ denotes the $L_2$ norm of a vector and $\nabla_{\hat{x}} D(\hat{x})$ denotes the gradient of the discriminator D with respect to $\hat{x}$;
the deviation of this norm from 1 is then weighted by the hyperparameter $\lambda$ to form the gradient penalty term.
In the above method, the threshold in step 3.4 is set in the range 0.001 to 0.1.
The beneficial effects of the invention are as follows. The generative adversarial network migrates COD spectral data from the source domain to the target domain: taking the sample distribution of the real spectral data of the target domain as the basis, the two networks, generator G and discriminator D, learn adversarially to identify and simulate the sample distribution of the target domain COD spectral data, and sampling from that distribution achieves COD spectral migration between regions. In addition, the WGAN-GP generator with the improved architecture produces simulated data highly similar to the real data, which enables migration of seawater COD spectral data between regions. The method can effectively address the differing spectral characteristics and insufficient samples caused by region-dependent COD composition, and improves the accuracy and reliability of seawater COD spectral data identification.
Drawings
The invention will be further described with reference to the drawings and examples.
FIG. 1 is a schematic diagram of the WGAN-GP network structure in an embodiment of the invention;
FIG. 2 compares the spectra generated for samples in an embodiment of the invention, where FIG. 2(a) shows the source domain spectra of 5 samples, FIG. 2(b) shows the simulated spectra of 5 samples, and FIG. 2(c) shows the target domain spectra of 5 samples.
Detailed Description
The present invention will be described in detail below with reference to the drawings and detailed description to enable those skilled in the art to better understand the technical scheme of the present invention.
This embodiment discloses a COD spectral data migration method based on a generative adversarial network, comprising the following steps:
Step 1, data acquisition: collect seawater-sample COD spectral data in the field as source domain data, and determine the target domain data;
Step 2, define the network structures of the generator G and the discriminator D in the WGAN-GP network: the generator G uses LSTM as its network structure, and the discriminator D uses fully connected layers as its network structure;
In this embodiment, the specific structure of the WGAN-GP network is shown in FIG. 1: the generator G comprises two LSTM layers and one fully connected network, and the discriminator D comprises three fully connected layers.
Step 3, train the WGAN-GP network: train the discriminator D and the generator G using the source domain data and the target domain data obtained in step 1;
Step 3 specifically comprises:
Step 3.1, select m samples from the target domain data, input them into the discriminator D, and label them as real;
Step 3.2, select m samples from the source domain data and input them into the generator G to obtain simulated spectral data;
Step 3.3, input the simulated spectral data obtained in step 3.2 into the discriminator D, which calculates the Wasserstein distance between the data from step 3.1 and the simulated spectral data;
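The Wasserstein distance is calculated as:
$$L = \mathbb{E}_{t \sim P_t}[D(G(z))] - \mathbb{E}_{x \sim P_r}[D(x)] + \lambda\, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right],$$
where $P_r$ is the probability distribution of the target data and $P_t$ is the probability distribution of the source data; $P_{\hat{x}}$ is the distribution obtained by sampling in the space between $P_r$ and $P_t$; $D(x)$ and $D(G(z))$ are the discriminator outputs from which the Wasserstein distance is estimated for the real target domain data and the simulated spectral data, respectively, with $z$ denoting the generator input (a source domain sample $t$); $\mathbb{E}_{x \sim P_r}[D(x)]$ and $\mathbb{E}_{t \sim P_t}[D(G(z))]$ are the corresponding expected values; $\lambda$ is a hyperparameter; and $\mathbb{E}_{\hat{x} \sim P_{\hat{x}}}[(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1)^2]$ is the gradient penalty term.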
The gradient penalty term smooths the gradient of the discriminator at the interpolated samples, which makes the training of the generator more stable. It can also improve the quality and diversity of the samples generated by the WGAN-GP. The gradient penalty term is computed as follows:
For each target domain sample $x$ and simulated spectral data sample $t$, an interpolated sample $\hat{x}$ between them is calculated:
$$\hat{x} = \epsilon x + (1 - \epsilon)\, t,$$
where $\epsilon$ is a random number in the range 0 to 1;
the gradient norm at the interpolated sample $\hat{x}$ is calculated:
$$\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2,$$
where $\lVert \cdot \rVert_2$ denotes the $L_2$ norm of a vector and $\nabla_{\hat{x}} D(\hat{x})$ denotes the gradient of the discriminator D with respect to $\hat{x}$;
the deviation of this norm from 1 is then weighted by the hyperparameter $\lambda$ to form the gradient penalty term.
Step 3.4, determine whether the Wasserstein distance obtained in step 3.3 meets the set threshold. If it does, output the simulated spectral data; if it does not, repeat steps 3.1 to 3.3 until it does, and then output the simulated spectral data.
The threshold is set empirically in the range 0.001 to 0.1; in this embodiment it is set to 0.01.
Step 4, input the source domain data from step 1 into the WGAN-GP network trained in step 3 to obtain simulated spectral data, compare the similarity of the simulated spectral data and the target domain spectral data by comparing the overlap of the distributions of the two data sets, and thereby verify the effectiveness of the model.
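The distribution comparison in step 4 can be illustrated with a histogram overlap coefficient. The following is only a minimal sketch under stated assumptions: the patent does not specify which overlap measure is used, and the helper name `histogram_overlap`, the bin count, and the sample values are all hypothetical.

```python
def histogram_overlap(a, b, bins=10):
    """Overlap coefficient of two samples: sum over bins of min(p_a, p_b).

    Returns 1.0 for identical histograms and 0.0 for disjoint ones.
    (Hypothetical helper; the overlap measure is an assumption.)
    """
    lo = min(min(a), min(b))
    hi = max(max(a), max(b))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def normalized_hist(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
            counts[idx] += 1
        return [c / len(values) for c in counts]

    pa, pb = normalized_hist(a), normalized_hist(b)
    return sum(min(x, y) for x, y in zip(pa, pb))


# Illustrative values only; these are not measured spectra.
simulated = [0.41, 0.45, 0.47, 0.50, 0.52]
target = [0.40, 0.44, 0.48, 0.49, 0.55]
print(f"overlap = {histogram_overlap(simulated, target, bins=5):.2f}")
```

A value near 1 suggests that the simulated and target distributions largely coincide; the bin count controls how strict the comparison is.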
The background of the WGAN-GP network used in this embodiment is as follows. A generative adversarial network (GAN) consists of two parts, a generator and a discriminator. The generator produces simulated data for the target domain, and the discriminator distinguishes real target domain data from simulated data. The goal of the generator is to produce simulated data so similar to the target domain data that the discriminator classifies it as real. The goal of the discriminator is to keep telling the simulated data produced by the generator apart from the real data. Through the mutual competition of these two networks, the GAN converts spectral COD data from the source domain to the target domain.
Example 1
Two COD data sets from different seawater regions are prepared; the target domain data x follows the distribution Pr(x), and the source domain data t follows the distribution Pt(t). The variable t is first sampled from Pt(t) and fed into the generator to produce simulated data x′ = G(t). The existing real data x and the simulated data x′ are then labeled: the real data x is labeled 1 and the simulated data x′ is labeled 0. The labeled real data x and simulated data x′ are fed into the discriminator together, and the discriminator is trained by supervised learning. At the same time, the generator is trained according to the performance of the discriminator, so that the two networks compete with each other. The loss function in GAN training is:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_r}[\log D(x)] + \mathbb{E}_{t \sim P_t}[\log(1 - D(G(t)))] \qquad (1)$$
In formula (1), the discriminator and the generator are denoted D and G, respectively, the target domain data is x, the source domain data is t, and $\mathbb{E}$ denotes an expected value. Because the logarithm is monotonic, the discriminator D tries to drive D(G(t)) toward 0 and D(x) toward 1, so that formula (1) takes its maximum value, while the generator G tries to drive D(G(t)) toward 1, so that formula (1) takes its minimum value. Here D(x) is the probability the discriminator assigns to real target domain data, G(t) is the data produced by the generator, and log is the natural logarithm. The first expected value represents the classification accuracy of the discriminator on real target domain data, and the second represents its classification accuracy on generated data. The goal of the generative adversarial network is therefore to maximize the ability of D to distinguish generated target data from real target data, and to minimize the difference between the distribution of the data generated by G and that of the target data, so that the two networks compete. The loss function of this adversarial scheme is a cross-entropy loss function.
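The behaviour of formula (1) can be checked numerically. The sketch below assumes that formula (1) has the standard GAN form V(D, G) = E[log D(x)] + E[log(1 − D(G(t)))]; the function name `gan_value` and the probability values are illustrative assumptions, not data from the patent.

```python
import math


def gan_value(d_real, d_fake):
    """V(D, G) = E[log D(x)] + E[log(1 - D(G(t)))], estimated from samples.

    d_real: discriminator probabilities D(x) on real target domain samples.
    d_fake: discriminator probabilities D(G(t)) on generated samples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Assumed discriminator outputs, for illustration only.
confident_d = gan_value(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])  # D separates the data well
fooled_d = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])       # D cannot tell them apart
print(confident_d > fooled_d)  # True
```

A discriminator that separates real from generated data yields a larger value than one that outputs 0.5 everywhere, which matches the max over D and min over G roles described for formula (1).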
However, the cross-entropy loss function suffers from problems such as mode collapse and vanishing gradients, which can make GAN training unstable. To solve these problems, the Wasserstein distance (also known as the Earth Mover's distance) is used to measure the distance between the two probability distributions. Optimizing the Wasserstein distance provides a stable gradient to the generator G, which reduces the distance between the generated distribution and the target distribution and so pulls the two distributions closer together.
The Wasserstein distance is calculated as follows. For the probability distributions $P_r(x)$ and $P_t(t)$ of the target data and the source data, let $\mathbb{E}_{x \sim P_r}[f(x)]$ and $\mathbb{E}_{x \sim P_t}[f(x)]$ be the expected values of a function $f$ under each distribution. The Wasserstein distance between them can be expressed as:
$$W(P_r, P_t) = \frac{1}{K} \sup_{\lVert f \rVert_L \le K} \left( \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_t}[f(x)] \right),$$
where $\lVert f \rVert_L \le K$ indicates that there exists a constant $K \ge 0$ such that any two values $x_1$ and $x_2$ in the domain satisfy:
$$\lvert f(x_1) - f(x_2) \rvert \le K \lvert x_1 - x_2 \rvert.$$
When calculating the Wasserstein distance from the distribution $P_r(x)$ to the distribution $P_t(t)$, the Lipschitz constant $K$ of the function $f(x)$ therefore also needs to be considered. In the formula, $f(x)$ is an evaluation function, $\mathbb{E}$ denotes an expected value, and sup denotes the supremum (least upper bound). The formula selects, from all admissible functions $f(x)$, one for which $\mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_t}[f(x)]$ is largest, and then multiplies the result by $1/K$ to obtain the Wasserstein distance. The advantage of this formula is that the Wasserstein distance can be calculated by optimizing the function $f(x)$, without directly computing the distance between the two distributions.
In essence, the Wasserstein distance measures the minimum transport cost from one distribution to another, that is, the minimum cost of moving the mass of one distribution onto the other. Using the Wasserstein distance as the loss function, in place of the binary classification of the cross-entropy loss, improves the training stability of the GAN and the quality of the generated data. Taking the Lipschitz constant $K$ of the evaluation function $f(x)$ into account further improves the efficiency and accuracy of the Wasserstein distance calculation.
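The "minimum transport cost" interpretation can be made concrete in one dimension. For two empirical samples of equal size, the optimal transport plan pairs the sorted values, so the Wasserstein-1 distance is the mean absolute difference of the sorted samples. The sketch below covers only this special case; the function name and the sample values are illustrative assumptions, and the patent itself estimates the distance with the discriminator rather than by sorting.

```python
def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    With equal sample counts, the optimal plan moves the i-th smallest
    value of `a` onto the i-th smallest value of `b`.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample counts")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


source = [0.2, 0.4, 0.6]
shifted = [0.7, 0.3, 0.5]  # the same values shifted by 0.1, in a different order
print(round(wasserstein_1d(source, shifted), 6))  # 0.1
```

Shifting every value by 0.1 costs exactly 0.1 per unit of mass, regardless of the order in which the samples are listed.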
The loss function of the discriminator D of the WGAN is therefore defined as:
$$L_D = \mathbb{E}_{t \sim P_t}[D(G(z))] - \mathbb{E}_{x \sim P_r}[D(x)] \qquad (2)$$
The loss function of the generator G is defined as:
$$L_G = -\mathbb{E}_{t \sim P_t}[D(G(z))] \qquad (3)$$
The Wasserstein distance is an important measure of the difference between generated samples and real samples. Here $D(x)$ and $D(G(z))$ are the discriminator outputs from which the Wasserstein distance is estimated for the real target domain data and the generated target domain data, respectively, and $\mathbb{E}_{x \sim P_r}[D(x)]$ and $\mathbb{E}_{t \sim P_t}[D(G(z))]$ are the corresponding expected values. In WGAN, the quantity $\mathbb{E}_{t \sim P_t}[D(G(z))] - \mathbb{E}_{x \sim P_r}[D(x)]$ is minimized, that is, the expected Wasserstein distance between generated and real samples is minimized, which trains the generator and the discriminator.
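The two WGAN losses reduce to simple averages of discriminator scores. The sketch below assumes the standard WGAN forms: the discriminator loss is E[D(G(z))] − E[D(x)], as stated in the text, and the generator loss is −E[D(G(z))], which is an assumption because the patent gives it only as a formula image. The function names and score values are hypothetical.

```python
def critic_loss(d_fake, d_real):
    """WGAN discriminator loss: mean D(G(z)) minus mean D(x) (minimized)."""
    return sum(d_fake) / len(d_fake) - sum(d_real) / len(d_real)


def generator_loss(d_fake):
    """Assumed WGAN generator loss: negative mean score on generated data."""
    return -sum(d_fake) / len(d_fake)


d_real = [1.0, 3.0]   # assumed discriminator scores on real target domain samples
d_fake = [-2.0, 0.0]  # assumed discriminator scores on generated samples
print(critic_loss(d_fake, d_real))  # -3.0
print(generator_loss(d_fake))       # 1.0
```

Unlike the cross-entropy case, the scores are unbounded real numbers rather than probabilities, so no logarithm appears.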
However, it is not feasible to compute the gradient of the Wasserstein distance directly, because not all possible joint distributions can be enumerated. To solve this problem, an approximation is used: the discriminator function approximates the Wasserstein distance, which makes the distance effectively optimizable. Specifically, WGAN approximates the Wasserstein distance with the discriminator function D(x), that is,
$$W(P_r, P_t) \approx \max_{D} \left( \mathbb{E}_{x \sim P_r}[D(x)] - \mathbb{E}_{t \sim P_t}[D(G(z))] \right).$$
This approximation can be obtained by gradient descent, but because the Lipschitz constant of D(x) can be very large, it is not feasible to compute the gradient directly.
A gradient penalty is therefore used to limit the Lipschitz constant of D(x) and so optimize the Wasserstein distance effectively. Specifically, the WGAN uses a gradient constraint to make the equation satisfy Lipschitz continuity: the gradient of the discriminator is penalized so that its norm does not exceed a preset threshold. In this way the Lipschitz constant of D(x) is kept under control, and the Wasserstein distance can be optimized effectively.
With the gradient constraint, which makes the equation satisfy Lipschitz continuity, the equation becomes:
$$L = \mathbb{E}_{t \sim P_t}[D(G(z))] - \mathbb{E}_{x \sim P_r}[D(x)] + \lambda\, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right],$$
where $P_r$ is the probability distribution of the target data and $P_t$ is the probability distribution of the source data; $P_{\hat{x}}$ is the distribution obtained by sampling in the space between $P_r$ and $P_t$; $D(x)$ and $D(G(z))$ are the discriminator outputs from which the Wasserstein distance is estimated for the real target domain data and the simulated spectral data, respectively, with $z$ denoting the generator input (a source domain sample $t$); $\mathbb{E}_{x \sim P_r}[D(x)]$ and $\mathbb{E}_{t \sim P_t}[D(G(z))]$ are the corresponding expected values; $\lambda$ is a hyperparameter; and $\mathbb{E}_{\hat{x} \sim P_{\hat{x}}}[(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1)^2]$ is the gradient penalty term.
The gradient penalty term smooths the gradient of the discriminator at the interpolated samples, which makes the training of the generator more stable. It can also improve the quality and diversity of the samples generated by the WGAN-GP. The gradient penalty term is computed as follows:
For each target domain sample $x$ and simulated spectral data sample $t$, an interpolated sample $\hat{x}$ between them is calculated:
$$\hat{x} = \epsilon x + (1 - \epsilon)\, t,$$
where $\epsilon$ is a random number in the range 0 to 1;
the gradient norm at the interpolated sample $\hat{x}$ is calculated:
$$\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2,$$
where $\lVert \cdot \rVert_2$ denotes the $L_2$ norm of a vector and $\nabla_{\hat{x}} D(\hat{x})$ denotes the gradient of the discriminator D with respect to $\hat{x}$;
the deviation of this norm from 1 is then weighted by the hyperparameter $\lambda$ to form the gradient penalty term, which is added to the loss function of the discriminator.
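The interpolation and gradient-norm steps can be sketched for a toy discriminator whose gradient is known in closed form. Everything here is an illustrative assumption: the toy critic D(x) = tanh(w · x) stands in for the fully connected discriminator, the squared form (‖∇D‖₂ − 1)² follows the standard WGAN-GP formulation, and λ = 10 is a commonly used default rather than a value given in the patent.

```python
import math
import random


def interpolate(x, t, eps):
    """Interpolated sample x_hat = eps * x + (1 - eps) * t, element-wise."""
    return [eps * xi + (1.0 - eps) * ti for xi, ti in zip(x, t)]


def toy_critic_grad(w, x_hat):
    """Analytic gradient of the toy critic D(x) = tanh(w . x) at x_hat."""
    s = math.tanh(sum(wi * xi for wi, xi in zip(w, x_hat)))
    return [(1.0 - s * s) * wi for wi in w]


def gradient_penalty(w, x, t, lam=10.0, rng=random):
    """lam * (||grad D(x_hat)||_2 - 1)^2 for one (x, t) pair."""
    eps = rng.random()  # uniform random number in [0, 1)
    x_hat = interpolate(x, t, eps)
    grad = toy_critic_grad(w, x_hat)
    norm = math.sqrt(sum(g * g for g in grad))
    return lam * (norm - 1.0) ** 2


# Illustrative two-point "spectra"; real samples in the text have 601 points.
penalty = gradient_penalty(w=[0.6, 0.8], x=[0.1, 0.2], t=[0.3, 0.1],
                           rng=random.Random(0))
print(penalty >= 0.0)  # True
```

In a real implementation the gradient would come from automatic differentiation of the discriminator network, and the penalty would be averaged over the batch before being added to the discriminator loss.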
In this embodiment, the source domain of the COD spectral data consists of seawater samples actually collected from a gulf, and the target domain consists of COD seawater samples prepared from artificial seawater and o-benzene; the data size is 63×601. A Wasserstein GAN with Gradient Penalty (WGAN-GP) network is implemented, with LSTM as the network structure of the generator G and fully connected layers as the network structure of the discriminator D.
First, the hyperparameters are defined: the batch size (number of samples per batch) is 5; the learning rate is 0.0002; the number of training epochs is 1000; the discriminator is updated 5 times before each generator update; the weight clipping threshold is 0.01; the latent variable dimension is 1; the number of spectral features is 601 (this depends on the number of variables per sample; samples with 601 variables are used here as an example); and the LSTM hidden dimension is 128. These hyperparameters are tuned manually according to how the network training behaves.
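The hyperparameters listed above can be collected in a single configuration object. This is only a sketch: the class and field names are hypothetical, and reading the 63×601 data size as 63 samples per domain is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameter values as stated in the embodiment (names are hypothetical)."""
    batch_size: int = 5
    learning_rate: float = 0.0002
    epochs: int = 1000
    critic_updates_per_generator_update: int = 5
    weight_clip: float = 0.01
    latent_dim: int = 1
    n_features: int = 601
    lstm_hidden: int = 128


cfg = TrainConfig()
n_samples = 63  # assumption: the first dimension of the 63 x 601 data is the sample count
batches_per_epoch = -(-n_samples // cfg.batch_size)  # ceiling division
print(batches_per_epoch)  # 13
```

Freezing the dataclass keeps the values from being changed accidentally during a training run.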
Next, the network structures of the generator G and the discriminator D are defined. The generator G comprises two LSTM layers followed by a 601×601 fully connected network; the discriminator D comprises three fully connected layers, of sizes 601×1024, 1024×512, and 512×1. In the generator G, the source domain data t is input into the LSTM, and the output of the last LSTM time step is taken as the generated simulated seawater COD spectral data. In the discriminator D, LeakyReLU is used as the activation function to avoid vanishing gradients.
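The discriminator's three fully connected layers with LeakyReLU can be sketched in plain Python. This is an illustration under assumptions, not the patent's implementation: the helper names, the random initialization, and the LeakyReLU slope of 0.2 are all hypothetical, and small layer sizes are used so that the demo runs quickly. The sizes stated in the text would be `(601, 1024, 512, 1)`.

```python
import random


def leaky_relu(v, slope=0.2):
    """LeakyReLU applied element-wise; the slope value is an assumption."""
    return [x if x > 0 else slope * x for x in v]


def linear(weights, bias, v):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    scale = 1.0 / n_in ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_discriminator(sizes, seed=0):
    """Create one (weights, bias) pair per consecutive pair of layer sizes."""
    rng = random.Random(seed)
    return [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]


def discriminator_forward(layers, spectrum):
    h = spectrum
    for i, (weights, bias) in enumerate(layers):
        h = linear(weights, bias, h)
        if i < len(layers) - 1:  # no activation on the final score
            h = leaky_relu(h)
    return h[0]


# Small demo sizes; the text specifies (601, 1024, 512, 1).
layers = build_discriminator((6, 8, 4, 1))
score = discriminator_forward(layers, [0.1] * 6)
print(type(score).__name__)  # float
```

The output is a single unbounded score per spectrum, which is what the Wasserstein formulation expects in place of a probability.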
Then the loss functions, which compute the gradient penalty term and the Wasserstein distance, and the optimizers are defined.
Finally, model training begins. In each iteration, the discriminator D is trained first and then the generator G. When training the discriminator D, the gradient penalty term is calculated and added to the loss function. When training the generator G, the seawater COD data collected in the source domain, with 601 variable points per sample, is taken as input, and simulated spectral data with 601 variables is output through the LSTM layers and the fully connected network. The target domain data with 601 variables is input to the discriminator D, which outputs its judgment; the generated simulated spectral data is likewise input to the discriminator D, which outputs its judgment. The WGAN-GP method no longer performs classification but instead evaluates the distance between the two distributions. The loss function of the generator G is equation (3).
The quality of the output of the generator G is then judged. If it meets the requirement, the result is output. If it does not, a batch of samples is selected for a single training pass, the discriminator D is trained for 5 iterations, and the generator G is then trained again. This cycle repeats until the output of the generator G meets the requirement. Through this iteration, the generator G and the discriminator D train against each other continuously, which improves the quality of the data generated by G and brings it closer to the target domain data x.
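The alternating schedule described above can be sketched as a control loop. This is a skeleton under stated assumptions: `critic_step`, `generator_step`, and `distance` are hypothetical callables standing in for the real updates and the Wasserstein estimate, and "meets the requirement" is interpreted as the absolute distance falling to the threshold (0.01 in this embodiment) or below.

```python
def train(critic_step, generator_step, distance,
          threshold=0.01, n_critic=5, max_rounds=1000):
    """Check quality first; if insufficient, run n_critic discriminator
    updates followed by one generator update, then check again.

    Returns the number of completed rounds when the threshold is met,
    or None if it is not met within max_rounds.
    """
    for round_idx in range(max_rounds):
        if abs(distance()) <= threshold:
            return round_idx
        for _ in range(n_critic):
            critic_step()
        generator_step()
    return max_rounds if abs(distance()) <= threshold else None


# Toy stand-ins: each generator update halves a fake "distance".
state = {"distance": 1.0, "critic_updates": 0}


def toy_critic_step():
    state["critic_updates"] += 1


def toy_generator_step():
    state["distance"] *= 0.5


rounds = train(toy_critic_step, toy_generator_step, lambda: state["distance"])
print(rounds, state["critic_updates"])  # 7 35
```

In the toy run the distance reaches 0.5⁷ ≈ 0.0078 after seven rounds, so the loop stops after 35 discriminator updates and 7 generator updates.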
FIG. 2 compares the generated spectra of 5 samples, where FIG. 2(a) shows the source domain spectra of 5 samples, FIG. 2(b) shows the simulated spectra of 5 samples, and FIG. 2(c) shows the target domain spectra of 5 samples. The calculated expected value of the real source domain data is 0.457 and its variance is 0.036; the expected value of the simulated spectral data is 0.468 and its variance is 0.041. This indicates that the Gaussian distribution of the simulated spectral data has a center close to that of the source domain data and is slightly more dispersed.
Table 1 lists the d_loss and g_loss values of the neural network at different numbers of runs, together with the sample misclassification rates on the training set and the test set, where d_loss is the value of the discriminator loss function and g_loss is the value of the generator loss function. As Table 1 shows, d_loss and g_loss gradually decrease as the number of runs increases, indicating that the performance of the network keeps improving. The misclassification rates on the training set and the test set also decrease gradually, indicating that the generalization ability of the network gradually strengthens. At higher numbers of runs, however, the misclassification rate increases again, possibly because of overfitting. In practical applications, an appropriate number of runs therefore needs to be selected according to the circumstances.
[Table 1: d_loss, g_loss, and training-set and test-set sample misclassification rates at different numbers of runs; table image not reproduced in this text]
The above embodiments are only exemplary embodiments of the present invention and are not intended to limit the present invention, the scope of which is defined by the claims. Various modifications and equivalent arrangements of this invention will occur to those skilled in the art, and are intended to be within the spirit and scope of the invention.

Claims (4)

1. A COD spectral data migration method based on a generative adversarial network, characterized by comprising the following steps:
step 1, data acquisition: collecting seawater-sample COD spectral data in the field as source domain data, and determining target domain data;
step 2, defining the network structures of a generator G and a discriminator D in the WGAN-GP network: the generator G takes LSTM as its network structure, and the discriminator D takes fully connected layers as its network structure;
step 3, training the WGAN-GP network: training the discriminator D and the generator G respectively through the source domain data and the target domain data obtained in step 1;
step 4, inputting the source domain data in step 1 into the WGAN-GP network trained in step 3 to obtain simulated spectral data, comparing the similarity of the simulated spectral data and the target domain spectral data by comparing the overlapping parts of the distributions of the two groups of data, and verifying the effectiveness of the model;
wherein step 3 specifically comprises:
step 3.1, selecting m samples from the target domain data, inputting the m samples into the discriminator D, and marking the m samples as real;
step 3.2, selecting m samples from the source domain data, and inputting the m samples into the generator G to obtain simulated spectral data;
step 3.3, inputting the simulated spectral data obtained in step 3.2 into the discriminator D, the discriminator D calculating the Wasserstein distance between the data obtained in step 3.1 and the simulated spectral data;
step 3.4, judging whether the Wasserstein distance obtained in step 3.3 meets a set threshold; if so, outputting the simulated spectral data; if not, repeating steps 3.1 to 3.3 until the Wasserstein distance meets the set threshold, and then outputting the simulated spectral data;
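the Wasserstein distance in step 3.3 is calculated as follows:
$$L = \mathbb{E}_{z \sim P_t}\left[D(G(z))\right] - \mathbb{E}_{x \sim P_r}\left[D(x)\right] + \lambda\,\mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right]$$
wherein $P_r$ is the probability distribution of the target data and $P_t$ is the probability distribution of the source data; $\hat{x}$ is a sample drawn from the space between the two distributions $P_r$ and $P_t$; $D(x)$ and $D(G(z))$ represent the Wasserstein distances of the real target domain data and the simulated spectrum data respectively, and $\mathbb{E}_{x \sim P_r}$ and $\mathbb{E}_{z \sim P_t}$ represent the corresponding expected values; $\lambda$ is a hyperparameter, and $\lambda\,\mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right]$ is the gradient penalty term.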
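As a worked illustration of the quantities named in the formula description above, the sketch below assembles a WGAN-GP style loss from per-sample discriminator outputs. The sign convention, the squared penalty, the default weight of 10, and every name and number in the code are assumptions taken from the commonly used WGAN-GP formulation, not details confirmed by the claim text.

```python
from statistics import mean


def critic_loss(d_real, d_fake, grad_norms, lam=10.0):
    """Assemble a WGAN-GP style discriminator loss from per-sample values.

    d_real     : discriminator outputs D(x) for target domain samples
    d_fake     : discriminator outputs D(G(z)) for simulated samples
    grad_norms : L2 norms of the discriminator gradient at interpolated samples
    lam        : penalty weight lambda (10.0 is a common default, assumed here)
    """
    # Empirical estimate of the Wasserstein distance between the two batches.
    wasserstein_estimate = mean(d_real) - mean(d_fake)
    # Penalize gradient norms that deviate from 1 (squared deviation assumed).
    penalty = mean([(g - 1.0) ** 2 for g in grad_norms])
    loss = mean(d_fake) - mean(d_real) + lam * penalty
    return loss, wasserstein_estimate


# Hypothetical discriminator outputs for a batch of m = 2 samples.
loss, estimate = critic_loss(
    d_real=[0.9, 1.1], d_fake=[0.1, -0.1], grad_norms=[1.0, 1.0]
)
print(loss, estimate)
```

With gradient norms exactly equal to 1 the penalty vanishes, so the loss is simply the negative of the distance estimate.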
2. The COD spectrum data migration method based on generation countermeasure network according to claim 1, wherein in step 2 the generator G includes two LSTM layers and one fully connected network, and the discriminator D includes three fully connected layers.
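The layer layout in this claim can be sketched as a forward pass in plain Python. This is an illustrative sketch only: the hidden sizes, the random initialization, the omission of bias terms, the leaky ReLU activation, and the choice to feed the spectrum to the LSTM one wavelength at a time are all assumptions, since the claim fixes only the number and type of layers.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Matrix-vector product (bias omitted to keep the sketch short)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class LSTMLayer:
    def __init__(self, in_dim, hidden, rng):
        self.hidden = hidden
        # One weight matrix per gate, acting on the concatenation [x_t, h_prev].
        self.w = {g: rand_matrix(hidden, in_dim + hidden, rng) for g in "ifog"}

    def forward(self, sequence):
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        outputs = []
        for x_t in sequence:
            xh = list(x_t) + h
            i = [sigmoid(v) for v in linear(self.w["i"], xh)]    # input gate
            f = [sigmoid(v) for v in linear(self.w["f"], xh)]    # forget gate
            o = [sigmoid(v) for v in linear(self.w["o"], xh)]    # output gate
            g = [math.tanh(v) for v in linear(self.w["g"], xh)]  # candidate state
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            outputs.append(h)
        return outputs


class Generator:
    """Two LSTM layers followed by one fully connected layer."""

    def __init__(self, hidden=8, seed=0):
        rng = random.Random(seed)
        self.lstm1 = LSTMLayer(1, hidden, rng)
        self.lstm2 = LSTMLayer(hidden, hidden, rng)
        self.fc = rand_matrix(1, hidden, rng)

    def forward(self, spectrum):
        seq = [[a] for a in spectrum]  # one absorbance value per wavelength step
        hidden_seq = self.lstm2.forward(self.lstm1.forward(seq))
        return [linear(self.fc, h)[0] for h in hidden_seq]


class Discriminator:
    """Three fully connected layers producing one unbounded score."""

    def __init__(self, n_wavelengths, hidden=16, seed=1):
        rng = random.Random(seed)
        self.layers = [
            rand_matrix(hidden, n_wavelengths, rng),
            rand_matrix(hidden, hidden, rng),
            rand_matrix(1, hidden, rng),
        ]

    def forward(self, spectrum):
        x = list(spectrum)
        for weights in self.layers[:-1]:
            x = [max(0.2 * v, v) for v in linear(weights, x)]  # leaky ReLU
        return linear(self.layers[-1], x)[0]


# Hypothetical source domain spectrum with five wavelength points.
source = [0.10, 0.12, 0.15, 0.18, 0.20]
simulated = Generator().forward(source)
score = Discriminator(len(source)).forward(simulated)
print(len(simulated), score)
```

The generator returns one value per input wavelength, so the simulated spectrum has the same length as the source spectrum and can be passed directly to the discriminator.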
3. The COD spectrum data migration method based on generation countermeasure network according to claim 1, wherein the gradient penalty term is specifically obtained as follows:
for each target domain sample $x$ and simulated spectrum data sample $t$, an interpolated sample $\hat{x}$ between them is calculated:
$$\hat{x} = \epsilon x + (1 - \epsilon)\,t$$
wherein $\epsilon$ is a random number in the range of 0 to 1;
the gradient norm of the discriminator on the interpolated sample $\hat{x}$ is calculated:
$$\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2$$
wherein $\lVert \cdot \rVert_2$ denotes the $L_2$ norm of a vector, and $\nabla_{\hat{x}} D(\hat{x})$ denotes the gradient of the discriminator D with respect to $\hat{x}$;
the norm minus 1 is multiplied by the hyperparameter $\lambda$ and taken as the gradient penalty term.
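The interpolation and gradient-norm steps of this claim can be sketched for a single pair of samples. This is a hedged illustration: the function `gradient_penalty`, the finite-difference gradient (a deep learning framework would use automatic differentiation instead), the default weight of 10, and the squaring of the deviation (as in the commonly used WGAN-GP formulation; the claim wording mentions only the norm minus 1) are assumptions.

```python
import math
import random


def gradient_penalty(critic, x, t, lam=10.0, eps=None, h=1e-6):
    """Gradient penalty at one interpolated sample.

    critic : callable mapping a list of floats to a scalar score
    x, t   : target domain sample and simulated sample (equal length)
    eps    : interpolation weight in [0, 1]; drawn at random when None
    h      : step size of the central finite-difference gradient (assumed)
    """
    if eps is None:
        eps = random.random()
    # Interpolated sample between the real and the simulated spectrum.
    x_hat = [eps * xi + (1.0 - eps) * ti for xi, ti in zip(x, t)]

    # Numerical gradient of the critic with respect to each component of x_hat.
    grad = []
    for k in range(len(x_hat)):
        up = x_hat[:k] + [x_hat[k] + h] + x_hat[k + 1:]
        down = x_hat[:k] + [x_hat[k] - h] + x_hat[k + 1:]
        grad.append((critic(up) - critic(down)) / (2.0 * h))

    norm = math.sqrt(sum(g * g for g in grad))  # L2 norm of the gradient
    return lam * (norm - 1.0) ** 2, norm


# Toy linear critics: the gradient of w . v with respect to v is w itself.
def unit_critic(v):
    return 0.6 * v[0] + 0.8 * v[1]  # gradient norm 1, so the penalty is ~0


def steep_critic(v):
    return 3.0 * v[0] + 4.0 * v[1]  # gradient norm 5, so the penalty is large


penalty_unit, norm_unit = gradient_penalty(unit_critic, [0.2, 0.4], [0.3, 0.1], eps=0.5)
penalty_steep, norm_steep = gradient_penalty(steep_critic, [0.2, 0.4], [0.3, 0.1], eps=0.5)
print(norm_unit, penalty_unit)
print(norm_steep, penalty_steep)
```

The two toy critics show the intended effect: a critic whose gradient norm is 1 incurs almost no penalty, while a steeper critic is penalized in proportion to the squared deviation.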
4. The COD spectrum data migration method based on generation countermeasure network according to claim 1, wherein the set threshold in step 3.4 ranges from 0.001 to 0.1.
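The stopping rule of step 3.4 with a threshold in this range can be sketched as a simple loop. This is a sketch under stated assumptions: `step_fn` is a hypothetical callable standing in for one pass of steps 3.1 to 3.3, "meets the threshold" is interpreted as the absolute distance being at most the threshold, and the iteration cap is added for safety and is not part of the claim.

```python
def train_until_threshold(step_fn, threshold=0.01, max_iters=10_000):
    """Repeat a training step until the estimated distance meets the threshold.

    step_fn   : hypothetical callable that runs steps 3.1-3.3 once and
                returns the current Wasserstein distance estimate
    threshold : stopping value; the claim gives the range 0.001 to 0.1
    max_iters : safety cap (assumed; the claim states no iteration limit)
    """
    if not 0.001 <= threshold <= 0.1:
        raise ValueError("threshold outside the claimed range 0.001-0.1")
    for iteration in range(1, max_iters + 1):
        distance = step_fn()
        if abs(distance) <= threshold:
            return iteration, distance
    raise RuntimeError("threshold not reached within max_iters")


def make_fake_step(start=1.0):
    """Stand-in for real training: the distance halves on every call."""
    state = {"d": start}

    def step():
        state["d"] *= 0.5
        return state["d"]

    return step


iterations, final_distance = train_until_threshold(make_fake_step(), threshold=0.01)
print(iterations, final_distance)
```

In a real training run, `step_fn` would update the discriminator and generator and return the latest distance estimate; the stand-in here only demonstrates the control flow.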
CN202310450642.0A 2023-04-25 2023-04-25 COD spectrum data migration method based on generation countermeasure network Active CN116187206B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310450642.0A CN116187206B (en) 2023-04-25 2023-04-25 COD spectrum data migration method based on generation countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310450642.0A CN116187206B (en) 2023-04-25 2023-04-25 COD spectrum data migration method based on generation countermeasure network

Publications (2)

Publication Number Publication Date
CN116187206A CN116187206A (en) 2023-05-30
CN116187206B CN116187206B (en) 2023-07-07

Family

ID=86444629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310450642.0A Active CN116187206B (en) 2023-04-25 2023-04-25 COD spectrum data migration method based on generation countermeasure network

Country Status (1)

Country Link
CN (1) CN116187206B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361566A (en) * 2021-05-17 2021-09-07 长春工业大学 Method for migrating generative confrontation network by using confrontation learning and discriminant learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160252459A1 (en) * 2011-05-16 2016-09-01 Renishaw Plc Spectroscopic apparatus and methods for determining components present in a sample
CN105651742A (en) * 2016-01-11 2016-06-08 北京理工大学 Laser-induced breakdown spectroscopy based explosive real-time remote detection method
CN112651173B (en) * 2020-12-18 2022-04-29 浙江大学 Agricultural product quality nondestructive testing method based on cross-domain spectral information and generalizable system
CN115656074B (en) * 2022-12-28 2023-04-07 山东省科学院海洋仪器仪表研究所 Adaptive selection and estimation method for sea water COD (chemical oxygen demand) spectral variable characteristics
CN115953683A (en) * 2023-01-30 2023-04-11 辽宁师范大学 Method for detecting hyperspectral change through learning of small samples across heterogeneous domains based on bidirectional generation

Also Published As

Publication number Publication date
CN116187206A (en) 2023-05-30

Similar Documents

Publication Publication Date Title
CN109376913A (en) The prediction technique and device of precipitation
CN110346831B (en) Intelligent seismic fluid identification method based on random forest algorithm
CN114595732B (en) Radar radiation source sorting method based on depth clustering
CN110287985B (en) Depth neural network image identification method based on variable topology structure with variation particle swarm optimization
CN112613536B (en) Near infrared spectrum diesel fuel brand recognition method based on SMOTE and deep learning
CN110826618A (en) Personal credit risk assessment method based on random forest
CN105913450A (en) Tire rubber carbon black dispersity evaluation method and system based on neural network image processing
CN112735097A (en) Regional landslide early warning method and system
CN111105035A (en) Neural network pruning method based on combination of sparse learning and genetic algorithm
CN102109495A (en) Method for classifying types of mixed seabed sediment based on multi-beam sonar technology
CN116204794B (en) Method and system for predicting dissolved gas in transformer oil by considering multidimensional data
CN114019370B (en) Motor fault detection method based on gray level image and lightweight CNN-SVM model
CN110263834B (en) Method for detecting abnormal value of new energy power quality
CN112199862B (en) Nanoparticle migration prediction method, influence factor analysis method and system
CN113111786A (en) Underwater target identification method based on small sample training image convolutional network
CN109948726A (en) A kind of Power Quality Disturbance Classification Method based on depth forest
Rawat et al. A comprehensive analysis of the effectiveness of machine learning algorithms for predicting water quality
CN113240201A (en) Method for predicting ship host power based on GMM-DNN hybrid model
CN104966106A (en) Biological age step-by-step predication method based on support vector machine
CN113468796A (en) Voltage missing data identification method based on improved random forest algorithm
Liu et al. Prediction of water inrush through coal floors based on data mining classification technique
CN106971170A (en) A kind of method for carrying out target identification using one-dimensional range profile based on genetic algorithm
CN116881676B (en) Prediction method for water inflow of open pit
CN113378998A (en) Stratum lithology while-drilling identification method based on machine learning
CN116187206B (en) COD spectrum data migration method based on generation countermeasure network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant