CN117744757A - Intelligent model reverse engineering method based on data feature statistical distribution


Info

Publication number
CN117744757A
Authority
CN
China
Prior art keywords
model
output
student
teacher
teacher model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311781128.1A
Other languages
Chinese (zh)
Inventor
徐文渊
陈艳姣
白怡杰
徐艺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202311781128.1A priority Critical patent/CN117744757A/en
Publication of CN117744757A publication Critical patent/CN117744757A/en
Pending legal-status Critical Current

Links

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an intelligent model reverse engineering method based on the statistical distribution of data features, relating to the fields of Artificial Intelligence (AI) and Machine Learning (ML); the method can recover training data from a trained model when the training data are absent and use them for knowledge distillation. The method inverts a trained network: without any additional information about the training data set, the teacher model is kept fixed, a student model is trained through knowledge distillation, and the input is optimized; on the basis of the DeepDream method, regularized depth inversion is improved by using the information stored in the batch normalization layers of the teacher model. In addition, adaptive depth inversion improves the diversity of the images by maximizing the Jensen-Shannon (JS) divergence between the logit outputs of the teacher model and the student model, which strengthens regularized depth inversion and further optimizes the images, thereby achieving better results than DeepDream.

Description

Intelligent model reverse engineering method based on data feature statistical distribution
Technical Field
The present application relates to the fields of Artificial Intelligence (AI) and Machine Learning (ML), and in particular to an intelligent model reverse engineering method based on the statistical distribution of data features.
Background
In the absence of training data, how to recover training data from an already trained model and use it for knowledge distillation is a problem that needs to be solved. The most common and classical approach to this problem is DeepDream. The DeepDream method likewise requires a pre-trained model; the difference is that it optimizes the input picture according to a chosen output class and the pre-trained model. The input starts as a noise image and is optimized with some regularization while the selected output activation value is driven toward the target, leaving the representation of the intermediate features unconstrained.
Many high-performance Convolutional Neural Networks (CNNs) such as ResNet and DenseNet, and their variants, use BN (batch normalization) layers. A BN layer stores the running (sliding) mean and variance of the outputs of many layers, and this stored information can be exploited when optimizing the input picture. Regularization allows DeepDream to make the input picture converge stably, but the method still optimizes the input picture poorly, leaving a large gap from the target result.
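To illustrate the information a BN layer stores, the following toy NumPy sketch (illustrative only, not part of the patent; the momentum value 0.1 mirrors common framework defaults, and the activation distribution is made up) tracks the running statistics of a single channel:

```python
import numpy as np

# Toy sketch: how a BN layer accumulates running statistics during training.
momentum = 0.1
running_mean, running_var = 0.0, 1.0   # typical initial buffer values

rng = np.random.default_rng(2)
for _ in range(500):
    batch = rng.normal(loc=3.0, scale=2.0, size=64)  # one channel's activations
    running_mean = (1 - momentum) * running_mean + momentum * batch.mean()
    running_var = (1 - momentum) * running_var + momentum * batch.var()

# The stored statistics approach the true activation mean (3) and
# variance (4) -- exactly the information depth inversion exploits.
print(round(running_mean, 2), round(running_var, 2))
```

This is why the running_mean and running_variance buffers of a trained teacher can later stand in for statistics of the (absent) training set.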
Disclosure of Invention
The invention provides an intelligent model reverse engineering method based on the statistical distribution of data features, which synthesizes class-conditional images from a CNN (Convolutional Neural Network) trained for image classification by introducing a depth inversion method, and improves overall diversity by exploiting the teacher-student divergence through adaptive depth inversion.
The invention adopts the following technical scheme:
an intelligent model reverse engineering method based on data feature statistical distribution comprises the following steps:
1) Give a noise picture to be optimized as the input, together with a target label;
2) Input the picture from step 1) into a pre-trained teacher model and an untrained student model respectively to obtain the teacher model output and the student model output, and calculate an image regularization term based on these two outputs;
3) Obtain a classification loss term from the teacher model output and the loss function with respect to the target label, and update the noise picture to be optimized using the image regularization term and the classification loss term calculated in step 2);
4) Take the updated noise picture from step 3) as the input of both the teacher model and the student model, and reduce the distance between the student model output and the teacher model output through knowledge distillation;
5) Repeat steps 2)-4) to iteratively train the student model, and judge whether the optimized noise picture matches the target label; if not, return to step 2) and continue iterating; if so, end the iteration to obtain the trained student model and the optimized pictures.
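The steps above can be sketched end-to-end on a toy problem. Here the pre-trained teacher is replaced by a fixed linear-softmax classifier so that the input gradient is available in closed form; the sizes, learning rate, and the simple l2 image regularizer are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1): a noise input to optimize, plus a target label.  The
# "teacher" is a fixed linear softmax classifier standing in for the
# pre-trained CNN (an illustrative assumption).
W_t = rng.normal(size=(3, 8))   # teacher weights: 3 classes, 8-dim input
x = rng.normal(size=8)          # noise input
y = 1                           # target label

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

lr, reg = 0.1, 1e-3
for _ in range(200):
    p = softmax(W_t @ x)        # step 2): teacher output
    # Step 3): for a linear softmax model the gradient of the
    # cross-entropy loss w.r.t. the input is W^T (p - onehot(y));
    # the l2 image regularizer reg*||x||^2 contributes 2*reg*x.
    grad = W_t.T @ (p - np.eye(3)[y]) + 2 * reg * x
    x -= lr * grad              # update the noise input

print(int(softmax(W_t @ x).argmax()))
```

After the loop, the optimized input is classified as the target label by the toy teacher, which is the stopping condition of step 5) in miniature.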
In the above technical solution, further, in step 3), the noise picture to be optimized is updated using the image regularization term and the classification loss term calculated in step 2); specifically, the noise picture is optimized in the manner of DeepDream:
DeepDream optimizes the noise picture by
$$\min_{\hat{x}} \; \mathcal{L}(\hat{x}, y) + \mathcal{R}(\hat{x})$$
where $\mathcal{L}(\hat{x}, y)$ is the classification loss, $\mathcal{R}(\hat{x})$ is the image regularization term, $\hat{x}$ is the given noise picture, and $y$ is the target label.
Further, in step 4), the knowledge distillation process is expressed as
$$\min_{W_S} \sum_{x \in \chi} \mathrm{KL}\big(p_T(x),\, p_S(x)\big)$$
where $\chi$ is the training set, $p_T(x)$ is the teacher model output, $p_S(x)$ is the student model output, $W_S$ are the parameters of the student model, and $\mathrm{KL}(\cdot)$ denotes the relative entropy.
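The distillation objective above can be sketched in NumPy as follows (the logits are made-up illustrative values; a real implementation would use a framework's autograd to minimize the loss over the student parameters):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kl(p, q, eps=1e-12):
    """Relative entropy KL(p, q), summed over the class axis."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

# Hypothetical logits for a batch of 4 inputs and 3 classes.
teacher_logits = np.array([[2.0, 0.1, -1.0],
                           [0.3, 1.7, 0.0],
                           [-0.5, 0.2, 2.2],
                           [1.1, 1.0, 0.9]])
student_logits = np.zeros_like(teacher_logits)   # untrained student

p_T = softmax(teacher_logits)
p_S = softmax(student_logits)

# The objective summed (here averaged) over the training set chi.
distill_loss = kl(p_T, p_S).mean()
print(float(distill_loss) > 0.0)   # positive until the student matches
```

Training the student amounts to driving this quantity toward zero.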
The invention introduces regularized depth inversion: the image regularization term $\mathcal{R}(\hat{x})$ is expanded with a new regularization term $\mathcal{R}_{\mathrm{feature}}(\hat{x})$, improving DeepDream and thus the quality of the training noise pictures. The feature statistics are assumed to follow a Gaussian distribution across batches, so the new regularization term can be defined through the mean $\mu$ and variance $\sigma^2$:
$$\mathcal{R}_{\mathrm{feature}}(\hat{x}) = \sum_{l} \big\| \mu_l(\hat{x}) - \mathbb{E}[\mu_l(x) \mid \chi] \big\|_2 + \sum_{l} \big\| \sigma_l^2(\hat{x}) - \mathbb{E}[\sigma_l^2(x) \mid \chi] \big\|_2$$
where $\mu_l(\hat{x})$ and $\sigma_l^2(\hat{x})$ are the batch-wise mean and variance estimates of the feature maps of the $l$-th convolutional layer, $\mathbb{E}[\cdot]$ and $\|\cdot\|_2$ denote the expected value and the $\ell_2$ norm respectively, and $\chi$ is the training set. Instead of the complete training set $\chi$, the running_mean and running_variance stored in the BN layers of the teacher model are used to estimate the expectations above:
$$\mathbb{E}[\mu_l(x) \mid \chi] \approx \mathrm{BN}_l(\mathrm{running\_mean}), \qquad \mathbb{E}[\sigma_l^2(x) \mid \chi] \approx \mathrm{BN}_l(\mathrm{running\_variance})$$
final regularized depth inversion pair regularization termAfter improvement, it becomes +.>
Wherein alpha is tvFor the corresponding scale factor, +.>Is punishment total variance, < >>Is->Is L2 canonical; alpha f Is a scale factor.
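The feature regularizer can be sketched in NumPy as follows; the feature maps and running statistics are random stand-ins for a real teacher's BN buffers (illustrative assumptions, not the patent's implementation):

```python
import numpy as np

def r_feature(feat_maps, running_means, running_vars):
    """R_feature: sum over layers l of ||mu_l - running_mean_l||_2 +
    ||var_l - running_var_l||_2, with batch statistics taken over the
    (batch, H, W) axes of each NCHW feature map."""
    total = 0.0
    for f, rm, rv in zip(feat_maps, running_means, running_vars):
        mu = f.mean(axis=(0, 2, 3))   # per-channel batch mean
        var = f.var(axis=(0, 2, 3))   # per-channel batch variance
        total += np.linalg.norm(mu - rm) + np.linalg.norm(var - rv)
    return total

rng = np.random.default_rng(1)
# Hypothetical feature maps for two conv layers, shape (N, C, H, W).
feats = [rng.normal(size=(4, 8, 6, 6)), rng.normal(size=(4, 16, 3, 3))]
# The teacher's BN running statistics stand in for E[mu|chi], E[var|chi].
rms = [np.zeros(8), np.zeros(16)]
rvs = [np.ones(8), np.ones(16)]

print(r_feature(feats, rms, rvs) > 0.0)
```

Minimizing this value over the input pushes the synthesized pictures' feature statistics toward those of the (absent) training set.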
The invention introduces adaptive depth inversion: an additional penalty term $\mathcal{R}_{\mathrm{compete}}(\hat{x})$ maximizes the JS divergence between the teacher model and the student model, so that the student output disagrees with the teacher output as much as possible, further optimizing the image:
$$\mathcal{R}_{\mathrm{compete}}(\hat{x}) = 1 - \mathrm{JS}\big(p_T(\hat{x}),\, p_S(\hat{x})\big), \qquad \mathrm{JS}\big(p_T(\hat{x}),\, p_S(\hat{x})\big) = \frac{1}{2}\Big(\mathrm{KL}\big(p_T(\hat{x}),\, M\big) + \mathrm{KL}\big(p_S(\hat{x}),\, M\big)\Big)$$
where $M = \frac{1}{2}\big(p_T(\hat{x}) + p_S(\hat{x})\big)$ is the average of the teacher model and student model outputs, $p_T(\hat{x})$ is the output of the teacher model, and $p_S(\hat{x})$ is the output of the student model.
the picture teacher model is easy to classify, and the student network is difficult to classify accurately, so that the output of the student network is more diversified as much as possible.
After this improvement, the final adaptive depth inversion extends the regularization term to
$$\mathcal{R}_{\mathrm{ADI}}(\hat{x}) = \mathcal{R}_{\mathrm{DI}}(\hat{x}) + \alpha_c\,\mathcal{R}_{\mathrm{compete}}(\hat{x})$$
where $\alpha_c$ is a scale factor.
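The competition term can be sketched in NumPy as follows (the softmax outputs are illustrative values, not from the patent):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-12):
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def r_compete(p_T, p_S):
    """R_compete = 1 - JS(p_T, p_S): minimizing this penalty maximizes
    the Jensen-Shannon divergence between teacher and student outputs."""
    m = 0.5 * (p_T + p_S)
    js = 0.5 * (kl(p_T, m) + kl(p_S, m))
    return 1.0 - js

# Hypothetical outputs: one confident teacher, one near-uniform student.
p_T = softmax(np.array([[4.0, 0.0, 0.0]]))
p_S = softmax(np.array([[0.1, 0.0, 0.0]]))

agree = r_compete(p_T, p_T)    # identical outputs: JS = 0, penalty = 1
differ = r_compete(p_T, p_S)   # disagreement lowers the penalty
print(float(differ[0]) < float(agree[0]))
```

Because disagreement lowers the penalty, the optimizer is pushed toward images on which teacher and student outputs differ, which is the diversity mechanism described above.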
The competitive and interactive nature of adaptive depth inversion helps the student model evolve continuously, gradually forcing new image features to appear and thereby enhancing depth inversion.
the depth inversion and the adaptive depth inversion are compared as follows: depth inversion is a general method that can be applied to any trained CNN classifier. For knowledge distillation, it can synthesize a large number of images at once to initiate knowledge transfer given a teacher's network. On the other hand, adaptive depth inversion requires the use of a student network in the loop to enhance the diversity of the image. Its competitive and interactive features facilitate a growing student network, which gradually motivates new image features to appear, and experiments have shown that such enhancements allow the effect of depth inversion to be improved.
The beneficial effects of the invention are as follows:
according to the method, a trained network is inverted, a teacher model is kept fixed under the condition that additional information about a training data set is not needed, a student model is trained through knowledge distillation, input is optimized, and regularization depth inversion is improved by using information stored in a batch normalization layer of the teacher model on the basis of a deep stream method. In addition, the self-adaptive depth inversion improves the diversity of images by maximizing JS divergence between logic outputs of a teacher model and a student model, enhances the effect of regularized depth inversion, and further optimizes the images, thereby realizing better effect compared with deep stream.
Drawings
Fig. 1 is a schematic flow chart of a depth inversion of an intelligent model reverse engineering method based on data feature statistical distribution according to an embodiment of the invention.
Detailed Description
The invention will be further described with reference to the drawings.
The flow of the method in the embodiment of the invention is as shown in fig. 1:
1) Give a noise picture to be optimized as the input, together with a target label;
2) Input the picture from step 1) into a pre-trained teacher model and an untrained student model respectively to obtain the teacher model output and the student model output, and calculate an image regularization term based on these two outputs;
3) Obtain a classification loss term from the teacher model output and the loss function with respect to the target label, and update the noise picture to be optimized using the image regularization term and the classification loss term calculated in step 2);
4) Take the updated noise picture from step 3) as the input of both the teacher model and the student model, and reduce the distance between the student model output and the teacher model output through knowledge distillation;
5) Repeat steps 2)-4) to iteratively train the student model, and judge whether the optimized noise picture matches the target label; if not, return to step 2) and continue iterating; if so, end the iteration to obtain the trained student model and the optimized pictures.
In step 3), the noise picture to be optimized is updated using the image regularization term and the classification loss term calculated in step 2); specifically, the noise picture is optimized in the manner of DeepDream:
DeepDream optimizes the noise picture by
$$\min_{\hat{x}} \; \mathcal{L}(\hat{x}, y) + \mathcal{R}(\hat{x})$$
where $\mathcal{L}(\hat{x}, y)$ is the classification loss, $\mathcal{R}(\hat{x})$ is the image regularization term, $\hat{x}$ is the given noise picture, and $y$ is the target label.
In step 4), the knowledge distillation process is expressed as
$$\min_{W_S} \sum_{x \in \chi} \mathrm{KL}\big(p_T(x),\, p_S(x)\big)$$
where $\chi$ is the training set, $p_T(x)$ is the teacher model output, $p_S(x)$ is the student model output, $W_S$ are the parameters of the student model, and $\mathrm{KL}(\cdot)$ denotes the relative entropy.
The invention introduces regularized depth inversion: the image regularization term $\mathcal{R}(\hat{x})$ is expanded with a new regularization term $\mathcal{R}_{\mathrm{feature}}(\hat{x})$, improving DeepDream and thus the quality of the training noise pictures. The feature statistics are assumed to follow a Gaussian distribution across batches, so the new regularization term can be defined through the mean $\mu$ and variance $\sigma^2$:
$$\mathcal{R}_{\mathrm{feature}}(\hat{x}) = \sum_{l} \big\| \mu_l(\hat{x}) - \mathbb{E}[\mu_l(x) \mid \chi] \big\|_2 + \sum_{l} \big\| \sigma_l^2(\hat{x}) - \mathbb{E}[\sigma_l^2(x) \mid \chi] \big\|_2$$
where $\mu_l(\hat{x})$ and $\sigma_l^2(\hat{x})$ are the batch-wise mean and variance estimates of the feature maps of the $l$-th convolutional layer, $\mathbb{E}[\cdot]$ and $\|\cdot\|_2$ denote the expected value and the $\ell_2$ norm respectively, and $\chi$ is the training set. Instead of the complete training set $\chi$, the running_mean and running_variance stored in the BN layers of the teacher model are used to estimate the expectations above:
$$\mathbb{E}[\mu_l(x) \mid \chi] \approx \mathrm{BN}_l(\mathrm{running\_mean}), \qquad \mathbb{E}[\sigma_l^2(x) \mid \chi] \approx \mathrm{BN}_l(\mathrm{running\_variance})$$
final regularized depth inversion pair regularization termAfter improvement, it becomes +.>
Wherein alpha is tvFor the corresponding scale factor, +.>Is punishment total variance, < >>Is->Is L2 canonical; alpha f Is a scale factor.
The invention introduces adaptive depth inversion: an additional penalty term $\mathcal{R}_{\mathrm{compete}}(\hat{x})$ maximizes the JS divergence between the teacher model and the student model, so that the student output disagrees with the teacher output as much as possible, further optimizing the image:
$$\mathcal{R}_{\mathrm{compete}}(\hat{x}) = 1 - \mathrm{JS}\big(p_T(\hat{x}),\, p_S(\hat{x})\big), \qquad \mathrm{JS}\big(p_T(\hat{x}),\, p_S(\hat{x})\big) = \frac{1}{2}\Big(\mathrm{KL}\big(p_T(\hat{x}),\, M\big) + \mathrm{KL}\big(p_S(\hat{x}),\, M\big)\Big)$$
where $M = \frac{1}{2}\big(p_T(\hat{x}) + p_S(\hat{x})\big)$ is the average of the teacher model and student model outputs, $p_T(\hat{x})$ is the output of the teacher model, and $p_S(\hat{x})$ is the output of the student model.
the picture teacher model is easy to classify, and the student network is difficult to classify accurately, so that the output of the student network is more diversified as much as possible.
After this improvement, the final adaptive depth inversion extends the regularization term to
$$\mathcal{R}_{\mathrm{ADI}}(\hat{x}) = \mathcal{R}_{\mathrm{DI}}(\hat{x}) + \alpha_c\,\mathcal{R}_{\mathrm{compete}}(\hat{x})$$
where $\alpha_c$ is a scale factor.
The method inverts a trained network: without any additional information about the training data set, the teacher model is kept fixed, a student model is trained through knowledge distillation, and the input is optimized; on the basis of the DeepDream method, regularized depth inversion is improved by using the information stored in the batch normalization layers of the teacher model. In addition, adaptive depth inversion improves the diversity of the images by maximizing the Jensen-Shannon (JS) divergence between the logit outputs of the teacher model and the student model, which strengthens regularized depth inversion and further optimizes the images, thereby achieving better results than DeepDream. For example, the generated images support knowledge transfer between two networks in knowledge distillation even when the two networks have different architectures, with minimal accuracy loss on both the simple CIFAR-10 dataset and the large, complex ImageNet dataset.

Claims (5)

1. An intelligent model reverse engineering method based on the statistical distribution of data features, characterized by comprising the following steps:
1) Give a noise picture to be optimized as the input, together with a target label;
2) Input the picture from step 1) into a pre-trained teacher model and an untrained student model respectively to obtain the teacher model output and the student model output, and calculate an image regularization term based on these two outputs;
3) Obtain a classification loss term from the teacher model output and the loss function with respect to the target label, and update the noise picture to be optimized using the image regularization term and the classification loss term calculated in step 2);
4) Take the updated noise picture from step 3) as the input of both the teacher model and the student model, and reduce the distance between the student model output and the teacher model output through knowledge distillation;
5) Repeat steps 2)-4) to iteratively train the student model, and judge whether the optimized noise picture matches the target label; if not, return to step 2) and continue iterating; if so, end the iteration to obtain the trained student model and the optimized pictures.
2. The intelligent model reverse engineering method based on the statistical distribution of data features according to claim 1, characterized in that in step 3), the noise picture to be optimized is updated using the image regularization term and the classification loss term calculated in step 2); specifically, the noise picture is optimized in the manner of DeepDream:
DeepDream optimizes the noise picture by
$$\min_{\hat{x}} \; \mathcal{L}(\hat{x}, y) + \mathcal{R}(\hat{x})$$
where $\mathcal{L}(\hat{x}, y)$ is the classification loss, $\mathcal{R}(\hat{x})$ is the image regularization term, $\hat{x}$ is the given noise picture, and $y$ is the target label.
3. The intelligent model reverse engineering method based on the statistical distribution of data features according to claim 1, characterized in that in step 4), the knowledge distillation process is expressed as
$$\min_{W_S} \sum_{x \in \chi} \mathrm{KL}\big(p_T(x),\, p_S(x)\big)$$
where $\chi$ is the training set, $p_T(x)$ is the teacher model output, $p_S(x)$ is the student model output, $W_S$ are the parameters of the student model, and $\mathrm{KL}(\cdot)$ denotes the relative entropy.
4. The intelligent model reverse engineering method based on the statistical distribution of data features according to claim 2, characterized in that regularized depth inversion is introduced: the image regularization term $\mathcal{R}(\hat{x})$ is expanded with a new regularization term $\mathcal{R}_{\mathrm{feature}}(\hat{x})$, improving DeepDream and thus the quality of the training noise pictures; the feature statistics are assumed to follow a Gaussian distribution across batches, so the new regularization term can be defined through the mean $\mu$ and variance $\sigma^2$:
$$\mathcal{R}_{\mathrm{feature}}(\hat{x}) = \sum_{l} \big\| \mu_l(\hat{x}) - \mathbb{E}[\mu_l(x) \mid \chi] \big\|_2 + \sum_{l} \big\| \sigma_l^2(\hat{x}) - \mathbb{E}[\sigma_l^2(x) \mid \chi] \big\|_2$$
where $\mu_l(\hat{x})$ and $\sigma_l^2(\hat{x})$ are the batch-wise mean and variance estimates of the feature maps of the $l$-th convolutional layer, $\mathbb{E}[\cdot]$ and $\|\cdot\|_2$ denote the expected value and the $\ell_2$ norm respectively, and $\chi$ is the training set; instead of the complete training set $\chi$, the running_mean and running_variance stored in the BN layers of the teacher model are used to estimate the expectations above:
$$\mathbb{E}[\mu_l(x) \mid \chi] \approx \mathrm{BN}_l(\mathrm{running\_mean}), \qquad \mathbb{E}[\sigma_l^2(x) \mid \chi] \approx \mathrm{BN}_l(\mathrm{running\_variance})$$
After this improvement, the final regularized depth inversion extends the regularization term to
$$\mathcal{R}_{\mathrm{DI}}(\hat{x}) = \alpha_{\mathrm{tv}}\,\mathcal{R}_{\mathrm{TV}}(\hat{x}) + \alpha_{\ell_2}\,\mathcal{R}_{\ell_2}(\hat{x}) + \alpha_f\,\mathcal{R}_{\mathrm{feature}}(\hat{x})$$
where $\alpha_{\mathrm{tv}}$ and $\alpha_{\ell_2}$ are the corresponding scale factors, $\mathcal{R}_{\mathrm{TV}}(\hat{x})$ penalizes the total variation of $\hat{x}$, $\mathcal{R}_{\ell_2}(\hat{x})$ is the $\ell_2$ norm of $\hat{x}$, and $\alpha_f$ is a scale factor.
5. The intelligent model reverse engineering method based on the statistical distribution of data features according to claim 4, characterized in that adaptive depth inversion is introduced: an additional penalty term $\mathcal{R}_{\mathrm{compete}}(\hat{x})$ maximizes the JS divergence between the teacher model and the student model, so that the student output disagrees with the teacher output as much as possible, further optimizing the image:
$$\mathcal{R}_{\mathrm{compete}}(\hat{x}) = 1 - \mathrm{JS}\big(p_T(\hat{x}),\, p_S(\hat{x})\big), \qquad \mathrm{JS}\big(p_T(\hat{x}),\, p_S(\hat{x})\big) = \frac{1}{2}\Big(\mathrm{KL}\big(p_T(\hat{x}),\, M\big) + \mathrm{KL}\big(p_S(\hat{x}),\, M\big)\Big)$$
where $M = \frac{1}{2}\big(p_T(\hat{x}) + p_S(\hat{x})\big)$ is the average of the teacher model and student model outputs, $p_T(\hat{x})$ is the output of the teacher model, and $p_S(\hat{x})$ is the output of the student model;
after this improvement, the final adaptive depth inversion extends the regularization term to
$$\mathcal{R}_{\mathrm{ADI}}(\hat{x}) = \mathcal{R}_{\mathrm{DI}}(\hat{x}) + \alpha_c\,\mathcal{R}_{\mathrm{compete}}(\hat{x})$$
where $\alpha_c$ is a scale factor.
CN202311781128.1A 2023-12-22 2023-12-22 Intelligent model reverse engineering method based on data feature statistical distribution Pending CN117744757A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311781128.1A CN117744757A (en) 2023-12-22 2023-12-22 Intelligent model reverse engineering method based on data feature statistical distribution


Publications (1)

Publication Number Publication Date
CN117744757A true CN117744757A (en) 2024-03-22

Family

ID=90281057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311781128.1A Pending CN117744757A (en) 2023-12-22 2023-12-22 Intelligent model reverse engineering method based on data feature statistical distribution

Country Status (1)

Country Link
CN (1) CN117744757A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination