CN117993480A

CN117993480A - AIGC federal learning method for designer style fusion and privacy protection

Info

Publication number: CN117993480A
Application number: CN202410389515.9A
Authority: CN
Inventors: 吴迪; 武名柱; 蒋佳楠; 李星霖; 邓晗晖
Original assignee: Hunan University
Current assignee: Hunan University
Priority date: 2024-04-02
Filing date: 2024-04-02
Publication date: 2024-05-07

Abstract

The invention discloses a AIGC federal learning method for designer style fusion and privacy protection, which comprises the following steps: step S1, drawing sketches belonging to own styles by different designers according to a given color drawing, and extracting sketch style characteristics by locally utilizing the same generated countermeasure network; training the generated countermeasure network through the sketch style characteristics to obtain model parameters of the generated countermeasure network with the sketch style characteristics; step S2, uploading locally generated countermeasure network model parameters to a server by all designer terminals, aggregating a plurality of groups of model parameters through a learning algorithm by the server, distributing the aggregated model parameters to each designer terminal, and updating parameters of the generated countermeasure network; and S3, each designer generates a sketch with different designer styles by using the updated generation countermeasure network. The invention solves the problems of data privacy protection, limited equipment computing resources and the like existing in the current design field.

Description

AIGC federal learning method for designer style fusion and privacy protection

Technical Field

The invention belongs to the technical field of image generation, and particularly relates to a AIGC federal learning method for designer style fusion and privacy protection.

Background

In recent years, with the development of artificial intelligence technology, artificial intelligence generation content (ARTIFICIAL INTELLIGENCE-GENERATED CONTENT, AIGC) is more emerging in the creative work field. However, in the AIGC co-creation environment in the design area, the greatest challenge is how to protect personal data privacy of the designer, outside the circulation and application of data, thereby protecting the intellectual property rights of the designer's work to the greatest extent. Federal learning is a framework method for realizing common modeling by using data mastered by each node on the basis of ensuring the privacy safety of the data and improving the effect of an AI model. In particular, a federal learning system can train machine learning algorithms, such as deep neural networks, on multiple local data sets contained in local nodes, but without explicitly exchanging data samples, but rather exchange parameters, such as network weights, between these local nodes at some frequency to generate a global model that is shared by all nodes. In addition, in the actual design work application scenario, the usability of the system is affected by other factors such as the response speed and the work efficiency of the system. Therefore, how to construct a federal learning system suitable for the design field needs to consider various problems such as efficiency, privacy and distributability.

Disclosure of Invention

The embodiment of the invention aims to provide a AIGC federal learning method for designer style fusion and privacy protection, which aims to solve the problems of data privacy protection, limited equipment computing resources and the like in artificial intelligence generation in the current design field.

In order to solve the technical problems, the technical scheme adopted by the invention is that the AIGC federal learning method for designer style fusion and privacy protection comprises the following steps:

Step S1, drawing sketches belonging to own styles by different designers according to a given color drawing, and extracting sketch style characteristics f _Gi by using the same generated countermeasure network locally; training the generated countermeasure network through the sketch style characteristics f _Gi to obtain model parameters of the generated countermeasure network with the sketch style characteristics;

The generating an countermeasure network comprises a VGG feature extractor for extracting color drawings and sketch features, an adaptive instance normalization module for dynamically normalizing the features, a generator G _S for generating a sketch I _i2s, and a loss function for optimizing the generating of the countermeasure network;

The generator G _S comprises convolution blocks positioned at the head end and the tail end and three layers of multi-resolution fusion modules MRFM positioned in the middle, and the convolution blocks and the multi-resolution fusion modules MRFM are sequentially connected in series; the multi-resolution fusion module MRFM is used for fusing the characteristics normalized by the adaptive instance normalization module, and generating a sketch I _i2s through a convolution block at the tail end after fusion;

Step S2, uploading locally generated countermeasure network model parameters to a server by all designer terminals, aggregating a plurality of groups of model parameters through a learning algorithm by the server, distributing the aggregated model parameters to each designer terminal, and updating parameters of the generated countermeasure network;

And S3, each designer generates a sketch with different designer styles by using the updated generation countermeasure network.

Further, the step of extracting the sketch style features by using the generated countermeasure network in the step S1 is:

s11, extracting the color drawings provided by a designer and the characteristics of the corresponding sketch by using a VGG network;

step S12, carrying out normalization adjustment on the features extracted in the step S11;

And S13, inputting the normalized characteristic f _Mj into a generator to obtain a sketch I _i2s generated by the countermeasure network, and extracting the characteristic of I _i2s by using a VGG network to obtain a sketch style characteristic f _Gi.

Further, in the step S12, the normalization process is expressed as:

；

wherein mu (·) and sigma (·) respectively represent the mean value taking and variance taking operations, f _Rj、f_Sj is respectively the first j features of the color map and sketch extracted by VGG, ; The f _Rj、f_Sj comprises four dimensions, namely a batch size batchsize, a channel number channel, a height and a width, wherein the mean value and variance operation are performed in the height dimension and the width dimension, and the f _Rj、f_Sj still keeps the four dimensions after the mean value and the variance are obtained.

Further, the generating process of the sketch I _i2s in the step S13 specifically includes:

f _M4 is passed to a generator Outputs f _G1 and f _M3 of the first layer convolution block are passed as inputs to the generator/>The outputs f _G2 and f _M2 of the second layer MRFM of the second layer are passed as inputs to the generator/>The outputs f _G3 and f _M1 of the third layer MRFM are passed to the generator/>The output f _G4 of the fourth layer MRFM is passed as input to the generator/>And finally outputting a sketch I _i2s by the convolution block of the last layer.

Further, the first layer convolution block is expressed as:

；

Where BN represents batch normalization, SN represents spectral normalization, leakyReLU represents activation function, cnov3 ×3 represents the convolutional layer of the 3×3 convolutional kernel;

The last layer of convolution blocks are expressed as:

；

wherein Cnov D represents a 2D convolution.

Further, the output process of the multi-resolution fusion module MRFM of the second layer is as follows:

First, f _M3 is downsampled by using the max pooling operation to obtain a feature map to conveniently capture more global semantic information of the input features, and then input to a convolution layer Cnov (), where the output is:

；

Then, with this output splice f _G1, the output operation of the splice with up-sampling US is then input to a convolution layer Cnov (·) to generate a feature containing both encoded information from the image layer and decoded information from the upper layer, where the output is:

；

wherein Cat represents stitching;

Finally, the characteristics are combined Added to the input feature f _M3:

；

wherein, the parameter epsilon is the weight of the feature f _M3;

the MRFM of the fourth layer and the third layer are identical to the MRFM of the second layer.

Further, the generating the loss function of the countermeasure network in the step S1 is:

；

In the method, in the process of the invention, For Gram matrix loss,/>Generating loss,/>For cosine similarity loss,/>Loss for clip; /(I)Respectively/>Is a weight of (2).

Further, the Gram matrix loss is expressed as:

；

wherein Gram is a Gram matrix, MSE is a mean square error loss, and f _Rj、f_Sj is the first j color map features and sketch features extracted by the VGG network respectively;

The generation loss is expressed as:

；

In the method, in the process of the invention, Representing the loss function of the arbiter D _S -Representing a loss function of the generator G _S, wherein f _s is a sketch feature extracted by the VGG network, and f _G、f_M is an output feature and an input feature of the generator respectively;

The cosine similarity penalty is expressed as:

；

f _S5、f_G5 is the sketch provided by the designer of the VGG extraction, the 5 th feature of the sketch generated by the generator, respectively;

Said clipping loss Expressed as loss of CLIP model:

；

In the method, in the process of the invention, Generating sketch characteristics and referencing sketch characteristics for the last layer of the CLIP model; the characteristics of the generated sketch and the characteristics of the reference color map of the 4 th layer of the CLIP model are respectively.

Further, the step S2 specifically includes:

generating model parameters for an countermeasure network The representation, T rounds, for each round t=1, 2., for each designer k and each layer of neural network/>The following operations are performed: /(I); Wherein SGD represents a random gradient descent method;

If t is 0 as a result of performing the modulo operation on the local update step E, then the following operation is performed: for each designer k and each layer of neural network If/>The layer is not BatchNorm layers, then perform: /(I) Then the other layers all carry out an average local model parameter step; if/>The layer is BatchNorm layers, then no execution, direct skip, review/>Whether the layer is BatchNorm layers;

And obtaining model parameters with k designer styles until the parameters of the T round server and the designer edge are updated, and then giving the model parameters to a target designer to generate an countermeasure network model.

Further, the polymerization process in the step S2 is optimized by using the following formula:

；

Wherein, Is a loss function that generates an countermeasure network, β is a coefficient of a regularization term L _FedDecorr (w, X), w is a model parameter that generates the countermeasure network, and X is a sketch style feature.

The beneficial effects of the invention are as follows:

(1) The invention uses the federal learning technology to protect designer data privacy from being revealed, and carries out multi-designer style parameter fusion on a cloud server, and the multi-designer style fused model parameters can be obtained after the multi-designer style fusion is distributed back to each designer end; the cloud executes FedSM (Style Mixing) method provided by the invention, so that the characteristic dimension collapse caused by data heterogeneity is prevented, and the calculation time of a remote server is saved; (2) Considering that a designer, namely a client has limited memory and computational power, the invention provides a lightweight generation countermeasure network model, thereby reducing the calculation time of the client and the communication time of the server; (3) The invention reduces the network structure in the generated countermeasure network by self-distillation, thereby reducing the number of generated countermeasure network model parameters which need to be calculated and transmitted.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow diagram of a AIGC federal learning system for designer style fusion and privacy protection in accordance with an embodiment of the present invention.

FIG. 2 is a schematic diagram of a generating countermeasure network model deployed at each client according to an embodiment of the present invention.

FIG. 3 is a schematic diagram of a Multi-resolution fusion module (Multi-Resolution Fusion module, MRFM) architecture according to an embodiment of the invention.

FIG. 4 is a sketch-generating result comparison diagram of the present method and its variants and other methods according to an embodiment of the present invention.

FIG. 5 is a diagram of the tool statistics of the auxiliary sketch AIGC preferred for use in the design process according to an embodiment of the present invention.

FIG. 6 is a diagram of statistics of a manner in which protection of creative intellectual property is facilitated according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The invention provides a AIGC federal learning method for designer style fusion and privacy protection, which is characterized in that different designers draw sketches belonging to own styles according to given color charts and extract sketch style characteristics locally by using the same generated countermeasure network; and then all designer terminals upload the local generation countermeasure network model parameters to a server, the server aggregates a plurality of groups of model parameters by adopting a preconfigured federal learning algorithm, and distributes the aggregated model parameters to each designer terminal, and finally the designer terminal locally generates the countermeasure network model, namely, the designer terminal can have the generation styles of a plurality of designers.

Referring to fig. 1, the method of the present invention can be divided into three stages: the method comprises a designer-end style extraction stage, a cloud server model parameter aggregation stage and a model parameter feedback distribution stage. The specific description of each stage is as follows:

And in the designer-side style extraction stage, the same generated countermeasure network model is deployed at the local sides of a plurality of designers, style characteristics of sketches drawn by different designers according to the color drawings are extracted, and then model parameters containing the sketch drawing styles of the designers are used as input of a cloud server. It should be noted that, in order to reduce the transmission time and the network traffic and the calculation amount of cloud federal learning, the embodiment only uploads the model parameters of the identifier network. As shown in fig. 1, in the system Data set representing color and sketch of the Kth designer,/>Representing a generated challenge network model obtained by training with the kth designer data, of particular note is the model/>, trained for each designerThe structure of the system is the same, and the model parameters of the generated countermeasure network of different sketch drawing styles are obtained by inputting sketches of different style designers for training.

In the cloud server model parameter aggregation stage, the cloud server does not have dataAnd modelBecause the decentralization algorithm can learn federally to model parameters/>, of the discriminant models of K designers without uploading the designer data to a serverModel parameters of different designer sketch styles are aggregated at a server side through FedSM federal learning algorithm.

And in the model parameter feedback distribution stage, model parameters after the cloud server is aggregated through federal learning are simultaneously feedback distributed to each designer. By continuously executing the three stages, a group of model parameters with different designer styles are finally obtained and updated to each designer to generate an countermeasure network model for the data set of each designerThe service generates a sketch that merges with other designer styles.

The following describes the core module in the sketch generation process in detail. In order to help the designer to quickly complete the sketch of the hand drawing or provide the designer with the inspiration, a model of generating the countermeasure network is provided as shown in fig. 2, and the main structure of the model comprises a standard pretrained VGG-16 model, an adaptive instance normalization AdaIN _sm, a multi-resolution fusion (MRFM) module and a loss function for optimizing the generation of the countermeasure network, which are described one by one.

1. VGG feature extractor

The advantage of VGG networks is to provide rich multi-level feature representation and enhanced feature expression capabilities. Each level is capable of extracting a different level of feature representation. In the image task, the first to generate a sketch is to extract the contour features of the input image. CNNs exhibit the ability to capture low-level image features, extracting edges and textures in shallower layers, while deeper layers can capture higher-level semantic features, such as overall shape and structure. By extracting the multi-level features, rich hierarchical structure image information can be obtained.

The sketch pairs in the data set of each designer and the sketch generated by the generator are taken as inputs, and after passing through the VGG feature extractor, each sketch has i feature outputs, namely:

；

Wherein T (·) represents feature extraction, I _R is a color drawing provided by the designer, I _S is a corresponding sketch of the color drawing provided by the designer, and I _i2s is a corresponding sketch generated against the network generation,/>Respectively represent slave/> The ith feature is extracted. Notably, the output characteristics/>, of the VGG network of FIG. 2And/>In which the order of quadrilaterals from large to small is/>; The dashed arrows in fig. 2 represent the gradient flow of the training sketch generator.

Considering that only semantic information of the sketch needs to be extracted, a classification layer in the VGG-16 model is not needed, and considering that the largest pooling layer in the VGG-16 model can cause loss of local features of the extracted sketch, the embodiment only keeps the first four layers in the VGG-16.

2. Adaptive instance normalization AdaIN _sm

The conventional adaptive instance normalization module AdaIN dynamically adjusts the normalization parameters to enable the model to adapt to the distribution differences between different samples. The sketches drawn by the various designers should exhibit similar feature means and feature variances, given that the designers all have their own unique design styles. This is due to the fact that the image can effectively represent its basic features such as brightness and contrast. Similarly, the variance and mean of sketch features extracted by the VGG network can also adequately describe the similarity of sketch features, particularly features related to style.

To enhance the ability of the model to capture the designer's unique design style, the present embodiment is implemented by aligning sketch features and color map features. Before training the model, the average and variance of the first four features of the VGG extraction of all sketch samples are calculated first,Four dimensions are included, batchsize (batch size), channel number, height and width, respectively.

；

Mu (·) and sigma (·) represent the mean and variance respectively,Representing the mean and variance std operation of only the first four sketch features extracted by VGG,/>Representing the dimension of the mean and variance, i.e. mean and variance only in 3, 4 dimensions, notably/>, after mean and varianceThe 4 dimensions remain.

Referring to fig. 2, the average and variance of the VGG extracted color map features and the processed sketch features are transferred to a AdaIN _sm module to normalize the model to adapt to the distribution differences between different samples, and then transferred to a generator G _S:

；

Wherein the method comprises the steps of Is to prevent denominator from being 0,/>The first j features of the color map and the sketch extracted by VGG are respectively. It is noted that the mean and variance operations here, i.e., μ (-), σ (-), are performed only in the 3 rd and 4 th dimensions of the input feature, i.e., height and width.

3. Multi-resolution fusion module MRFM

Based on the first two modules, a multi-level image feature is obtained that is normalized based on the designer's hand-drawn sketch sample. These features do not undergo a max-pooling operation and contain more fine-grained information such as contours and textures. The object of the decoder module of the present invention is to achieve improved feature decoupling with existing features to generate a designer-style high quality sketch. To achieve this goal, the present embodiment introduces a multi-resolution fusion (MRFM) module that focuses on the first three layers of features extracted by the VGG encoder. The MRFM module may refer to generator G _S of FIG. 2 and FIG. 3.

F _Mj as input to generator G _S, f _M4 is passed to first layer ConvBlock (convolution block) of generator G _S, outputs f _G1 and f _M3 of first layer ConvBlock as inputs to MRFM of the second layer of generator G _S, outputs f _G2 and f _M2 of the MRFM of the second layer as inputs to MRFM of the third layer of generator G _S, outputs f _G3 and f _M1 of the MRFM of the third layer to MRFM of the fourth layer of generator G _S, and output of the MRFM of the fourth layer as inputs to last layer ConvBlock of generator G _S, and finally output sketch I _i2s.

It should be noted that the number of MRFM layers of each layer refers to the position of the MRFM layer in the generator, that is, the generator G _S layers, the convolution block is the first layer, the MRFM layers are located in the second to fourth layers, and the fifth layer is the convolution block.

The structure of the first layer ConvBlock is as follows: first a convolution layer of a 3 x3 convolution kernel, in addition a 1-pixel fill convolution is applied to preserve the spatial dimension of the feature map. To enhance the stability and performance of the network, this embodiment introduces spectral normalization (spectral normalization, SN) and batch normalization (batch normalization, BN). Finally, using LeakyReLU activation functions introduces nonlinearities and avoids the gradient vanishing problem. Thus, the first layer ConvBlock can be expressed as:

；

The MRFM structure is as follows: f _M3 is first downsampled (downsampling, DS) using a max pooling operation to obtain a feature map that facilitates capturing more global semantic information of the input features, and then fed into a computer system The output at this time is:

；

Immediately after the output operation of the splice with this output splice (cat) f _G1, then with Upsampling (US), feeding Generating a feature containing both encoded information from the image layer and decoded information from the upper layer, the output being:

；

Finally, the characteristics are combined Added to the input feature f _M3. To ensure that the generated features contain global information from the combined features and fine-grained information from f _M3, the present embodiment introduces a differentiable parameter epsilon as the weight of the f _M3 features to minimize the impact on the final decoded features. Then the MRFM can be expressed as follows:

；

then when the layer number Time,/>This can be expressed as:

；

Then when k=4, f _G4 will be fed to generator G _S as the output of the last layer MRFM, the last layer convolutions ConvBlock, convBlock2 consist of a layer 2D convolution sum spectrum normalization (spectral normalization, SN), then the output of G _S is as follows:

；

4. Loss function

The loss function to generate the antagonism network is defined as follows:

；

This includes four loss functions: gram matrix loss, generation loss, cosine similarity loss, and Clip (Clip) loss. The weight values assigned to these loss functions are as follows: ，/> . These weight values are set to achieve a balanced and stable training process, ensuring that each loss function contributes equally to the optimization of the overall model.

(1) Gram matrix loss (Gram matrix loss)

To ensure detailed features of texture, color, and spatial information of the model captured image, a Gram matrix is utilized at each level of the VGG encoder to enhance similarity between the reference hand sketch and the generated sketch.

；

Where Gram is the Gram matrix, MSE is the mean square error loss, and f _Rj、f_Sj is the VGG extracted color map feature, sketch feature, respectively, note that only the first four features are extracted.

(2) Generating losses

The generation of the antagonism network (GAN) includes two basic components, a generator G _S and a discriminator D _S. Wherein G _S maps the input f _Gj into a false image to fool the arbiter, and the arbiter distinguishes the generator output f _Mj from the true image f _Sj. The generator and discriminator are alternately optimized by the countermeasure to achieve respective optimization objectives. The optimization objective is defined as follows:

；

(3) Cosine similarity (Cosine Similarity Loss)

In order to ensure the direct semantic similarity of the reference sketch and the generated sketch, cosine similarity is used as an index for evaluating the similarity of the reference sketch and the generated sketch:

；

Referring to fig. 2, f _S5、f_G5 is the fifth feature of each of the VGG extracted reference sketch and the generated sketch, respectively.

(4) Clip loss

And performing image-text matching by adopting a CLIP model with strong cross-modal expression capability and a ViT model capable of capturing global characteristic capability, wherein the specific model is a pre-trained CLIP model based on ViT-16/B.

；

In order to ensure semantic feature consistency between the generated sketch and the reference hand-drawn sketch (i.e. the sketch provided by the designer), this example makes use of the last layer of features of the CLIP model in particular, the generated sketch features areReference to hand drawn sketches is characterized by/>. In addition, to maintain consistency of global features, such as the outline of the sketch generated and the reference color, the present embodiment uses layer 4 features of the CLIP model to generate the sketch as/>Reference color is/>。

The cloud server model parameter aggregation stage of the invention utilizes an efficient federal learning strategy, fedSM technology, which is similar to FedAvg, fedSM performs local update and averages local model parameters. However FedSM assumes that the local model has BatchNorm layers and excludes their parameters from the average local model parameters step. The operation is mainly to save the calculation time of the server side and improve the use experience of a designer, namely the client side.

Referring to fig. 1, the specific flow is as follows: the designer is denoted by k and generates a layer index for each layer of neural network in the antagonism networkRepresentation, model parameters for generating discriminators in an antagonism network/>The local updating step size is represented by E, and the total server and edge parameter updating optimization round number is T. T-rounds for each round t=1, 2., for each designer k and each layer of neural network/>The operation is performed: /(I)That is, performing parameter updates for each layer of neural network that each designer generates an antagonistic network model using a random gradient descent SGD; considering a problem in federal learning, namely the heterogeneity of data (data heterogeneity), i.e. the local data of different participants (designers) comes from different distributions, this will seriously affect the final performance of the global model, the underlying reasons are also very complex. Namely: data heterogeneity results in a collapse of the characterized dimensions (dimensional collapse), thereby greatly limiting the expressive power of the model, affecting the performance of the final global model. For this purpose a federally learned regularization term FedDecorr is introduced,

；

Wherein the method comprises the steps ofIs a loss function that generates the countermeasure network, β is a coefficient of the regularization term FedDecorr, w is a model parameter that generates the countermeasure network, and X is a feature f _Gi of the VGG extracted generator generation sketch I _i2s.

Immediately after that, if the result of the modulo operation performed by t on E is 0, the following operation is performed (if not 0 is not performed): for each designer k and each layer of neural networkIf/>If the layer is not BatchNorm layers, then perform: i.e. parameters of BatchNorm layers are not updated, and then the other layers are all subjected to an average local model parameter step; if/> The layer is BatchNorm layers, then no execution, direct skip, review/>The layer is not BatchNorm layers. And (3) until the parameters of the T round server and the edge end of the designer are updated, obtaining model parameters with K designer styles, and then giving the parameters to a target designer to generate an countermeasure network model, so that a sketch of a target designer data set with different designer styles can be generated.

In view of the limited computational resources and memory of the client and the excessive redundancy of model parameters transmitted to the cloud server by the client, a model compression algorithm can be deployed to generate an countermeasure network model to reduce the required operational computational effort and communication cost required for transmission to the cloud server while guaranteeing model performance.

Whereas the generator and the arbiter must follow a Nash equilibrium (Nash equivalent) state to avoid pattern collapse in countermeasure training, conventional GAN compression methods are both compression generators while maintaining the original capabilities of the arbiter, which results in the balance of countermeasure learning being broken, thereby more easily resulting in pattern collapse. The present embodiment uses a new training protocol specifically tailored for efficient generation of models, which reduces the computational cost and model size of the generator in conditional GAN, and specifically uses knowledge distillation and neural architecture search to reduce training instability and improve model efficiency. The method is applied to the GAN (generation countermeasure network) model provided by the invention, the designer terminal occupies memory to be reduced, the running speed is improved, and the precision is kept unchanged.

Examples

This section demonstrates four forms of research results, including systematic and user studies: (1) comparing system research of different sketch generation models and verifying user research of effectiveness of different components in the models, (2) comparing system research of different federal learning methods and user research on intellectual property protection characteristics based on style fusion sketch generation, (3) style fusion sketch efficiency evaluation. The result shows that the AIGC federal learning method for designer style fusion and privacy protection can improve the work efficiency of the designer. The method performs comparable to a human designer in sketch generation and sketch style fusion. The present embodiment also reports feedback from professional designers in expert interviews.

This example compares the method of the present invention with a partially advanced method. Namely the TOM method （Bingchen Liu, Yizhe Zhu, Kunpeng Song, and Ahmed Elgammal. 2021. Self-supervised sketch-to-image synthesis. In Proceedings of the AAAI conference on artificial intelligence, Vol. 35. 2073–2081） and the StyleMe method （Di Wu, Zhiwang Yu, Nan Ma, Jianan Jiang, Yuetian Wang, Guixiang Zhou, Hanhui Deng, and Yi Li. 2023. StyleMe: Towards Intelligent Fashion Generation with Designer Style. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 1–16.）., in addition, the performance of the three model components in the system, namely the adaptive instance normalization module AdaIN _sm, the multi-resolution fusion module MRFM, and the CLIP loss in the loss function, is verified by three types of comparison tasks, including sketches. Figure 4 shows the effect of the present method and its variants on generating a sketch. "without model compression" represents the method of the present invention without GAN compression, "without MRFM" represents the method of the present invention but without MRFM module control group, "without AdaIN _sm" represents the method of the present invention but without AdaIN _sm module control group, without CLIP represents the method of the present invention but without CLIP loss module control group. In fig. 4, the first to seventh rows represent first to seventh designers, and the first and second columns depict color images authored by each designer and corresponding sketches of hand-drawing. The third column to seventh column represent sketches generated by the system and its ablation experimental control group, while the last two columns represent the generation results of StyleMe and Tom, respectively. From fig. 4, it can be observed that the sketch produced by the method of the present invention presents a style very close to that of a hand-drawn sketch and possesses complex details. In contrast, when the sketch created for the fourth designer is not using MRFM and not using AdaIN _SM, a significant line loss appears under the garment; when examining StyleMe the sketch created for the fourth designer, it is apparent that extra lines are created around the collar area. As for the TOM method, it can be noted that there are obvious detail defects and missing lines when generating sketches for the third, fourth and sixth designers.

The embodiment also evaluates the copyright protection function of the method from a subjective aspect through user research. This example summarizes three common data driven AIGC tool usage methods: A. generating a sketch of the own style by uploading the reference image and the sketch of the own; B. uploading complete AI model parameters to generate a sketch with own style; C. by uploading part of the AI model parameters (only the discriminant, not including designer style parameters), a sketch fusing different design styles is generated. 20 designers were recruited in the assessment process, were interpreted the meaning of each option, and were asked to answer two questions: (1) What AIGC tools do you prefer to use to assist in sketching during the design process? (2) What is you thought to be more beneficial to creative intellectual property protection? They are required to select one or two options from the three options. The statistical results are shown in fig. 5 and 6. The result shows that the function of only uploading part of model parameters in the method obtains active subjective feedback of a designer, and is considered to be more beneficial to protecting intellectual property.

TABLE 1 time consuming representation of the sketches of the mixed styles by different user groups with/without system support

To examine the efficiency improvement of the present method for sketching, in this experiment, the model was trained with seven designer styles, one image (1 reference image and 6 style images) was selected from each style. 30 designers were invited to participate in the experiment, 15 of which professional designers designed 15 of the professional students. 30 designers were instructed to create a sketch from the reference image while mixing it with the styles in the 6 style images, keeping the design elements unchanged, changing only the lines and positions. All participants were asked to complete the design task with or without the assistance of the present method and their completion times were recorded. To mitigate the impact of design complexity on time, this experiment ensures that the reference image and the pattern image have similar complexity. Table 1 shows experimental records and comparison between groups shows that the novice group requires more time than the expert group both with and without the present method. The comparison in the group shows that when the method is used, the sketch completion time of the novice group and the expert group is greatly shortened. For both groups, the first sketch takes the longest time, decreasing in time as the participants complete more tasks. The time reduction is more pronounced and stable when using the present method, whereas the smoothness of the reduction is worse without the present system.

In this specification, each embodiment is described in a related manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.

The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims

1. The AIGC federal learning method for designer style fusion and privacy protection is characterized by comprising the following steps of:

2. The method for AIGC federal learning for designer style fusion and privacy protection of claim 1, wherein the step of extracting sketch style features using the generated countermeasure network in step S1 comprises:

Step S13, normalizing the characteristics And inputting the generated sketch I _i2s generated by the countermeasure network, and extracting the characteristics of the I _i2s by using a VGG network to obtain sketch style characteristics f _Gi.

3. The method for AIGC federal learning for designer style fusion and privacy protection according to claim 2, wherein in step S12, the normalization process is expressed as:

；

4. The method for AIGC federal learning for designer style fusion and privacy protection according to claim 3, wherein the generating process of sketch I _i2s in step S13 specifically includes:

f _M4 is passed to the first layer convolution block of generator G _S, outputs f _G1 and f _M3 of the first layer convolution block are passed as inputs to the multi-resolution fusion module MRFM of the second layer of generator G _S, outputs f _G2 and f _M2 of the MRFM of the second layer are passed as inputs to the MRFM of the third layer of generator G _S, outputs f _G3 and f _M1 of the MRFM of the third layer are passed to the MRFM of the fourth layer of generator G _S, and output f _G4 of the MRFM of the fourth layer is passed as inputs to the final layer convolution block of generator G _S, and finally sketch I _i2s is output.

5. A designer style fusion and privacy protection oriented AIGC federal learning method according to claim 1 or 4, wherein the first layer convolution block is represented as:

；

The last layer of convolution blocks are expressed as:

；

wherein Cnov D represents a 2D convolution.

6. The method for AIGC federal learning for designer style fusion and privacy protection according to claim 1 or 4, wherein the output process of the multi-resolution fusion module MRFM of the second layer is:

；

wherein Cat represents stitching;

Finally, the characteristics are combined Added to the input feature f _M3:

；

wherein, the parameter epsilon is the weight of the feature f _M3;

7. A designer style fusion and privacy protection oriented AIGC federal learning method according to claim 1 or 3, wherein the generating a penalty function against the network in step S1 is:

；

8. The designer style fusion and privacy protection oriented AIGC federal learning method of claim 7,

The Gram matrix loss is expressed as:

；

The generation loss is expressed as:

；

The cosine similarity penalty is expressed as:

；

Said clipping loss Expressed as loss of CLIP model:

；

9. The AIGC federal learning method for designer style fusion and privacy protection according to claim 1, wherein the step S2 specifically includes:

10. The method for AIGC federal learning for designer style fusion and privacy protection according to claim 1, wherein the aggregation process in step S2 is optimized using the following formula:

；