CN115659852A - Layout generation method and device based on discrete potential representation - Google Patents


Publication number
CN115659852A
Authority
CN
China
Prior art keywords
layout
representation
model
discrete
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211671875.5A
Other languages
Chinese (zh)
Other versions
CN115659852B (en
Inventor
陈柳青
景千芝
孙凌云
甄焱鲲
周婷婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202211671875.5A priority Critical patent/CN115659852B/en
Publication of CN115659852A publication Critical patent/CN115659852A/en
Application granted granted Critical
Publication of CN115659852B publication Critical patent/CN115659852B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Document Processing Apparatus (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a layout generation method based on discrete latent representation, comprising the following steps: step 1, constructing a training set that comprises element frame sequences for generating layouts and the corresponding constraint conditions; step 2, constructing an element-constrained layout generation network comprising a feature extraction module, a discrete latent variable generation module and a reconstruction module; step 3, training the layout generation network on the training set to obtain a layout generation model; step 4, constructing a unidirectional Transformer model that takes the constraint conditions as input and outputs the corresponding discrete latent representation of the layout; and step 5, using the resulting unidirectional Transformer model as the entry point for the constraint conditions and feeding its output to the layout generation model to obtain an element frame sequence that satisfies the input constraint conditions. The invention also provides a layout generation device. The method outputs high-quality design layouts that meet graphic design requirements.

Description

Layout generation method and device based on discrete potential representation
Technical Field
The invention relates to the field of image generation, in particular to a layout generation method and device based on discrete potential representation.
Background
Graphic design is an important visual communication tool: it combines colorful images with concise, readable text into a specific, aesthetically oriented visual expression that attracts attention and conveys information. Layout design is the basis of graphic design. Its core task is to arrange the design elements to be displayed reasonably within a given canvas, and a designer usually does this by adjusting the size (width and height) and position (horizontal and vertical coordinates) of each element. In addition, so that the design conveys information quickly and accurately and attracts the user's attention, the designer usually considers the application scenario of the layout and the types of the design elements when arranging them. For example, the layout of a fashion magazine is flexible and varied, with images occupying a large area, whereas the layout of a science and technology magazine is neater and more rigorous and consists mainly of text.
The academic paper "Layout Generation and Completion with Self-attention" (2020) first discretizes the type information and geometric parameters of the design elements in a layout, then concatenates all element information into a sequence, learns the relationships among the element information with the self-attention mechanism of a Transformer model, predicts the remaining element information step by step from these relationships, and finally obtains the whole element information sequence, which gives a new layout. This scheme can generate a new layout from an empty sequence or from a sequence containing partial element information, and it extends to various layout generation tasks such as UI layout, document layout and spatial layout. However, the scheme does not accept specified constraints and depends heavily on the quality and size of the training set. As a result, the final model relies heavily on heuristic rules, and the diversity of its outputs cannot be guaranteed.
Patent document CN110706315A discloses a layout generation method, apparatus, electronic device and storage medium for graphic design. The method comprises: acquiring the element types in the design and the number of elements of each type, and randomly generating a plurality of initial layouts from the element types and element numbers; scoring each initial layout with a preset scoring rule and classifying it as a high-quality or low-quality layout according to the score; and training a preset generative adversarial network (GAN) on the high-quality layouts among the initial layouts, then obtaining new high-quality layouts from the trained GAN. This method cannot generate layouts for a given scenario, lacks practical applicability, suffers from posterior collapse, and cannot bring the training of the model to convergence.
Patent document CN1584930A discloses an image element layout apparatus, layout program and layout method, which calculate the arrangement intervals between the image elements to be laid out from the time differences acquired between them, and arrange the selected image elements along the path given by the selected path information. This method derives the spacing between image elements from their relationship in time, but arranging elements by time order alone leads to layer stacking or repeated placement.
Disclosure of Invention
To solve the above problems, the present invention provides a layout generation method based on discrete latent representation, which can output high-quality design layouts that meet graphic design requirements.
A method for generating a layout based on discrete potential representations, comprising:
step 1, constructing a training set, wherein the training set comprises an element frame sequence used for generating layout and corresponding constraint conditions, and the constraint conditions comprise an element category sequence and an application scene;
step 2, constructing an element-constrained layout generation network comprising a feature extraction module, a discrete latent variable generation module and a reconstruction module, wherein the feature extraction module comprises a self-attention encoder and projects the input element category sequence, element frame sequence and application scenario into a d-dimensional latent space to generate the corresponding layout latent representation; the discrete latent variable generation module discretizes the generated layout latent representation to obtain the corresponding layout discrete latent representation; and the reconstruction module outputs the element frame sequence corresponding to the real layout from the input element category sequence, the application scenario and the layout discrete latent representation;
step 3, training the layout generation network constructed in the step 2 by adopting a training set to obtain a layout generation model;
step 4, constructing a one-way Transformer model by taking the element category sequence and the application scene as input constraints, and training the one-way Transformer model by utilizing the training set and the layout discrete potential representation in the step 2 to obtain a layout discrete potential representation meeting the input constraint conditions;
step 5, using the unidirectional Transformer model trained in step 4 as the entry point for the constraint conditions, feeding the layout discrete latent representation that it outputs to the reconstruction module of the layout generation model, and decoding the layout discrete latent representation to obtain an element frame sequence that satisfies the input constraint conditions.
The invention provides a new LayoutVQ-VAE model, which generates layouts by learning discrete latent representations of layouts and reconstructs the element frame sequence with a non-autoregressive decoder, thereby obtaining the element frame sequence corresponding to the input constraints; a high-quality design layout is then generated from the obtained element frame sequence.
Specifically, in step 2, the expression formula of the self-attention encoder is as follows:
[Equation image not reproduced in this text: expression of the self-attention encoder.]
In the formula, the two embedding functions are multi-layer perceptrons, and the remaining symbols (images not reproduced) denote, in order: the frame parameters of a given element; the category of that element; the application scenario of the layout; the hidden representation of each input item; the positional embedding; a given learnable embedding; the hidden output corresponding to that learnable embedding; the number of layout heads; the parameters of the self-attention encoder; and the multi-head self-attention mechanism of the Transformer model.
Specifically, in step 2, the layout latent representation is expressed as follows:
[Equation image not reproduced in this text: expression of the layout latent representation.]
In the formula, the symbols (images not reproduced) denote the number of layout heads, the element frame sequence, the element category sequence, and the application scenario.
Specifically, in step 2, following VQ-VAE theory, the discrete latent variable generation module uses a mapping function to replace each layout latent representation with its nearest element in the mapping space. The mapping function is expressed as follows:
[Equation images not reproduced in this text: definition of the mapping function.]
In the formula, the symbols (images not reproduced) denote the layout discrete latent representation and the discretization operation.
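The nearest-element mapping described above can be illustrated with a minimal NumPy sketch. It assumes Euclidean distance and a codebook stored as a plain array, as in the standard VQ-VAE formulation; the function name `quantize` and the toy values are hypothetical and are not taken from the patent.

```python
import numpy as np

def quantize(latents: np.ndarray, codebook: np.ndarray):
    """Map each latent vector to its nearest codebook entry.

    latents:  (K, d) array, one row per layout head (assumed layout).
    codebook: (N, d) array of embedding vectors (the "mapping space").
    Returns (indices, quantized) with shapes (K,) and (K, d).
    """
    # Pairwise squared Euclidean distances, shape (K, N).
    diff = latents[:, None, :] - codebook[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)
    # Index of the closest codebook entry for every latent vector.
    indices = np.argmin(dist, axis=1)
    return indices, codebook[indices]

# Toy example: three codebook entries, two latent vectors.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
latents = np.array([[0.9, 1.2], [-0.1, 0.2]])
indices, quantized = quantize(latents, codebook)
```

The integer `indices` play the role of the discrete latent representation, while `quantized` is what a decoder would consume.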
Preferably, in step 2, the reconstruction module reconstructs the element frame sequence with a non-autoregressive decoder, so that the model better captures the relationships between preceding and following elements and the reconstructed layout is closer to the real layout.
Specifically, the non-autoregressive decoder is expressed as follows:
[Equation image not reproduced in this text: expression of the non-autoregressive decoder.]
In the formula, the symbols (images not reproduced) denote a reconstructed element frame parameter, the parameters of the non-autoregressive decoder, the hidden representation of each input item, the corresponding hidden output, the category of a given element, and the application scenario of the layout.
Specifically, in step 3, a cross-entropy function and a commitment loss are used to adjust the parameters of the layout generation network during training. The specific expression is as follows:
[Equation image not reproduced in this text: training loss of the layout generation network.]
In the formula, the symbols (images not reproduced) denote the reconstruction loss of the model computed with cross entropy, the weight coefficient of the commitment loss, the stop-gradient operator, and the reconstructed element frame sequence.
Specifically, the stop-gradient operator is expressed as follows:
[Equation image not reproduced in this text: definition of the stop-gradient operator.]
The invention also provides a layout generation device comprising a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor, wherein the computer memory stores the layout generation model and the unidirectional Transformer model described above;
when executing the computer program, the computer processor performs the following steps: inputting the element category sequence and the application scenario requirement of the graphic design into the unidirectional Transformer model, and feeding the layout discrete latent representation output by the unidirectional Transformer model to the reconstruction module of the layout generation model to obtain a high-quality design layout that satisfies the element category and scenario constraints.
Compared with the prior art, the invention has the beneficial effects that:
(1) A novel generative model is proposed that is capable of generating layouts that satisfy user constraints including design element labels (element types and quantities) inside the layout and application scenarios outside the layout.
(2) A pre-constructed unidirectional Transformer model supplies the generation stage with an accurate and comprehensive data distribution. This ensures the diversity of the generated layouts and also the quality of the layouts produced directly by the model, which avoids complex post-processing optimization and reduces the time and space complexity of the algorithm.
Drawings
FIG. 1 is an overall architecture of a layout generation model proposed by the present invention;
FIG. 2 is a diagram comparing the layout generation results of the layout generation model and the existing model under the constraint of the element category sequence;
FIG. 3 is a diagram comparing a layout reconstruction result of a layout generation model and an existing model under the constraint of an element category sequence;
FIG. 4 is a diagram comparing layout reconstruction results of a layout generation model under the constraints of different application scenarios;
FIG. 5 is a diagram illustrating a layout reconstruction result of a layout generation model under a constraint of an element category sequence and an application scenario.
Detailed Description
A graphic design layout consists of a series of design elements. To generate a new layout, we need to predict the geometric parameters of these elements (the position coordinates, width and height of each element) from the given constraints (the application scenario of the layout and the element labels). A layout can therefore be defined by its application scenario together with its elements, up to the number of elements in the layout (symbol images not reproduced in this text). Each element is described by its category (e.g., image or title) and by the center coordinates, width and height of its bounding box. In actual training, the geometric parameters of all element frames are concatenated into a sequence, and the parameter values are discretized with 7-bit uniform quantization. The constraints of the layout are padded with two repeated values to form a sequence of the same length as the element frame sequence; this sequence is the element category sequence.
For the sake of brevity, a single symbol (image not reproduced) denotes each item in the sequence.
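The 7-bit uniform quantization of the geometric parameters can be sketched as follows. This is a minimal illustration that assumes the center coordinates, width and height have already been normalized to the range [0, 1] by the canvas size; the normalization, the function names and the sample boxes are assumptions, not details given in the patent.

```python
import numpy as np

NUM_BITS = 7
NUM_BINS = 2 ** NUM_BITS  # 128 discrete levels, tokens 0..127

def quantize_boxes(boxes: np.ndarray) -> np.ndarray:
    """Quantize normalized (cx, cy, w, h) rows to integer tokens and
    concatenate all elements into one flat sequence."""
    clipped = np.clip(boxes, 0.0, 1.0)
    tokens = np.round(clipped * (NUM_BINS - 1)).astype(np.int64)
    return tokens.reshape(-1)

def dequantize_boxes(tokens: np.ndarray) -> np.ndarray:
    """Invert the quantization up to the rounding error."""
    return tokens.reshape(-1, 4).astype(np.float64) / (NUM_BINS - 1)

# Two hypothetical elements, each given as (cx, cy, w, h) in [0, 1].
boxes = np.array([[0.5, 0.25, 1.0, 0.0],
                  [0.1, 0.9, 0.3, 0.2]])
tokens = quantize_boxes(boxes)      # flat sequence of 8 integer tokens
restored = dequantize_boxes(tokens)
```

With 128 levels, the round trip changes each value by at most half a quantization step, which is why the discretized layouts tend to snap to a common grid.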
In order to solve the problem of generating a flat design drawing faced by the prior art, the present embodiment provides a layout generation method based on a discrete potential representation.
As shown in fig. 1, step 1, constructing a training set, including an element frame sequence for generating a layout and corresponding constraints, where the constraints include an element category sequence and an application scenario;
step 2, constructing an element-constrained layout generation network comprising a feature extraction module, a discrete latent variable generation module and a reconstruction module, wherein the feature extraction module comprises a self-attention encoder and projects the input element category sequence, element frame sequence and application scenario into a d-dimensional latent space to generate the corresponding layout latent representation; the discrete latent variable generation module discretizes the generated layout latent representation to obtain the corresponding layout discrete latent representation; and finally the reconstruction module outputs the corresponding element frame sequence of the real layout from the input element category sequence, the application scenario and the layout discrete latent representation;
The self-attention encoder first uses a multi-layer perceptron to project each input item into a d-dimensional space and adds the positional embedding to obtain the corresponding hidden input sequence. The final output of the encoder is restricted to the output vectors corresponding to the learnable embeddings; these vectors form the multi-head latent representation of the layout and contain the feature information of the whole layout. The expression is as follows:
[Equation image not reproduced in this text: expression of the self-attention encoder.]
In the formula, the two embedding functions are multi-layer perceptrons, and the remaining symbols (images not reproduced) denote, in order: the frame parameters of a given element; the category of that element; the application scenario of the layout; the hidden representation of each input item; the positional embedding; a given learnable embedding; the hidden output corresponding to that learnable embedding; the number of layout heads; the parameters of the self-attention encoder; and the multi-head self-attention mechanism of the Transformer model.
The layout latent representation output by the encoder is expressed as follows:
[Equation image not reproduced in this text: expression of the layout latent representation.]
In the formula, the symbols (images not reproduced) denote the number of layout heads, the element frame sequence, the element category sequence, and the application scenario.
Following VQ-VAE theory, the discrete latent variable generation module uses a mapping function to replace each layout latent representation with its nearest element in the mapping space. The mapping function is expressed as follows:
[Equation images not reproduced in this text: definition of the mapping function.]
In the formula, the symbols (images not reproduced) denote the layout discrete latent representation and the discretization operation.
The reconstruction module uses a non-autoregressive decoder to reconstruct the element frame sequence, which brings the reconstructed result closer to the real sequence. It outputs the reconstructed element frame sequence from the input layout discrete latent representation and the corresponding constraint conditions. The specific expression is as follows:
[Equation image not reproduced in this text: expression of the non-autoregressive decoder.]
In the formula, the symbols (images not reproduced) denote a reconstructed element frame parameter, the parameters of the non-autoregressive decoder, the hidden representation of each input item, the corresponding hidden output, the category of a given element, and the application scenario of the layout.
Step 3, training the layout generation network constructed in step 2 on the training set to obtain a layout generation model. During training, a cross-entropy function and a commitment loss are used to adjust the parameters of the layout generation network. The specific expression is as follows:
[Equation image not reproduced in this text: training loss of the layout generation network.]
In the formula, the symbols (images not reproduced) denote the reconstruction loss of the model computed with cross entropy, the weight coefficient of the commitment loss, the stop-gradient operator, and the reconstructed element frame sequence. The stop-gradient operator is expressed as follows:
[Equation image not reproduced in this text: definition of the stop-gradient operator.]
Therefore, the decoder is optimized only by the reconstruction loss, the encoder is optimized by both the reconstruction loss and the commitment loss, and the mapping space is optimized by an exponential moving average (EMA) algorithm.
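As an illustration of the commitment loss and the EMA update of the mapping space, the following NumPy sketch follows the standard VQ-VAE recipe, which is assumed here because the patent's own formula images are not available. The default weight `beta=0.25`, the decay value, and all function and variable names are hypothetical. NumPy has no automatic differentiation, so the stop-gradient appears only as a comment: in an autodiff framework the quantized vectors would be treated as constants so that the commitment term updates the encoder only.

```python
import numpy as np

def commitment_loss(z_e, z_q, beta=0.25):
    """Forward value of beta * ||z_e - sg(z_q)||^2, averaged over heads.
    z_q would be wrapped in a stop-gradient in an autodiff framework."""
    sq_dist = np.sum((z_e - z_q) ** 2, axis=-1)
    return beta * float(np.mean(sq_dist))

def ema_codebook_update(codebook, cluster_size, embed_sum, z_e, indices,
                        decay=0.99, eps=1e-5):
    """Move each codebook entry toward the running mean of the encoder
    outputs assigned to it, using exponential moving averages."""
    n_codes = codebook.shape[0]
    one_hot = np.eye(n_codes)[indices]   # (K, N) assignment matrix
    counts = one_hot.sum(axis=0)         # assignments per code
    sums = one_hot.T @ z_e               # summed encoder outputs per code
    cluster_size = decay * cluster_size + (1.0 - decay) * counts
    embed_sum = decay * embed_sum + (1.0 - decay) * sums
    new_codebook = embed_sum / np.maximum(cluster_size, eps)[:, None]
    return new_codebook, cluster_size, embed_sum

# Toy example: both encoder outputs are assigned to code 1.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
cluster_size = np.ones(2)
embed_sum = codebook.copy()
z_e = np.array([[0.8, 1.2], [1.2, 0.8]])
indices = np.array([1, 1])
loss = commitment_loss(z_e, codebook[indices])
codebook, cluster_size, embed_sum = ema_codebook_update(
    codebook, cluster_size, embed_sum, z_e, indices, decay=0.5)
```

Because the codebook is updated by moving averages rather than by gradients, no separate codebook loss term is needed in this formulation.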
Step 4, constructing a one-way Transformer model by taking the element category sequence and the application scene as input constraints, and training the one-way Transformer model by utilizing the training set and the layout discrete potential representation in the step 2 to obtain a layout discrete potential representation meeting the input constraint conditions;
The prior distribution of the discrete latent representation is defined as a uniform categorical distribution. Therefore, after the training of step 4 is completed, the unidirectional Transformer model is used to autoregressively predict the discrete latent representation of the layout. When the unidirectional Transformer model is trained, only its predictions of the discrete latent tokens need to be optimized; the outputs corresponding to the condition constraint representations are ignored. At generation time, a layout discrete latent representation that conforms to the input constraint conditions is sampled autoregressively, and this representation is then input into the reconstruction module together with the condition constraints to generate an element frame sequence that satisfies the constraint conditions, yielding a high-quality design layout.
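The autoregressive sampling of the discrete latent tokens can be sketched as a simple loop. In this minimal illustration, `logits_fn` stands in for the trained unidirectional Transformer, and `toy_logits` is a hypothetical placeholder that exists only to make the sketch runnable; neither name, nor the shared token vocabulary of the toy, comes from the patent.

```python
import numpy as np

def sample_latent_codes(logits_fn, constraint_tokens, num_heads, rng):
    """Sample `num_heads` discrete latent indices one at a time.

    logits_fn(prefix) returns unnormalized scores over the codebook for
    the next position, given the constraint tokens followed by the codes
    sampled so far. Only the latent positions are sampled; the
    constraint positions are used purely as context.
    """
    prefix = list(constraint_tokens)
    codes = []
    for _ in range(num_heads):
        logits = np.asarray(logits_fn(prefix), dtype=np.float64)
        probs = np.exp(logits - logits.max())  # numerically stable softmax
        probs /= probs.sum()
        code = int(rng.choice(len(probs), p=probs))
        codes.append(code)
        prefix.append(code)  # condition the next step on this sample
    return codes

def toy_logits(prefix, codebook_size=8):
    """Hypothetical stand-in for a trained model: strongly prefers the
    token (last token + 1) modulo the codebook size."""
    logits = np.zeros(codebook_size)
    logits[(prefix[-1] + 1) % codebook_size] = 50.0
    return logits

rng = np.random.default_rng(0)
codes = sample_latent_codes(toy_logits, constraint_tokens=[2, 5],
                            num_heads=4, rng=rng)
```

The sampled `codes` would then be looked up in the codebook and passed, together with the constraints, to the reconstruction module.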
Step 5, using the unidirectional Transformer model trained in step 4 as the entry point for the constraint conditions, feeding the layout discrete latent representation that it outputs to the reconstruction module of the layout generation model, and decoding the layout discrete latent representation to obtain an element frame sequence that satisfies the input constraint conditions.
The process of generating the layout of the design diagram based on the specified element bounding box sequence is known in the prior art, and therefore, the detailed description is omitted.
The present embodiment also provides a layout generation device comprising a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor. The computer memory stores the layout generation model and the unidirectional Transformer model proposed in the above embodiment, and when the computer processor executes the computer program, the following steps are implemented:
inputting the element category sequence and the application scenario requirement of the graphic design into the unidirectional Transformer model, and feeding the layout discrete latent representation output by the unidirectional Transformer model to the reconstruction module of the layout generation model to obtain a high-quality design layout that satisfies the element category and scenario constraints.
In order to illustrate the difference between the model and the existing model, the embodiment also provides the comparison and evaluation of the effect in the practical application process.
In the first case, the LayoutTransformer model and the LayoutGAN++ model are used as baselines to evaluate the layout generation model proposed in this embodiment on the layout generation task under the element category sequence constraint. The specific results are shown in Table 1.
[Table 1 image not reproduced in this text.]
As can be seen from Table 1, the layout generation model proposed in this embodiment achieves the best results on both the FID and Max IoU metrics, which shows that the discrete layout representation summarizes layout features better than the conventional continuous layout representation. In terms of aesthetic quality, the discretization of the element geometric parameters gives the final model a better alignment effect, so it also obtains the best alignment score.
Fig. 2 shows the layout generation results of the proposed layout generation model and of the two comparison models. The layouts generated by the LayoutGAN++ model suffer from element misalignment and element overlap because post-processing optimization is missing. The LayoutTransformer model uses a unidirectional Transformer, so it learns only the relationships among the elements predicted so far and cannot anticipate the types and number of elements that have not yet appeared; as a result, the elements in its layouts are unevenly distributed, and element overlap and large blank areas occur.
By contrast, the output of the layout generation model proposed in this embodiment is closer to real layouts: elements of various types are arranged reasonably and are well aligned. In addition, unlike the LayoutTransformer model, the proposed model uses the unidirectional Transformer only to generate the discrete latent representation of the layout and uses a bidirectional Transformer to decode that representation, so the relationships among all elements are modeled and the frame geometric parameters of all elements are predicted at the same time, which effectively resolves the problem of the LayoutTransformer model.
Since the LayoutTransformer model cannot perform layout reconstruction, the layout generation model is compared with the LayoutGAN++ model of example 1. The specific comparison results are shown in Table 2.
[Table 2 image not reproduced in this text.]
As can be seen from Table 2, the layouts produced by the layout generation model are closer to the real layouts both in feature distribution (evaluated with FID) and in element bounding-box distance (evaluated with Max IoU and a second metric whose name is missing from this text).
As shown in fig. 3, our model not only reconstructs the structure of the real layout but also accurately restores the detailed position and size of each element, whereas the LayoutGAN++ model captures the layout structure only roughly, is not accurate enough in predicting bounding boxes, and suffers from serious misalignment and overlap problems.
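For readers unfamiliar with the bounding-box distance used in Table 2, the following sketch computes the intersection over union (IoU) of two boxes given as center coordinates, width and height. It shows only the pairwise IoU; the matching of elements between a generated layout and a real layout, which a Max IoU metric additionally requires, is not shown, and the function name is hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes, each given as (cx, cy, w, h)."""
    # Convert center format to corner coordinates.
    ax0, ay0 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax1, ay1 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx0, by0 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx1, by1 = b[0] + b[2] / 2, b[1] + b[3] / 2
    # Width and height of the overlap region (zero if disjoint).
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

same = box_iou((0.5, 0.5, 1.0, 1.0), (0.5, 0.5, 1.0, 1.0))
shifted = box_iou((0.5, 0.5, 1.0, 1.0), (1.0, 0.5, 1.0, 1.0))
disjoint = box_iou((0.2, 0.2, 0.2, 0.2), (0.8, 0.8, 0.2, 0.2))
```

An IoU of 1 means the reconstructed box coincides with the real one, and 0 means the two boxes do not overlap at all.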
In the second case, since the layout generation under the constraint of the application scenario is not considered in the prior art, and the influence of the application scenario on the layout is difficult to quantitatively evaluate, in this embodiment, the performance of the layout generation model is qualitatively evaluated by comparing the layouts generated by the layout generation model under the constraint of the same element class sequence but different scenarios (based on PDCard and Magazine datasets).
Fig. 4 shows layout reconstruction results generated on the PDCard dataset, where scene one is a commodity recommendation scene, scene two is a commodity classified display scene, and scene three is a commodity search scene. Comparing the layouts in the same row clearly shows that, even with the same element category sequence, a layout suited to each scene can be generated:
in the layout applied to the merchandise recommendation scene, the image elements occupy a larger area because the picture representation can transfer merchandise information and attract consumers more quickly;
in the layout applied to the commodity classified display scene, the proportion of the description elements corresponding to the image elements is increased, so that the corresponding description elements can be seen when the image elements are seen;
whereas in layouts applied to merchandise search scenarios, the image elements typically occupy a smaller area, the information is presented primarily in text form, which can help consumers to further explore merchandise details.
Fig. 5 shows layout reconstruction results generated on the Magazine dataset. The layouts of science magazines and news magazines focus more on the structure and regularity of the text and call for a serious, rigorous appearance, whereas fashion magazines and food magazines lean toward entertainment and leisure and call for more creative imagery and unconventional layouts to catch the reader's eye.

Claims (8)

1. A method for generating a layout based on discrete potential representations, comprising:
step 1, constructing a training set, wherein the training set comprises an element frame sequence for generating layout and corresponding constraint conditions, and the constraint conditions comprise an element category sequence and an application scene;
step 2, constructing an element-constrained layout generation network comprising a feature extraction module, a discrete latent variable generation module and a reconstruction module, wherein the feature extraction module comprises a self-attention encoder and projects the input element category sequence, element frame sequence and application scenario into a d-dimensional latent space to generate the corresponding layout latent representation; the discrete latent variable generation module discretizes the generated layout latent representation to obtain the corresponding layout discrete latent representation; and the reconstruction module outputs the corresponding element frame sequence of the real layout from the input element category sequence, the application scenario and the layout discrete latent representation;
step 3, training the layout generation network constructed in the step 2 by adopting a training set to obtain a layout generation model;
step 4, constructing a one-way Transformer model by taking the element category sequence and the application scene as input constraints, and training the one-way Transformer model by utilizing the training set and the layout discrete potential representation in the step 2 to obtain a layout discrete potential representation meeting the input constraint conditions;
and 5, taking the unidirectional Transformer model obtained by training in the step 4 as an input end of a constraint condition, taking the layout discrete potential representation output by the unidirectional Transformer model as an input of a reconstruction module in the layout generation model, and decoding the layout discrete potential representation to obtain an element frame sequence meeting the input constraint condition.
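The two-stage inference flow of step 5 can be sketched as follows. This is only an illustration of the control flow under stated assumptions: `sample_codes`, `reconstruct`, `generate_layout`, the number of layout heads, the codebook size and the code-to-box mapping are all hypothetical stand-ins, not the trained models described in the claim.

```python
import random


def sample_codes(categories, scene, num_heads, codebook_size, rng):
    """Toy stand-in for the unidirectional Transformer prior.

    Emits one code index per layout head, left to right. A trained prior
    would compute each distribution from the constraints and the prefix;
    a uniform draw keeps this sketch runnable.
    """
    codes = []
    for _ in range(num_heads):
        # In a real model, (categories, scene) and the prefix `codes`
        # would condition this draw.
        codes.append(rng.randrange(codebook_size))
    return codes


def reconstruct(categories, scene, codes):
    """Toy stand-in for the reconstruction module: one box per category."""
    boxes = []
    for i, _category in enumerate(categories):
        code = codes[i % len(codes)]
        # Hypothetical mapping from a code to a normalized (x, y, w, h) box.
        x = (code % 4) / 4.0
        y = i / len(categories)
        boxes.append((x, y, 0.25, 1.0 / len(categories)))
    return boxes


def generate_layout(categories, scene, num_heads=4, codebook_size=16, seed=0):
    """Constraints -> discrete latent codes -> element bounding boxes."""
    rng = random.Random(seed)
    codes = sample_codes(categories, scene, num_heads, codebook_size, rng)
    return codes, reconstruct(categories, scene, codes)


codes, boxes = generate_layout(["image", "text", "button"], "search")
```

The point of the split is that the constraint-conditioned model only has to produce a short sequence of discrete codes, while the geometry is left to the reconstruction module.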
2. The method of claim 1, wherein in step 2 the self-attention encoder is expressed as follows:
[formula image 001]
where [formula image 002] and [formula image 003] denote multi-layer perceptrons; [formula image 004] denotes the bounding-box parameters of the [formula image 005]-th element; [formula image 006] denotes the category of the [formula image 007]-th element; [formula image 008] denotes the application scene of the [formula image 009]-th layout; [formula image 010] denotes the hidden representation of each entry; [formula image 011] denotes the position embedding; [formula image 012] denotes the [formula image 013]-th learnable embedding; [formula image 014] [formula image 015] denotes the hidden output corresponding to [formula image 016]; [formula image 017] denotes the number of layout heads; [formula image 018] denotes the parameters of the self-attention encoder; and [formula image 019] denotes the multi-head self-attention mechanism of the Transformer model.
3. The method of claim 1, wherein in step 2 the layout latent representation is expressed as follows:
[formula image 020]
where [formula image 021]; [formula image 022]; [formula image 023] denotes the number of layout heads; [formula image 024] denotes the element bounding-box sequence; [formula image 025] denotes the element category sequence; and [formula image 026] denotes the application scene.
4. The method of claim 1, wherein in step 2 the discrete latent variable generation module, following VQ-VAE theory, uses a mapping function to convert the layout latent representation into the space [formula image 027], the mapping function being expressed as follows:
[formula image 028]
[formula image 029]
where [formula image 030] denotes the layout discrete latent representation; [formula image 031] denotes the discretization; [formula image 032]; [formula image 033].
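The discretization in claim 4 refers to VQ-VAE theory. The standard VQ-VAE mapping replaces each continuous latent vector with its nearest codebook entry. The sketch below shows that nearest-neighbor lookup. It assumes a Euclidean distance and a small hand-written codebook, and the name `quantize` is hypothetical. The patent's own mapping function is given only as a formula image and may differ in detail.

```python
def quantize(latents, codebook):
    """Map each latent vector to the index and value of its nearest codebook entry."""
    indices, quantized = [], []
    for z in latents:
        # Squared Euclidean distance from z to every codebook vector.
        dists = [sum((a - b) ** 2 for a, b in zip(z, e)) for e in codebook]
        k = min(range(len(codebook)), key=dists.__getitem__)
        indices.append(k)
        quantized.append(codebook[k])
    return indices, quantized


# Hypothetical 2-dimensional codebook with three entries.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
# One latent vector per layout head.
latents = [[0.9, 0.8], [-0.2, 0.1], [-0.7, 1.2]]
indices, quantized = quantize(latents, codebook)
print(indices)  # [1, 0, 2]
```

The list of indices is the discrete representation that a sequence model can later be trained to predict.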
5. The method of claim 1, wherein in step 2 the reconstruction module reconstructs the element bounding-box sequence using a non-autoregressive decoder.
6. The method of claim 5, wherein the non-autoregressive decoder is expressed as follows:
[formula image 034]
where [formula image 035] denotes one of the reconstructed element bounding-box parameters; [formula image 036] denotes the parameters of the non-autoregressive decoder; [formula image 037] denotes the hidden representation of each entry; [formula image 038] denotes the hidden output corresponding to [formula image 037]; [formula image 039] denotes the category of the [formula image 040]-th element; and [formula image 041] denotes the application scene of the [formula image 040]-th layout.
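The decoder of claims 5 and 6 is read here as a non-autoregressive decoder, which is an assumption about the translated term. Such a decoder predicts every element's box in parallel, with no prediction depending on another. The sketch below assumes that each box coordinate is discretized into bins and scored by per-coordinate logits. The binning, the bin-center dequantization and the name `decode_boxes` are hypothetical and not taken from the patent.

```python
def decode_boxes(logits, num_bins):
    """Pick the most likely bin for every coordinate of every element independently.

    logits[i][c] is a list of `num_bins` scores for coordinate c (x, y, w, h)
    of element i. No choice depends on a previously decoded value, which is
    what makes the decoding non-autoregressive.
    """
    boxes = []
    for element_logits in logits:
        box = []
        for scores in element_logits:
            bin_index = max(range(num_bins), key=scores.__getitem__)
            # Convert the bin index back to a normalized coordinate (bin center).
            box.append((bin_index + 0.5) / num_bins)
        boxes.append(tuple(box))
    return boxes


# One element, four coordinates, four bins per coordinate (toy scores).
logits = [[
    [0.1, 2.0, 0.3, 0.0],  # x -> bin 1
    [1.5, 0.2, 0.1, 0.0],  # y -> bin 0
    [0.0, 0.0, 0.0, 3.0],  # w -> bin 3
    [0.0, 1.0, 0.0, 0.0],  # h -> bin 1
]]
boxes = decode_boxes(logits, num_bins=4)
print(boxes)  # [(0.375, 0.125, 0.875, 0.375)]
```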
7. The method of claim 1, wherein in step 3 a cross-entropy function and a commitment loss are used during training to adjust the parameters of the layout generation network, expressed as follows:
[formula image 042]
where [formula image 043] denotes the reconstruction loss of the model computed with cross-entropy; [formula image 044] denotes the weight coefficient of the commitment loss; [formula image 045] denotes the stop-gradient operator; and [formula image 046] denotes the reconstructed element bounding-box sequence.
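The objective in claim 7 combines a cross-entropy reconstruction loss with a commitment loss weighted by a coefficient and written with a stop-gradient operator. As a rough numerical illustration, the sketch below evaluates the forward value of the standard VQ-VAE objective. The codebook term, the default weight `beta=0.25` and all function names are assumptions borrowed from the usual VQ-VAE formulation. The patent's exact expression is given only as a formula image.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one softmax distribution against a target bin index."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[target]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_loss(coord_logits, coord_targets, latents, quantized, beta=0.25):
    """Forward value of a VQ-VAE style objective.

    reconstruction: mean cross-entropy over discretized box coordinates
    codebook term:  ||sg[z_e] - e||^2, whose gradient would reach only the codebook
    commitment:     beta * ||z_e - sg[e]||^2, whose gradient would reach only the encoder
    The stop-gradient operator sg leaves forward values unchanged, so both
    distance terms evaluate to the same number here.
    """
    rec = sum(cross_entropy(l, t) for l, t in zip(coord_logits, coord_targets))
    rec /= len(coord_targets)
    dist = sum(squared_distance(z, e) for z, e in zip(latents, quantized))
    dist /= len(latents)
    return rec + dist + beta * dist


# Toy inputs: one coordinate with two equally likely bins, one latent vector.
loss = vq_loss([[0.0, 0.0]], [0], [[1.0, 0.0]], [[0.0, 0.0]])
```

In an automatic-differentiation framework the stop-gradient would be implemented with a detach-style operation, so that the commitment term pulls the encoder output toward its chosen code without moving the code itself.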
8. A layout generation device comprising a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor, wherein the computer memory stores the layout generation model and the unidirectional Transformer model of claim 1, and the computer processor, when executing the computer program, performs the following step:
inputting the element category sequence and the application scene requirement of a graphic design drawing into the unidirectional Transformer model, and using the layout discrete latent representation output by the unidirectional Transformer model as the input of the reconstruction module in the layout generation model, to obtain a high-quality design layout satisfying the element category and scene constraints.
CN202211671875.5A 2022-12-26 2022-12-26 Layout generation method and device based on discrete potential representation Active CN115659852B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211671875.5A CN115659852B (en) 2022-12-26 2022-12-26 Layout generation method and device based on discrete potential representation

Publications (2)

Publication Number Publication Date
CN115659852A true CN115659852A (en) 2023-01-31
CN115659852B CN115659852B (en) 2023-03-21

Family

ID=85023162

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211671875.5A Active CN115659852B (en) 2022-12-26 2022-12-26 Layout generation method and device based on discrete potential representation

Country Status (1)

Country Link
CN (1) CN115659852B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018224690A1 (en) * 2017-06-09 2018-12-13 Deepmind Technologies Limited Generating discrete latent representations of input data items
CN109360232A (en) * 2018-09-10 2019-02-19 南京邮电大学 Indoor scene layout estimation method and device based on conditional generative adversarial network
CN112734873A (en) * 2020-12-31 2021-04-30 北京深尚科技有限公司 Image attribute editing method, device, equipment and medium for generative adversarial networks
CN113177633A (en) * 2021-04-20 2021-07-27 浙江大学 Deep decoupling time sequence prediction method
CN113393550A (en) * 2021-06-15 2021-09-14 杭州电子科技大学 Fashion garment design synthesis method guided by postures and textures
CN115169227A (en) * 2022-07-04 2022-10-11 四川大学 Design concept generation network construction method and concept scheme automatic generation method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Lan Hong; Liu Qinyi: "Scene graph-to-image generation model with graph attention network" (图注意力网络的场景图到图像生成模型) *

Also Published As

Publication number Publication date
CN115659852B (en) 2023-03-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant