CN116383660A - Component classification model training method and device - Google Patents

Component classification model training method and device

Info

Publication number
CN116383660A
CN116383660A
Authority
CN
China
Prior art keywords
component
vector
structure tree
classification model
information
Prior art date
Legal status
Pending
Application number
CN202310409354.0A
Other languages
Chinese (zh)
Inventor
孙子钧
刘洋
张天宇
吴通通
杨帆
赵薇
柳景明
Current Assignee
Beijing Kanyun Software Co ltd
Original Assignee
Beijing Kanyun Software Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Kanyun Software Co ltd filed Critical Beijing Kanyun Software Co ltd
Priority to CN202310409354.0A
Publication of CN116383660A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/36 Software reuse
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/38 Creation or generation of source code for implementing user interfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/75 Structural analysis for program understanding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

The specification provides a component classification model training method and device, wherein the component classification model training method comprises the following steps: acquiring a component structure tree, wherein component nodes in the component structure tree contain component attribute information; reconstructing the component structure tree to obtain a component table for recording component attribute information, and performing embedding processing on the component attribute information recorded in the component table to obtain a component embedding vector; inputting the component embedded vector to an initial component classification model for processing to obtain predicted component classification information corresponding to component nodes in the component structure tree; and adjusting parameters of the initial component classification model based on the reference component classification information and the prediction component classification information corresponding to the component nodes in the component structure tree until the component classification model meeting the training stop condition is obtained.

Description

Component classification model training method and device
Technical Field
The present disclosure relates to the field of machine learning technologies, and in particular, to a method and an apparatus for training a component classification model.
Background
With the development of Internet technology, more and more services are brought online, and applications and web pages, as the bridges that carry these services and interact with users, are designed by service providers to be as attractive and convenient as possible. When designing components for an application or a web page, a designer usually needs to build a unified component library in order to maintain design consistency, improve production efficiency and manage the design system during UI design, and to share the component library with other designers; this improves component design efficiency and makes the designed components easier to manage. However, before the designed components are stored in the library, they need to be identified and classified. In the prior art, this identification and classification is mostly done manually, which is inefficient and consumes considerable time and human resources, so an effective solution to this problem is needed.
Disclosure of Invention
In view of this, the present embodiments provide a component classification model training method. The present disclosure also relates to a component classification model training apparatus, a component classification method, a component classification apparatus, a computing device, and a computer-readable storage medium, for solving the technical drawbacks of the prior art.
According to a first aspect of embodiments of the present disclosure, there is provided a component classification model training method, including:
acquiring a component structure tree, wherein component nodes in the component structure tree contain component attribute information;
reconstructing the component structure tree to obtain a component table for recording component attribute information, and performing embedding processing on the component attribute information recorded in the component table to obtain a component embedding vector;
inputting the component embedded vector to an initial component classification model for processing to obtain predicted component classification information corresponding to component nodes in the component structure tree;
and adjusting parameters of the initial component classification model based on the reference component classification information and the prediction component classification information corresponding to the component nodes in the component structure tree until the component classification model meeting the training stop condition is obtained.
Optionally, the acquiring the component structure tree includes:
acquiring an initial component structure tree corresponding to a sample object;
performing structure detection on the initial component structure tree according to a training strategy of the initial component classification model;
taking the initial component structure tree as the component structure tree under the condition that the structure detection result meets the preset structure detection condition;
and under the condition that the structure detection result does not meet the preset structure detection condition, segmenting the initial component structure tree, and determining the component structure tree according to the segmentation result.
Optionally, the reconstructing the component structure tree to obtain a component table for recording component attribute information includes:
determining node attribute types corresponding to all the component nodes in the component structure tree according to the component attribute information contained in the component nodes in the component structure tree;
selecting an updating strategy for each component node in the component structure tree according to the node attribute type, and updating component attribute information contained in each component node by utilizing the updating strategy;
traversing the updated component attribute information contained in the component nodes in the component structure tree, and generating a component table for recording the updated component attribute information according to the traversing result.
Optionally, the generating the component table for recording the updated component attribute information according to the traversal result includes:
generating an initial component table for recording updated component attribute information according to the traversing result;
according to the model training strategy of the initial component classification model, inserting columns for recording component node level information and component node sequence information into the initial component table;
and generating a component table for recording the updated component attribute information according to the insertion result.
Optionally, the embedding processing is performed on the component attribute information recorded in the component table to obtain a component embedding vector, including:
respectively carrying out embedding processing on component attribute information corresponding to each row of tables in the component tables to obtain sub-component embedding vectors corresponding to each row of tables;
and merging the sub-component embedded vectors corresponding to each row of tables to obtain the component embedded vector.
Optionally, the determining of the sub-component embedded vector corresponding to any target row table in the component table includes:
determining target component attribute information corresponding to a target row table in the component table, and reading sequence identification information, hierarchy identification information, text information and characteristic information from the target component attribute information;
Respectively embedding the sequence identification information, the hierarchy identification information, the text information and the characteristic information to obtain a sequence identification vector, a hierarchy identification vector, a text vector and a characteristic vector;
and splicing the sequence identification vector, the hierarchy identification vector, the text vector and the feature vector to obtain a sub-component embedded vector corresponding to the target line table.
Optionally, the inputting the component embedded vector to an initial component classification model for processing to obtain predicted component classification information corresponding to component nodes in the component structure tree includes:
inputting the component embedded vector into the initial component classification model, and carrying out coding processing on the component embedded vector through an encoder in the initial component classification model to obtain a component coding vector;
determining a hidden state vector based on the component encoding vector, and taking the hidden state vector as a component classification vector;
and decoding the component classification vector through a decoder in the initial component classification model to obtain prediction component classification information corresponding to the component nodes in the component structure tree.
Optionally, the encoding the component embedded vector by an encoder in the initial component classification model to obtain a component encoded vector includes:
calculating the component embedding vector and the weight matrices by using an encoder in the initial component classification model to obtain a query vector, a key vector and a value vector;
and carrying out similarity score calculation according to the query vector, the key vector and the value vector, and determining a component coding vector containing the hierarchical dependency relationship and the node dependency relationship according to a calculation result.
Optionally, the adjusting parameters of the initial component classification model based on the reference component classification information and the predicted component classification information corresponding to the component nodes in the component structure tree until obtaining a component classification model meeting the training stop condition includes:
acquiring reference component classification information corresponding to component nodes in the component structure tree;
calculating the reference component classification information and the prediction component classification information according to a cross entropy loss function to obtain a target loss value;
and adjusting parameters of the initial component classification model by utilizing the target loss value until the component classification model meeting the training stopping condition is obtained.
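As a hedged illustration of this step, the sketch below (Python/PyTorch; the names model, optimizer, component_embeddings and reference_labels are assumptions rather than the patent's reference implementation) shows how a target loss value could be computed with a cross-entropy loss function and used to adjust the model parameters; such steps would be repeated until the training stop condition is met.

import torch.nn.functional as F

def train_step(model, optimizer, component_embeddings, reference_labels):
    """One hypothetical parameter-adjustment step for a component classifier.

    component_embeddings: (num_nodes, embed_dim) component embedding vectors
    reference_labels:     (num_nodes,) reference component classification ids
    """
    optimizer.zero_grad()
    logits = model(component_embeddings)              # predicted component classification information
    loss = F.cross_entropy(logits, reference_labels)  # target loss value
    loss.backward()
    optimizer.step()                                  # adjust the initial model's parameters
    return loss.item()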
According to a second aspect of embodiments of the present specification, there is provided a component classification method, comprising:
acquiring a target component structure tree corresponding to a target object;
inputting the target component structure tree into the component classification model trained by the above-described method for processing, so as to obtain component type information corresponding to each target component node in the target component structure tree;
and classifying the target components contained in the target object according to the component type information, and executing component processing tasks according to classification results.
According to a third aspect of embodiments of the present specification, there is provided a component classification model training apparatus, comprising:
an acquisition module configured to acquire a component structure tree, wherein component nodes in the component structure tree contain component attribute information;
the reconstruction module is configured to reconstruct the component structure tree to obtain a component table for recording component attribute information, and perform embedding processing on the component attribute information recorded in the component table to obtain a component embedding vector;
the processing module is configured to input the component embedding vector into an initial component classification model for processing to obtain prediction component classification information corresponding to component nodes in the component structure tree;
And the parameter adjusting module is configured to adjust parameters of the initial component classification model based on the reference component classification information and the prediction component classification information corresponding to the component nodes in the component structure tree until the component classification model meeting the training stop condition is obtained.
According to a fourth aspect of embodiments of the present specification, there is provided a component sorting apparatus comprising:
the acquisition structure tree module is configured to acquire a target component structure tree corresponding to the target object;
the input model module is configured to input the target component structure tree into the component classification model trained by the above-described method for processing, so as to obtain component type information corresponding to each target component node in the target component structure tree;
and the classifying component module is configured to classify the target components contained in the target object according to the component type information and execute component processing tasks according to classification results.
According to a fifth aspect of embodiments of the present specification, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions that, when executed, implement the steps of the component classification model training method or the component classification method.
According to a sixth aspect of embodiments of the present specification, there is provided a computer readable storage medium storing computer executable instructions which, when executed by a processor, implement the steps of the component classification model training method or component classification method.
In order to improve component classification efficiency, the component classification model training method provided in the present specification constructs an initial component classification model capable of identifying and classifying components. A component structure tree is first acquired, in which the component nodes contain component attribute information; the component structure tree is then reconstructed to obtain a component table recording the component attribute information, and the component attribute information recorded in the component table is embedded to obtain the component embedding vector corresponding to the component structure tree. The component embedding vector is further input into the initial component classification model for processing to obtain the predicted component classification information corresponding to the component nodes in the component structure tree, and the parameters of the initial component classification model are adjusted based on the reference component classification information and the predicted component classification information corresponding to the component nodes, completing one round of training; this is repeated until a component classification model satisfying the training stop condition is obtained. Classifying components by modeling in this way can effectively improve the efficiency of component identification and classification and facilitates use by downstream services.
Drawings
FIG. 1 is a schematic diagram of a training method for a component classification model according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a component classification model training method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a component structure tree in a training method of a component classification model according to an embodiment of the present disclosure;
FIG. 4 is a flow chart of a component classification method according to an embodiment of the present disclosure;
FIG. 5 is a process flow diagram of a component classification method according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a training device for component classification model according to an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a component classification apparatus according to an embodiment of the disclosure;
FIG. 8 is a block diagram of a computing device according to one embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present specification. However, the present specification can be implemented in many other ways than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; therefore, the present specification is not limited to the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, this information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of one or more embodiments of the present specification, "first" may also be referred to as "second", and similarly, "second" may also be referred to as "first". Depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to a determination".
First, terms related to one or more embodiments of the present specification will be explained.
Transformer model: a deep learning model that adopts an attention mechanism to differentially weight the importance of each part of the input data; it is widely applied to various natural language processing tasks.
In the present specification, a component classification model training method is provided, and the present specification relates to a component classification model training apparatus, a component classification method, a component classification apparatus, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.
Referring to the schematic diagram shown in FIG. 1, in order to improve component classification efficiency, the component classification model training method provided in the present specification constructs an initial component classification model capable of identifying and classifying components. A component structure tree is first acquired, in which the component nodes contain component attribute information; the component structure tree is then reconstructed to obtain a component table recording the component attribute information, and the component attribute information recorded in the component table is embedded to obtain the component embedding vector corresponding to the component structure tree. The component embedding vector is further input into the initial component classification model for processing to obtain the predicted component classification information corresponding to the component nodes in the component structure tree, and the parameters of the initial component classification model are adjusted based on the reference component classification information and the predicted component classification information corresponding to the component nodes, completing one round of training; this is repeated until a component classification model satisfying the training stop condition is obtained. Classifying components by modeling in this way can effectively improve the efficiency of component identification and classification and facilitates use by downstream services.
Fig. 2 shows a flowchart of a component classification model training method according to an embodiment of the present disclosure, specifically including the following steps:
step S202, obtaining a component structure tree, wherein component nodes in the component structure tree contain component attribute information.
The component classification model trained by the component classification model training method provided in this embodiment can be used to identify the type of any designed component, so that components with identified types can be conveniently written into a component library and associated components can be conveniently selected by type for use in the application stage. A component refers to a reusable element to be inserted into a user interaction page, including but not limited to design elements such as text boxes, shapes, icons and touch buttons, which is not limited in this embodiment.
In practical applications, a designer usually completes the design of a component in a component design draft and writes the component into a component library after the design is finished, so that it can be reused in user interaction page design. In the design stage, one design draft may contain multiple designed components, and the designed components generally have nesting or association relationships, that is, they are components that need to be used together: for the page design of adding goods to a shopping cart, for example, a designer usually designs a goods information display component, a goods price display component, a shopping cart component, a purchase component and the like at the same time, and these components have hierarchical relationships. However, after the structure tree is constructed, determining the component categories when writing the data is usually done manually, which consumes a lot of human resources. In order to save human resources and improve component classification efficiency, the component classification model training method provided in this embodiment models the component classification task, and the trained component classification model replaces manual component classification, which can effectively improve the efficiency of component identification and classification and facilitates use by downstream services.
Specifically, the component structure tree is a sample for training an initial component classification model, which is constructed according to a hierarchical relationship corresponding to components contained in a design draft, each component node is a node constructed by a component corresponding to a layer, and each component node contains component attribute information for recording the attribute of the component corresponding to the component node; correspondingly, the component attribute information specifically refers to attribute description information of a component corresponding to the component node, and the attribute description information includes component types, component attribute values, component hierarchical relationships and the like.
Based on the above, in the model training stage, in order to improve the model recognition precision and complete the type recognition of all the components contained in the design draft at the same time, the component structure tree serving as a sample can be acquired first, and the component nodes in the component structure tree contain component attribute information corresponding to the components, so that the model training can be completed by using the component attribute information.
Further, considering that the component structure tree acquired in the model training stage may not be a structure tree that can be directly applied, the acquired initial component structure tree may be segmented to obtain a component structure tree that meets the condition; in this embodiment, the specific implementation manner is as follows:
Acquiring an initial component structure tree corresponding to a sample object; performing structure detection on the initial component structure tree according to a training strategy of the initial component classification model; taking the initial component structure tree as the component structure tree under the condition that the structure detection result meets the preset structure detection condition; and under the condition that the structure detection result does not meet the preset structure detection condition, segmenting the initial component structure tree, and determining the component structure tree according to the segmentation result.
Specifically, the sample object specifically refers to a sample component design draft; correspondingly, the initial component structure tree specifically refers to a component structure tree generated based on the sample component design draft. Correspondingly, the training strategy specifically refers to the strategy to be observed in the training stage of the initial component classification model, which is used to limit the size of the model input. Correspondingly, the preset structure detection condition specifically refers to a condition for detecting whether an initial component structure tree can be directly used for model training, so as to avoid the problem that the initial structure tree has too many levels, exceeds the upper input limit of the model, and cannot be used.
Based on the above, after the initial component structure tree corresponding to the sample object is obtained, in order to avoid the problem that the initial component structure tree cannot be used, structure detection may first be performed on the initial component structure tree according to the training strategy of the initial component classification model. If the structure detection result meets the preset structure detection condition, the initial component structure tree can be used directly and is taken as the component structure tree; if the structure detection result does not meet the preset structure detection condition, the initial component structure tree is too large to be used directly, so the initial component structure tree can first be segmented, and the component structure tree is determined according to the segmentation result. In practical applications, when the initial component structure tree is segmented, the split point may be selected according to actual requirements, which is not limited in this embodiment.
For example, referring to the component structure tree shown in FIG. 3 (a), the root node is root, and three child nodes, C1_1, C1_2 and C1_3, are located below it; a C2_1 node is located below C1_3, and a C3_1 node is located below C2_1. When it is determined that the component structure tree is too large, the C1_3 node may be selected as the split point and the component structure tree split accordingly, yielding a new component structure tree in which C1_3 is the root node, its child node is C2_1, and the child node of C2_1 is C3_1, for use by downstream services.
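A minimal sketch of this detection-and-splitting idea follows (Python; the ComponentNode class, the node-count limit and the rule for choosing the split point are assumptions for illustration, since the split point is left to actual requirements):

class ComponentNode:
    """A hypothetical component node holding attribute information and child nodes."""
    def __init__(self, name, attrs=None, children=None):
        self.name = name
        self.attrs = attrs or {}
        self.children = children or []

def tree_size(node):
    return 1 + sum(tree_size(child) for child in node.children)

def split_if_needed(root, max_nodes):
    """Return component structure trees that each satisfy the assumed detection condition."""
    if tree_size(root) <= max_nodes:
        return [root]                                    # detection condition met: use the tree as-is
    split_child = max(root.children, key=tree_size)      # pick a split point (largest child subtree)
    root.children.remove(split_child)
    return split_if_needed(root, max_nodes) + split_if_needed(split_child, max_nodes)

# Splitting the tree of FIG. 3 (a) at C1_3 yields a second tree rooted at C1_3,
# whose child is C2_1, whose child in turn is C3_1.
root = ComponentNode("root", children=[
    ComponentNode("C1_1"), ComponentNode("C1_2"),
    ComponentNode("C1_3", children=[ComponentNode("C2_1", children=[ComponentNode("C3_1")])]),
])
print([tree.name for tree in split_if_needed(root, max_nodes=3)])   # ['root', 'C1_3']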
In summary, after the initial component structure tree is obtained, the component structure tree may be segmented according to actual requirements in order to facilitate downstream service usage, so as to obtain a component structure tree that can be used for model training, thereby improving model training accuracy.
Step S204, reconstructing the component structure tree to obtain a component table for recording component attribute information, and performing embedding processing on the component attribute information recorded in the component table to obtain a component embedding vector.
Specifically, after the component structure tree is obtained, where the component nodes in the component structure tree contain component attribute information, and in order to enable the model to learn, in the model training stage, to complete component classification in combination with the hierarchical relationships among the component nodes, the component structure tree can first be reconstructed to obtain a component table recording the component attribute information, in which the component attribute information of the component corresponding to each component node is recorded. The component attribute information recorded in the table is then embedded on the basis of the table to obtain the component embedding vector corresponding to the component structure tree, so that the downstream service can train the initial component classification model with the component embedding vector of the component structure tree, the model can learn the component hierarchical relationships and the dependency relationships among the components, and the component classification accuracy of the component classification model is improved.
The component table specifically refers to a table in which each row corresponds to each component node, and each column in each row records component attribute information; accordingly, the embedding process specifically refers to a process of converting component attribute information in the component table into a low-dimensional vector for use in a subsequent model completion prediction.
Further, when the component structure tree is reconstructed, the component attribute information of different types is updated to obtain the component attribute information which can be recorded in the table; in this embodiment, the specific implementation manner is as follows:
determining node attribute types corresponding to all the component nodes in the component structure tree according to the component attribute information contained in the component nodes in the component structure tree; selecting an updating strategy for each component node in the component structure tree according to the node attribute type, and updating component attribute information contained in each component node by utilizing the updating strategy; traversing the updated component attribute information contained in the component nodes in the component structure tree, and generating a component table for recording the updated component attribute information according to the traversing result.
Specifically, the node attribute type specifically refers to the type corresponding to the component node, including but not limited to the text type, the numeric type and the enumeration type; correspondingly, the update policy specifically refers to a policy selected in combination with the node attribute type, which converts the enumeration type into the numeric type while leaving the numeric type and the text type unchanged.
Based on this, after the component structure tree is obtained, in order to be able to use it to train the initial component classification model, the node attribute type corresponding to each component node in the component structure tree may be determined according to the component attribute information contained in the component nodes. An update strategy is then selected for each component node in the component structure tree according to its node attribute type: component nodes of the text type and the numeric type are left unchanged, while component nodes of the enumeration type are converted into the numeric type, and the component attribute information contained in each component node is updated using this update strategy. This converts the component node types into text and numeric types and avoids the more complicated records that enumeration types would require. The updated component attribute information contained in the component nodes in the component structure tree is then traversed, so that a component table recording the updated component attribute information can be generated according to the traversal result.
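The following Python sketch shows one possible form of this update strategy (the attribute names and the per-attribute enumeration mapping are assumptions; the concrete map min/middle/max to 1/2/3 is taken from the worked example further below):

ENUM_MAPS = {
    "attr_t": {"min": 1, "middle": 2, "max": 3},   # assumed enumeration attribute and mapping
}

def update_attributes(attrs):
    """Apply the update strategy: keep text/numeric values, map enumeration values to numbers."""
    updated = {}
    for name, value in attrs.items():
        if name in ENUM_MAPS:                      # enumeration type -> numeric type
            updated[name] = ENUM_MAPS[name][value]
        else:                                      # text and numeric types remain unchanged
            updated[name] = value
    return updated

print(update_attributes({"text": "Buy now", "width": 120, "attr_t": "middle"}))
# -> {'text': 'Buy now', 'width': 120, 'attr_t': 2}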
In summary, by unifying the component node types into types that are convenient to record, the component attribute information corresponding to each component node can be recorded in its own row when the component table is generated, which facilitates the subsequent generation of the embedding vectors.
Furthermore, after the traversal, in order to fuse the component hierarchy relationship and the component dependency relationship in the model training stage so that the model can fully learn them and thereby improve model training precision, new columns can be inserted into the table; in this embodiment, the specific implementation manner is as follows:
generating an initial component table for recording updated component attribute information according to the traversing result; according to the model training strategy of the initial component classification model, inserting columns for recording component node level information and component node sequence information into the initial component table; and generating a component table for recording the updated component attribute information according to the insertion result.
Specifically, the initial component table specifically refers to a component table generated according to the updated component attribute information, which records only each node's own attributes and does not record the hierarchical relationships between nodes; correspondingly, the model training strategy specifically refers to the information strategy that needs to be followed when training the model, which is used, in the sample preprocessing stage, to process samples into a form that meets the model training requirements; correspondingly, the component node level information specifically refers to information recording the hierarchical relationships among the component nodes in the component structure tree; correspondingly, the component node sequence information specifically refers to information recording the traversal order of the component nodes in the component structure tree.
Based on the above, after traversing the updated component attribute information contained in the component nodes in the component structure tree, an initial component table recording the updated component attribute information can be generated according to the traversal result. At this point, the hierarchical relationships and the sequential relationships between the nodes are not yet recorded in the table, and these relationships are the basis on which the trained model can complete component classification in combination with the context; therefore, columns recording the component node level information and the component node sequence information can be inserted into the initial component table according to the model training strategy of the initial component classification model, and the component table recording the updated component attribute information is generated according to the insertion result.
Following the above example, after the component structure tree shown in FIG. 3 (a) is obtained, it can be determined that its root node is root, that three child nodes C1_1, C1_2 and C1_3 are located under the root node, that C2_1 is located under C1_3, and that C3_1 is located under C2_1, where each component node corresponds to a respective attribute list and the attributes of the component nodes fall into three types, namely text, numeric and enumeration. Text attribute information can be kept directly in the text attribute field, and numeric attribute information can likewise be kept directly. Enumeration attribute information, however, needs to be mapped to the numeric type: when the value of an original attribute attr_t is one of (min, middle, max), applying the mapping map = {min: 1, middle: 2, max: 3} converts the enumeration type into the numeric type.
At this time, each component node has a corresponding attribute list, the root node is named root, and each child node is named according to the rule C{level}_{order}; that is, the name and attribute list of each component node are determined as shown in FIG. 3 (a), where C1_2 denotes the 2nd (order) node of the first layer (level) in the component structure tree. The other nodes can be explained by analogy with C1_2, which is not limited herein.
Further, after the component structure tree with processed attribute information is obtained, a breadth-first traversal algorithm may be used to convert the component structure tree into a component table in breadth-first order, where each row in the component table corresponds to one component node and each column corresponds to attribute information of one dimension. Traversing the component structure tree shown in FIG. 3 (a) in this way yields the component table shown in the following table (1):
[Table (1) is rendered as an image in the original publication: one row per component node, with columns nid, pid, oid, text and attr_1 to attr_N.]
The nid column records the component node identifier, the pid column records the level of the component node in the component structure tree, and the oid column records the globally unique sequence id of the component node, so the hierarchy information of the original component structure tree is preserved through pid and oid. The text column records the text content in the component node attribute list, and the attr_1 to attr_N columns record the component node attribute information in different dimensions, which facilitates the subsequent embedding processing using the component table shown in table (1) for training the component classification model.
In summary, by reconstructing the component structure tree into the component table, the hierarchy information and the global sequence information of the component nodes can be recorded through the component table, so that this information is carried into the embedding vectors constructed later and can be learned in the model training stage, thereby improving model prediction precision.
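A breadth-first sketch of this reconstruction is given below (Python; the dictionary node format is an assumption, while the column names nid, pid, oid, text and attr_1 to attr_N follow the description of table (1)):

from collections import deque

def tree_to_component_table(root):
    """Convert a component structure tree into component-table rows by breadth-first traversal.

    root is assumed to be {"name": str, "attrs": dict, "children": list}; each row
    records nid, pid (level), oid (global traversal order), text and the remaining
    attribute columns.
    """
    rows, queue, oid = [], deque([(root, 0)]), 0
    while queue:
        node, level = queue.popleft()
        attrs = node.get("attrs", {})
        row = {"nid": node["name"], "pid": level, "oid": oid, "text": attrs.get("text", "")}
        row.update({key: value for key, value in attrs.items() if key != "text"})
        rows.append(row)
        oid += 1
        for child in node.get("children", []):
            queue.append((child, level + 1))
    return rows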
After the component table is obtained, embedding each row of table in the component table respectively to obtain component embedded vectors corresponding to the component structure tree; in this embodiment, the specific implementation manner is as follows:
respectively carrying out embedding processing on component attribute information corresponding to each row of tables in the component tables to obtain sub-component embedding vectors corresponding to each row of tables; and merging the sub-component embedded vectors corresponding to each row of tables to obtain the component embedded vector.
Specifically, the sub-component embedded vector specifically refers to a vector expression obtained after embedding component attribute information corresponding to each row of tables in the component tables. Based on the above, after obtaining the component table recording the component attribute information, the component attribute information corresponding to each row of table in the component table can be respectively embedded, so as to obtain the sub-component embedded vector corresponding to each row of table; and then merging the sub-component embedded vectors corresponding to each row of the tables to obtain the component embedded vector corresponding to the component structure tree, so as to facilitate the use of downstream services.
The determining of the sub-component embedded vector corresponding to any target row table in the component table comprises the following steps: determining target component attribute information corresponding to a target row table in the component table, and reading sequence identification information, hierarchy identification information, text information and characteristic information from the target component attribute information; respectively embedding the sequence identification information, the hierarchy identification information, the text information and the characteristic information to obtain a sequence identification vector, a hierarchy identification vector, a text vector and a characteristic vector; and splicing the sequence identification vector, the hierarchy identification vector, the text vector and the feature vector to obtain a sub-component embedded vector corresponding to the target line table.
Specifically, the target row table refers to a row table corresponding to any one of the component nodes in the component table, and in this embodiment, only one target row table is used as an example to describe the sub-embedded vector, and the other may refer to the same or corresponding description in this embodiment, which is not repeated herein. Correspondingly, the sequence identification information specifically refers to the global unique sequence identification of the corresponding component node of the target row table; correspondingly, the hierarchy identification information specifically refers to the identification of the hierarchy to which the component node belongs in the component structure tree; correspondingly, the text information specifically refers to text characters in the component attribute information; correspondingly, the feature vector specifically refers to attribute features corresponding to the component nodes in different dimensions.
Based on the above, firstly, the target component attribute information corresponding to the target row table can be determined in the component table, and secondly, the sequence identification information, the hierarchy identification information, the text information and the characteristic information are read from the target component attribute information; respectively embedding the sequence identification information, the hierarchy identification information, the text information and the characteristic information to obtain a sequence identification vector corresponding to the sequence identification information, a hierarchy identification vector corresponding to the hierarchy identification information, a text vector corresponding to the text information and a characteristic vector corresponding to the characteristic information; and finally, splicing the sequence identification vector, the hierarchy identification vector, the text vector and the feature vector to obtain the sub-component embedded vector corresponding to the target line table.
Following the above example, after the component table shown in table (1) is obtained, the embedding processing operation can be performed in the manner shown in FIG. 3 (b). Namely: the pid corresponding to each component node is selected from the component table, and a set mapping function is used in the L_E stage to perform vector mapping on the pid value of each component node, giving each component node a vector LevelEmbed of length 768 in the L_E stage; the oid corresponding to each component node is then selected from the component table, and the set mapping function is used in the O_E stage to perform vector mapping on the oid value of each component node, giving each component node a vector OrderEmbed of length 768 in the O_E stage; the text corresponding to each component node is selected from the component table, the characters are encoded with a BERT model in the T_E stage, and the hidden state vectors of the last layer of the BERT model are averaged, giving each component node a vector TextEmbed of length 768 in the T_E stage; and attr_1 to attr_N corresponding to each component node are selected from the component table, and a fully connected layer is used in the A_E stage to perform vector mapping on the attr_1 to attr_N values of each component node, giving each component node a vector AttributeEmbed of length 768 in the A_E stage.
At this time, the text information text of each component node is encoded into a vector TextEmbed of length 768 through a mapping function; the hierarchy information pid and oid is encoded into two vectors, LevelEmbed and OrderEmbed, each of length 768; the attribute information attr_1 to attr_N is encoded into an AttributeEmbed of length 768; and finally, the TextEmbed, OrderEmbed, LevelEmbed and AttributeEmbed corresponding to each component node are spliced to obtain the embedding vector InputEmbedding corresponding to the component table, which facilitates the subsequent training and use of the initial component classification model. In practical applications, the set function that converts oid into a vector is in fact a function that maps the information into 768 dimensions.
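A hedged PyTorch sketch of this splicing step is shown below; the module sizes (maximum level and order counts, number of attribute columns) are assumptions, and a linear stub stands in for the averaged BERT hidden states so that the example remains self-contained:

import torch
import torch.nn as nn

class NodeEmbedder(nn.Module):
    """Splice TextEmbed, OrderEmbed, LevelEmbed and AttributeEmbed into one node embedding."""
    def __init__(self, dim=768, max_levels=32, max_order=512, num_attrs=8):
        super().__init__()
        self.level_embed = nn.Embedding(max_levels, dim)   # pid -> LevelEmbed (L_E stage)
        self.order_embed = nn.Embedding(max_order, dim)    # oid -> OrderEmbed (O_E stage)
        self.attr_embed = nn.Linear(num_attrs, dim)        # attr_1..attr_N -> AttributeEmbed (A_E stage)
        self.text_embed = nn.Linear(dim, dim)              # stand-in for averaged BERT states (T_E stage)

    def forward(self, pid, oid, text_states, attr_values):
        text = self.text_embed(text_states)                # (nodes, 768)
        order = self.order_embed(oid)                      # (nodes, 768)
        level = self.level_embed(pid)                      # (nodes, 768)
        attrs = self.attr_embed(attr_values)               # (nodes, 768)
        return torch.cat([text, order, level, attrs], dim=-1)   # InputEmbedding, (nodes, 4*768)

embedder = NodeEmbedder()
out = embedder(torch.arange(6) % 32, torch.arange(6), torch.randn(6, 768), torch.randn(6, 8))
print(out.shape)   # torch.Size([6, 3072])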
In summary, by performing embedding processing by combining component attribute information of different component nodes in the component table and fusing the sub-component embedded vectors obtained by the embedding processing, generation of the embedded vectors can be completed by fusing component attribute information, component level information and component sequence information, and when a model is trained subsequently, the model can learn context information to complete component classification by combining the information, so that higher component classification accuracy can be ensured.
Step S206, inputting the component embedding vector into an initial component classification model for processing to obtain prediction component classification information corresponding to the component nodes in the component structure tree.
Specifically, after the component embedded vector corresponding to the component structure tree is obtained, the component embedded vector can be input into the initial component classification model for processing at the moment, so that the initial component classification model is used as a classifier to classify components corresponding to each component node in the component structure tree, and predicted component classification information predicted by the initial component classification model is obtained, so that the label corresponding to the component structure tree can be combined conveniently, and parameter adjustment of the model can be completed.
The initial component classification model specifically refers to a model capable of classifying the components corresponding to each component node in the component structure tree, and the model is implemented using a Transformer. Correspondingly, the predicted component classification information specifically refers to the category information obtained by identifying the components corresponding to each component node in the component structure tree, including but not limited to text box category information, shape category information, icon category information, button category information, and the like.
Further, when the initial component classification model recognizes the component category information corresponding to the component nodes in the component structure tree, the processing is actually completed by a model with an encoder-decoder structure; in this embodiment, the specific implementation manner is as follows:
inputting the component embedded vector into the initial component classification model, and carrying out coding processing on the component embedded vector through an encoder in the initial component classification model to obtain a component coding vector; determining a hidden state vector based on the component encoding vector, and taking the hidden state vector as a component classification vector; and decoding the component classification vector through a decoder in the initial component classification model to obtain prediction component classification information corresponding to the component nodes in the component structure tree.
Specifically, the component encoding vector specifically refers to the vector expression obtained by encoding the component embedding vector with the encoder in the initial component classification model, and is used to extract the features of the component corresponding to each component node; correspondingly, the hidden state vector specifically refers to the vector expression formed by selecting the first-dimension elements of the component encoding vector, which fuses the hierarchical relationships among the component nodes, the dependency relationships between nodes, and the like.
Based on the above, after obtaining the component embedded vector corresponding to the component structure tree, the component embedded vector may be input to the initial component classification model first, so that the component embedded vector may be encoded by an encoder in the initial component classification model, and then a component encoded vector may be obtained; in order to classify the components corresponding to each component node in the component structure tree, the hidden state vector can be determined based on the component coding vector, and the hidden state vector is used as a component classification vector; and finally, decoding the component classification vector through a decoder in the initial component classification model to obtain the predicted component classification information corresponding to the component nodes in the component structure tree.
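To make this encode-then-classify flow concrete, here is a hedged PyTorch sketch; the dimensions, layer counts and the use of a plain TransformerEncoder plus a linear head in place of the encoder/decoder pair described above are assumptions:

import torch
import torch.nn as nn

class ComponentClassifier(nn.Module):
    """Encode spliced node embeddings and decode per-node component class logits."""
    def __init__(self, embed_dim=3072, model_dim=768, num_classes=4, num_layers=2):
        super().__init__()
        self.proj = nn.Linear(embed_dim, model_dim)          # project InputEmbedding to model size
        layer = nn.TransformerEncoderLayer(d_model=model_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(model_dim, num_classes)        # decoding / classification head

    def forward(self, input_embedding):
        # input_embedding: (batch, num_nodes, embed_dim)
        encoded = self.encoder(self.proj(input_embedding))   # component encoding vectors
        return self.head(encoded)                            # predicted classification per component node

model = ComponentClassifier()
print(model(torch.randn(1, 5, 3072)).shape)   # torch.Size([1, 5, 4])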
Furthermore, the encoding process is completed in combination with a multi-head attention mechanism; in this embodiment, the specific implementation manner is as follows:
calculating the embedded coding vector and the weight matrix by using an encoder in the initial component classification model to obtain a query vector, a key vector and a value vector; and carrying out similarity score calculation according to the query vector, the key vector and the value vector, and determining a component coding vector containing the hierarchical dependency relationship and the node dependency relationship according to a calculation result.
Based on this, in the encoding stage with the encoder in the initial component classification model, the embedded encoding vector and the weight matrix may be calculated with the encoder in the initial component classification model to obtain a query vector, a key vector, and a value vector; and then, similarity score calculation is carried out according to the query vector, the key vector and the value vector, so that component coding vectors containing the hierarchical dependency relationship and the node dependency relationship can be determined according to calculation results.
That is, when using the Transformers model as the component classification model, the core self-attention module of the Transformers model may be used. The self-attention mechanism may treat each element in the input sequence (such as each word) as a query, a key, and a value, calculate the similarity scores between them, and use these scores to compute a weighted sum of all the values to obtain the output result, i.e., the component coding vector.
In an implementation, the input sequence may be represented as a matrix X (the component embedded vector), in which each row represents an element (such as a word) and each column represents a dimension of the input vector. Vector representations of the query, key and value are then calculated by multiplying the matrix X by three weight matrices (for the query, key and value, respectively), namely: Q = XW^Q, K = XW^K, V = XW^V, where W^Q, W^K and W^V are learned parameter matrices.
On this basis, the similarity score between a query vector and a key vector can be calculated by computing the dot product between them and dividing the result by a scalar. Specifically, assume that there is a query vector q and a key vector k; their dot-product score is score(q, k) = q · k, and the subsequent calculations proceed by analogy. This implementation can capture long-range dependencies through the attention mechanism, can cope with dependencies at different positions, and can better model the hierarchical relationship and the information of parent and child nodes.
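For illustration only, the scaled dot-product form of this computation might look like the sketch below; dividing by the square root of the key dimension is a common choice standing in for the "scalar" mentioned above, and the weight matrices here are random placeholders rather than trained parameters.

```python
import math
import torch

def self_attention(X, W_q, W_k, W_v):
    Q = X @ W_q                                   # query vectors
    K = X @ W_k                                   # key vectors
    V = X @ W_v                                   # value vectors
    scores = Q @ K.transpose(-2, -1)              # dot-product similarity score(q, k)
    scores = scores / math.sqrt(K.size(-1))       # divide by a scalar
    weights = torch.softmax(scores, dim=-1)       # normalized attention weights
    return weights @ V                            # weighted sum of values -> component encoding

X = torch.randn(5, 64)                            # 5 component nodes, 64-dim embeddings
W_q, W_k, W_v = (torch.randn(64, 64) for _ in range(3))
component_encoding = self_attention(X, W_q, W_k, W_v)   # shape (5, 64)
```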
Along the above example, after the embedded vector inputEmbedding is obtained, the embedded vector inputEmbedding corresponding to the component structure tree may be input to the Transformers model for processing, and the processing determines that the component type corresponding to the C1_1 component node in the component structure tree is a text box, the component type corresponding to the C1_2 component node is a text box, the component type corresponding to the C1_3 component node is an icon, the component type corresponding to the C2_1 component node is a button, and the component type corresponding to the C3_1 component node is a button, so that the Transformers model can then be parameter-adjusted in combination with the sample label.
In sum, by adopting the Transformers model with the encoding and decoding structures as the component classification model, the hierarchical relationship and the dependency relationship between the component nodes can be fused in the component classification stage, so that the model can learn to combine the hierarchical relationship to finish the component classification, and the model classification precision is effectively improved.
And step S208, performing parameter adjustment on the initial component classification model based on the reference component classification information and the prediction component classification information corresponding to the component nodes in the component structure tree until a component classification model meeting the training stop condition is obtained.
Specifically, after the predicted component classification information corresponding to each component node in the component structure tree is obtained, further, in order to train a model meeting the use requirement, the initial component classification model may be adjusted based on the reference component classification information and the predicted component classification information corresponding to the component node in the component structure tree until a component classification model meeting the training stop condition is obtained.
The reference component classification information specifically refers to real component classification information corresponding to component nodes in the component structure tree, and the corresponding training stop condition specifically refers to a condition for stopping training of the initial component classification model, including, but not limited to, a loss value comparison condition, an iteration number condition, or a verification condition, which is not limited in this embodiment.
Furthermore, in the parameter adjustment stage, the calculation of the loss value can be completed by combining the prediction result and the label, so that the model is adjusted; in this embodiment, the specific implementation manner is as follows:
acquiring reference component classification information corresponding to component nodes in the component structure tree; calculating the reference component classification information and the prediction component classification information according to a cross entropy loss function to obtain a target loss value; and adjusting parameters of the initial component classification model by utilizing the target loss value until the component classification model meeting the training stopping condition is obtained.
Based on this, in the parameter adjustment stage, the reference component classification information corresponding to the component nodes in the component structure tree can be acquired first; the reference component classification information and the prediction component classification information are then calculated according to the cross entropy loss function to obtain a target loss value; at this time, the initial component classification model can be subjected to parameter adjustment by utilizing the target loss value, and if the training stop condition is still not met after parameter adjustment, a new sample can be selected to continue training until the component classification model meeting the training stop condition is obtained.
That is, for the last layer of the Transformers model, the hidden state vector of the component coding vector can be taken as a classification vector, then the classification vector is processed through the softmax layer, so that the predicted component classification information corresponding to the component nodes in the component structure tree can be obtained, at this time, the reference component classification information corresponding to the component nodes in the component structure tree and the predicted component classification information corresponding to the component nodes in the component structure tree are utilized, a cross entropy loss function is calculated, and the model is subjected to parameter adjustment according to the calculated loss value until the component classification model meeting the training stop condition is obtained.
According to the method, after the prediction classification information corresponding to each component node in the component structure tree is obtained, the reference classification information corresponding to each component node in the component structure tree can be read, the loss value is calculated with the cross entropy loss function, and the model is parameter-adjusted with the calculated loss value to obtain an intermediate component classification model; a new sample can then be selected to continue training until a model whose loss value is smaller than the loss value threshold is obtained as the component classification model.
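A hedged sketch of one such parameter-adjustment step is shown below, reusing the illustrative ComponentClassifier from the earlier sketch; the optimizer, learning rate and loss threshold are placeholders rather than values prescribed by this specification.

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, component_embeddings, reference_labels):
    logits = model(component_embeddings)                   # (batch, nodes, num_classes)
    loss = F.cross_entropy(logits.flatten(0, 1),           # cross entropy between prediction
                           reference_labels.flatten())     # and reference classification
    optimizer.zero_grad()
    loss.backward()                                        # back-propagate the target loss value
    optimizer.step()                                       # parameter adjustment
    return loss.item()

model = ComponentClassifier()                              # illustrative model from the earlier sketch
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
labels = torch.randint(0, 4, (1, 5))                       # reference class per component node
loss_value = train_step(model, optimizer, torch.randn(1, 5, 128), labels)
# training continues with new samples until, e.g., loss_value falls below a chosen threshold
```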
In order to improve the component classification efficiency, the component classification model training method provided by the specification can construct an initial component classification model capable of identifying and classifying components, then a component structure tree can be acquired firstly, component nodes in the component structure tree contain component attribute information, at the moment, the component structure tree can be reconstructed to obtain a component table for recording the component attribute information, at the moment, embedding processing is performed on the component attribute information recorded in the component table, and then component embedding vectors corresponding to the component structure tree can be obtained; further, the component embedded vector is input into an initial component classification model for processing, prediction component classification information corresponding to component nodes in the component structure tree can be obtained, and the initial component classification model is subjected to parameter adjustment based on reference component classification information and prediction component classification information corresponding to the component nodes in the component structure tree, so that one-time training can be completed, and the like until the component classification model meeting the training stop condition is obtained. The components are classified in a modeling mode, so that the component identification and classification efficiency can be effectively improved, and downstream service use is facilitated.
Corresponding to the above method embodiments, the present disclosure further provides an embodiment of a component classification method, and fig. 4 shows a flowchart of a component classification method according to an embodiment of the present disclosure. As shown in fig. 4, the method includes:
step S402, a target component structure tree corresponding to a target object is obtained;
step S404, inputting the target component structure tree into a component classification model in the method to process, and obtaining component type information corresponding to each target component node in the target component structure tree;
and step S406, classifying the target components contained in the target object according to the component type information, and executing component processing tasks according to classification results.
For details of the component classification method provided in this embodiment that are not described here, reference may be made to the same or corresponding descriptions in the foregoing embodiments, which are not repeated here.
Specifically, the target object specifically refers to a component design draft, and the component processing task specifically refers to a task of processing the target components included in the target object, including but not limited to a data storage task.
Based on the above, after the designer finishes the component design, the target object can be submitted to the server, and the server can firstly acquire the target component structure tree corresponding to the target object at this time; inputting the target component structure tree into the component classification model in the method to process, so as to obtain component type information corresponding to each target component node in the target component structure tree; and then classifying the target components contained in the target object according to the component type information, so that the component processing task can be executed according to the classification result.
For example, after a designer submits a design draft to the server, the server generates a component structure tree corresponding to the design draft, inputs the component structure tree to the component classification model for processing, and determines through this processing the component type information corresponding to each component node in the component structure tree; the component type information can be assigned to the component corresponding to each component node, and the components can then be stored according to their class information, thereby writing the components into the component library by type.
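A sketch of this inference-side flow, under the assumption of the illustrative model above, might look as follows; CLASS_NAMES and the helper save_to_component_library are hypothetical placeholders for the component-library write, not APIs defined in this specification.

```python
from collections import defaultdict
import torch

CLASS_NAMES = ["text box", "shape", "icon", "button"]          # illustrative label set

def classify_and_store(model, node_names, component_embeddings, save_to_component_library):
    with torch.no_grad():
        logits = model(component_embeddings)                   # (1, nodes, num_classes)
        predictions = logits.argmax(dim=-1).squeeze(0)          # predicted class per node
    grouped = defaultdict(list)
    for name, cls in zip(node_names, predictions.tolist()):
        grouped[CLASS_NAMES[cls]].append(name)                  # assign type info to each component
    for component_type, components in grouped.items():
        save_to_component_library(component_type, components)   # write to the library by type
    return grouped
```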
In conclusion, the trained component classification model is adopted to carry out component classification processing, so that the component classification efficiency and accuracy can be effectively improved, and more manpower resources are saved.
The component classification method provided in the present specification will be further described with reference to fig. 5 by taking an application of the component classification method in a component recognition scenario in an application program as an example. Fig. 5 shows a process flow chart of a component classifying method according to an embodiment of the present disclosure, which specifically includes the following steps:
step S502, an initial component structure tree corresponding to the sample object is obtained, and structure detection is carried out on the initial component structure tree according to a training strategy of an initial component classification model.
Step S504, under the condition that the structure detection result does not meet the preset structure detection condition, splitting the initial component structure tree, and determining the component structure tree according to the splitting result; wherein the component nodes in the component structure tree contain component attribute information.
Step S506, determining the node attribute type corresponding to each component node in the component structure tree according to the component attribute information contained in the component nodes in the component structure tree.
Step S508, selecting an updating strategy for each component node in the component structure tree according to the node attribute type, and updating the component attribute information contained in each component node by using the updating strategy.
Step S510, traversing the updated component attribute information contained in the component nodes in the component structure tree, and generating a component table recording the updated component attribute information according to the traversing result.
Specifically, an initial component table for recording the updated component attribute information is generated according to the traversing result; according to the model training strategy of the initial component classification model, columns for recording component node level information and component node sequence information are inserted into the initial component table; and the component table recording the updated component attribute information is generated according to the insertion result.
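A minimal sketch of this reconstruction, assuming a nested-dictionary representation of the structure tree, is given below; the field names ("order", "level", "text", "features") are illustrative stand-ins for the sequence, level and attribute columns described above.

```python
def tree_to_component_table(node, level=0, order=None, table=None):
    """Flatten a component structure tree into a row-per-node component table."""
    if table is None:
        table, order = [], [0]
    table.append({
        "order": order[0],                       # inserted column: component node sequence info
        "level": level,                          # inserted column: component node level info
        "text": node.get("text", ""),            # text attribute information
        "features": node.get("features", {}),    # other component attribute information
    })
    order[0] += 1
    for child in node.get("children", []):
        tree_to_component_table(child, level + 1, order, table)
    return table

# usage with a toy structure tree
tree = {"text": "root", "children": [{"text": "C1_1"}, {"text": "C1_2"}]}
component_table = tree_to_component_table(tree)
```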
Step S512, respectively performing embedding processing on the component attribute information corresponding to each row of tables in the component tables to obtain the sub-component embedding vectors corresponding to each row of tables.
Step S514, merging the sub-component embedded vectors corresponding to each row of tables to obtain the component embedded vector.
Specifically, the determining of the sub-component embedded vector corresponding to any target row table in the component table includes: determining target component attribute information corresponding to the target row table in the component table, and reading sequence identification information, hierarchy identification information, text information and characteristic information in the target component attribute information; respectively embedding the sequence identification information, the hierarchy identification information, the text information and the feature information to obtain a sequence identification vector, a hierarchy identification vector, a text vector and a feature vector; and splicing the sequence identification vector, the hierarchy identification vector, the text vector and the feature vector to obtain the sub-component embedded vector corresponding to the target line table.
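The per-row embedding and concatenation could be sketched as follows, assuming learned embedding tables for the sequence and hierarchy identifiers, a bag-of-tokens embedding for the text, and a linear projection for the feature information; all vocabulary sizes and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class RowEmbedder(nn.Module):
    def __init__(self, max_order=512, max_level=32, vocab_size=10000,
                 feat_dim=16, dim=32):
        super().__init__()
        self.order_emb = nn.Embedding(max_order, dim)      # sequence identification vector
        self.level_emb = nn.Embedding(max_level, dim)      # hierarchy identification vector
        self.text_emb = nn.EmbeddingBag(vocab_size, dim)   # text vector (bag of tokens)
        self.feat_proj = nn.Linear(feat_dim, dim)          # feature vector

    def forward(self, order_id, level_id, token_ids, features):
        parts = [
            self.order_emb(order_id),
            self.level_emb(level_id),
            self.text_emb(token_ids.unsqueeze(0)).squeeze(0),
            self.feat_proj(features),
        ]
        return torch.cat(parts, dim=-1)                    # sub-component embedding for this row

embedder = RowEmbedder()
row_vec = embedder(torch.tensor(0), torch.tensor(1),
                   torch.tensor([5, 17, 42]), torch.randn(16))
```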
In step S516, the component embedded vector is input to the initial component classification model, and the component embedded vector is encoded by an encoder in the initial component classification model to obtain a component encoded vector.
Specifically, an encoder in an initial component classification model is utilized to calculate an embedded coding vector and a weight matrix, so as to obtain a query vector, a key vector and a value vector; and (3) carrying out similarity score calculation according to the query vector, the key vector and the value vector, and determining a component coding vector containing the hierarchical dependency relationship and the node dependency relationship according to a calculation result.
In step S518, a hidden state vector is determined based on the component encoding vector, and the hidden state vector is used as a component classification vector.
Step S520, decoding the component classification vector by a decoder in the initial component classification model to obtain the predicted component classification information corresponding to the component nodes in the component structure tree.
Step S522, obtaining reference component classification information corresponding to the component nodes in the component structure tree.
And step S524, calculating the reference component classification information and the prediction component classification information according to the cross entropy loss function to obtain a target loss value.
And S526, performing parameter adjustment on the initial component classification model by using the target loss value until the component classification model meeting the training stop condition is obtained.
In step S528, the target component structure tree corresponding to the target object is obtained.
Step S530, inputting the target component structure tree into the component classification model for processing, and obtaining component type information corresponding to each target component node in the target component structure tree.
Step S532, classifying the target components contained in the target object according to the component type information, and executing the component processing task according to the classification result.
In order to improve the component classification efficiency, the component classification model training method provided by the specification can construct an initial component classification model capable of identifying and classifying components, then a component structure tree can be acquired firstly, component nodes in the component structure tree contain component attribute information, at the moment, the component structure tree can be reconstructed to obtain a component table for recording the component attribute information, at the moment, embedding processing is performed on the component attribute information recorded in the component table, and then component embedding vectors corresponding to the component structure tree can be obtained; further, the component embedded vector is input into an initial component classification model for processing, prediction component classification information corresponding to component nodes in the component structure tree can be obtained, and the initial component classification model is subjected to parameter adjustment based on reference component classification information and prediction component classification information corresponding to the component nodes in the component structure tree, so that one-time training can be completed, and the like until the component classification model meeting the training stop condition is obtained. The components are classified in a modeling mode, so that the component identification and classification efficiency can be effectively improved, and downstream service use is facilitated.
Corresponding to the method embodiment, the present disclosure further provides an embodiment of a training device for component classification model, and fig. 6 shows a schematic structural diagram of the training device for component classification model according to an embodiment of the present disclosure. As shown in fig. 6, the apparatus includes:
an obtaining module 602 configured to obtain a component structure tree, wherein component nodes in the component structure tree contain component attribute information;
a reconstruction module 604, configured to reconstruct the component structure tree to obtain a component table for recording component attribute information, and perform embedding processing on the component attribute information recorded in the component table to obtain a component embedding vector;
the processing module 606 is configured to input the component embedding vector into an initial component classification model for processing, so as to obtain predicted component classification information corresponding to component nodes in the component structure tree;
and a parameter tuning module 608, configured to tune the initial component classification model based on the reference component classification information and the predicted component classification information corresponding to the component nodes in the component structure tree until a component classification model satisfying a training stop condition is obtained.
In an alternative embodiment, the obtaining module 602 is further configured to:
acquiring an initial component structure tree corresponding to a sample object; performing structure detection on the initial component structure tree according to a training strategy of the initial component classification model; taking the initial component structure tree as the component structure tree under the condition that the structure detection result meets the preset structure detection condition; and under the condition that the structure detection result does not meet the preset structure detection condition, segmenting the initial component structure tree, and determining the component structure tree according to the segmentation result.
In an alternative embodiment, the reconstruction module 604 is further configured to:
determining node attribute types corresponding to all the component nodes in the component structure tree according to the component attribute information contained in the component nodes in the component structure tree; selecting an updating strategy for each component node in the component structure tree according to the node attribute type, and updating component attribute information contained in each component node by utilizing the updating strategy; traversing the updated component attribute information contained in the component nodes in the component structure tree, and generating a component table for recording the updated component attribute information according to the traversing result.
In an alternative embodiment, the reconstruction module 604 is further configured to:
generating an initial component table for recording updated component attribute information according to the traversing result; according to the model training strategy of the initial component classification model, inserting columns for recording component node level information and component node sequence information into the initial component table; and generating a component table for recording the updated component attribute information according to the insertion result.
In an alternative embodiment, the reconstruction module 604 is further configured to:
respectively carrying out embedding processing on component attribute information corresponding to each row of tables in the component tables to obtain sub-component embedding vectors corresponding to each row of tables; and merging the sub-component embedded vectors corresponding to each row of tables to obtain the component embedded vector.
In an optional embodiment, the determining the sub-component embedded vector corresponding to any target row table in the component table includes:
determining target component attribute information corresponding to a target row table in the component table, and reading sequence identification information, hierarchy identification information, text information and characteristic information from the target component attribute information; respectively embedding the sequence identification information, the hierarchy identification information, the text information and the characteristic information to obtain a sequence identification vector, a hierarchy identification vector, a text vector and a characteristic vector; and splicing the sequence identification vector, the hierarchy identification vector, the text vector and the feature vector to obtain a sub-component embedded vector corresponding to the target line table.
In an alternative embodiment, the processing module 606 is further configured to:
inputting the component embedded vector into the initial component classification model, and carrying out coding processing on the component embedded vector through an encoder in the initial component classification model to obtain a component coding vector; determining a hidden state vector based on the component encoding vector, and taking the hidden state vector as a component classification vector; and decoding the component classification vector through a decoder in the initial component classification model to obtain prediction component classification information corresponding to the component nodes in the component structure tree.
In an alternative embodiment, the processing module 606 is further configured to:
calculating the embedded coding vector and the weight matrix by using an encoder in the initial component classification model to obtain a query vector, a key vector and a value vector; and carrying out similarity score calculation according to the query vector, the key vector and the value vector, and determining a component coding vector containing the hierarchical dependency relationship and the node dependency relationship according to a calculation result.
In an alternative embodiment, the parameter tuning module 608 is further configured to:
acquiring reference component classification information corresponding to component nodes in the component structure tree; calculating the reference component classification information and the prediction component classification information according to a cross entropy loss function to obtain a target loss value; and adjusting parameters of the initial component classification model by utilizing the target loss value until the component classification model meeting the training stopping condition is obtained.
In order to improve the component classification efficiency, the component classification model training device provided by the specification can construct an initial component classification model capable of identifying and classifying components, then a component structure tree can be acquired firstly, component nodes in the component structure tree contain component attribute information, at the moment, the component structure tree can be reconstructed to obtain a component table for recording the component attribute information, at the moment, embedding processing is performed on the component attribute information recorded in the component table, and then component embedding vectors corresponding to the component structure tree can be obtained; further, the component embedded vector is input into an initial component classification model for processing, prediction component classification information corresponding to component nodes in the component structure tree can be obtained, and the initial component classification model is subjected to parameter adjustment based on reference component classification information and prediction component classification information corresponding to the component nodes in the component structure tree, so that one-time training can be completed, and the like until the component classification model meeting the training stop condition is obtained. The components are classified in a modeling mode, so that the component identification and classification efficiency can be effectively improved, and downstream service use is facilitated.
The above is a schematic scheme of a component classification model training device of the present embodiment. It should be noted that, the technical solution of the component classification model training device and the technical solution of the component classification model training method belong to the same concept, and details of the technical solution of the component classification model training device which are not described in detail can be referred to the description of the technical solution of the component classification model training method.
Corresponding to the method embodiment, the present disclosure further provides an embodiment of a component classification device, and fig. 7 shows a schematic structural diagram of the component classification device according to an embodiment of the present disclosure. As shown in fig. 7, the apparatus includes:
an acquisition structural tree module 702 configured to acquire a target component structural tree corresponding to the target object;
an input model module 704, configured to input the target component structure tree into the component classification model in the above method to process, and obtain component type information corresponding to each target component node in the target component structure tree;
the classifying component module 706 is configured to classify the target component included in the target object according to the component type information, and execute a component processing task according to the classification result.
The above is a schematic version of a component sorting apparatus of the present embodiment. It should be noted that, the technical solution of the component classifying device and the technical solution of the component classifying method belong to the same concept, and details of the technical solution of the component classifying device which are not described in detail can be referred to the description of the technical solution of the component classifying method.
Fig. 8 illustrates a block diagram of a computing device 800 provided in accordance with an embodiment of the present specification. The components of computing device 800 include, but are not limited to, memory 810 and processor 820. Processor 820 is coupled to memory 810 through bus 830 and database 850 is used to hold data.
Computing device 800 also includes an access device 840, which enables computing device 800 to communicate via one or more networks 860. Examples of such networks include the public switched telephone network (PSTN, Public Switched Telephone Network), a local area network (LAN, Local Area Network), a wide area network (WAN, Wide Area Network), a personal area network (PAN, Personal Area Network), or a combination of communication networks such as the Internet. Access device 840 may include one or more of any type of network interface, wired or wireless, such as a network interface card (NIC, Network Interface Controller), an IEEE 802.11 wireless local area network (WLAN, Wireless Local Area Network) wireless interface, a worldwide interoperability for microwave access (Wi-MAX, Worldwide Interoperability for Microwave Access) interface, an Ethernet interface, a universal serial bus (USB, Universal Serial Bus) interface, a cellular network interface, a Bluetooth interface, a near field communication (NFC, Near Field Communication) interface, and so forth.
In one embodiment of the present application, the above-described components of computing device 800, as well as other components not shown in FIG. 8, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 8 is for exemplary purposes only and is not intended to limit the scope of the present application. Those skilled in the art may add or replace other components as desired.
Computing device 800 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or personal computer (PC, personal Computer). Computing device 800 may also be a mobile or stationary server.
Wherein processor 820 is configured to implement the steps of a component classification model training method or a component classification method when executing computer-executable instructions.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the component classification model training method or the component classification method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the component classification model training method or the component classification method.
An embodiment of the present specification also provides a computer-readable storage medium storing computer instructions that, when executed by a processor, perform the steps of a component classification model training method or a component classification method.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the component classification model training method or the component classification method belong to the same concept, and details of the technical solution of the storage medium which are not described in detail can be referred to the description of the technical solution of the component classification model training method or the component classification method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code that may be in source code form, object code form, executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium can be increased or decreased appropriately according to the requirements of the patent practice, for example, in some areas, according to the patent practice, the computer readable medium does not include an electric carrier signal and a telecommunication signal.
It should be noted that, for the sake of simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the present description is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present description. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all necessary in the specification.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the disclosure and the practical application, to thereby enable others skilled in the art to best understand and utilize the disclosure. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims (14)

1. A method for training a component classification model, comprising:
acquiring a component structure tree, wherein component nodes in the component structure tree contain component attribute information;
reconstructing the component structure tree to obtain a component table for recording component attribute information, and performing embedding processing on the component attribute information recorded in the component table to obtain a component embedding vector;
inputting the component embedded vector to an initial component classification model for processing to obtain predicted component classification information corresponding to component nodes in the component structure tree;
and adjusting parameters of the initial component classification model based on the reference component classification information and the prediction component classification information corresponding to the component nodes in the component structure tree until the component classification model meeting the training stop condition is obtained.
2. The method of claim 1, wherein the acquiring a component structure tree comprises:
acquiring an initial component structure tree corresponding to a sample object;
performing structure detection on the initial component structure tree according to a training strategy of the initial component classification model;
taking the initial component structure tree as the component structure tree under the condition that the structure detection result meets the preset structure detection condition;
and under the condition that the structure detection result does not meet the preset structure detection condition, segmenting the initial component structure tree, and determining the component structure tree according to the segmentation result.
3. The method according to claim 1, wherein reconstructing the component structure tree to obtain a component table recording component attribute information comprises:
determining node attribute types corresponding to all the component nodes in the component structure tree according to the component attribute information contained in the component nodes in the component structure tree;
selecting an updating strategy for each component node in the component structure tree according to the node attribute type, and updating component attribute information contained in each component node by utilizing the updating strategy;
traversing the updated component attribute information contained in the component nodes in the component structure tree, and generating a component table for recording the updated component attribute information according to the traversing result.
4. A method according to claim 3, wherein generating a component table recording updated component attribute information according to the traversal result, comprises:
generating an initial component table for recording updated component attribute information according to the traversing result;
according to the model training strategy of the initial component classification model, inserting columns for recording component node level information and component node sequence information into the initial component table;
and generating a component table for recording the updated component attribute information according to the insertion result.
5. The method according to claim 1, wherein the embedding process of the component attribute information recorded in the component table to obtain a component embedding vector includes:
respectively carrying out embedding processing on component attribute information corresponding to each row of tables in the component tables to obtain sub-component embedding vectors corresponding to each row of tables;
and merging the sub-component embedded vectors corresponding to each row of tables to obtain the component embedded vector.
6. The method of claim 5, wherein determining the sub-component embedding vector corresponding to any target row table in the component table comprises:
determining target component attribute information corresponding to a target row table in the component table, and reading sequence identification information, hierarchy identification information, text information and characteristic information from the target component attribute information;
respectively embedding the sequence identification information, the hierarchy identification information, the text information and the characteristic information to obtain a sequence identification vector, a hierarchy identification vector, a text vector and a characteristic vector;
and splicing the sequence identification vector, the hierarchy identification vector, the text vector and the feature vector to obtain a sub-component embedded vector corresponding to the target line table.
7. The method according to claim 1, wherein the inputting the component embedding vector into an initial component classification model for processing to obtain predicted component classification information corresponding to component nodes in the component structure tree includes:
inputting the component embedded vector into the initial component classification model, and carrying out coding processing on the component embedded vector through an encoder in the initial component classification model to obtain a component coding vector;
determining a hidden state vector based on the component encoding vector, and taking the hidden state vector as a component classification vector;
and decoding the component classification vector through a decoder in the initial component classification model to obtain prediction component classification information corresponding to the component nodes in the component structure tree.
8. The method of claim 7, wherein the encoding the component embedded vector by an encoder in the initial component classification model to obtain a component encoded vector comprises:
calculating the embedded coding vector and the weight matrix by using an encoder in the initial component classification model to obtain a query vector, a key vector and a value vector;
and carrying out similarity score calculation according to the query vector, the key vector and the value vector, and determining a component coding vector containing the hierarchical dependency relationship and the node dependency relationship according to a calculation result.
9. The method according to any one of claims 1-8, wherein said adjusting parameters of the initial component classification model based on the reference component classification information and the predicted component classification information corresponding to the component nodes in the component structure tree until a component classification model satisfying a training stop condition is obtained comprises:
acquiring reference component classification information corresponding to component nodes in the component structure tree;
calculating the reference component classification information and the prediction component classification information according to a cross entropy loss function to obtain a target loss value;
and adjusting parameters of the initial component classification model by utilizing the target loss value until the component classification model meeting the training stopping condition is obtained.
10. A method of classifying components, comprising:
acquiring a target component structure tree corresponding to a target object;
inputting the target component structure tree into the component classification model in the method of any one of claims 1-9 for processing to obtain component type information corresponding to each target component node in the target component structure tree;
and classifying the target components contained in the target object according to the component type information, and executing component processing tasks according to classification results.
11. A component classification model training apparatus, comprising:
the device comprises an acquisition module, a storage module and a storage module, wherein the acquisition module is configured to acquire a component structure tree, and component nodes in the component structure tree contain component attribute information;
the reconstruction module is configured to reconstruct the component structure tree to obtain a component table for recording component attribute information, and perform embedding processing on the component attribute information recorded in the component table to obtain a component embedding vector;
the processing module is configured to input the component embedding vector into an initial component classification model for processing to obtain prediction component classification information corresponding to component nodes in the component structure tree;
and the parameter adjusting module is configured to adjust parameters of the initial component classification model based on the reference component classification information and the prediction component classification information corresponding to the component nodes in the component structure tree until the component classification model meeting the training stop condition is obtained.
12. A component classification apparatus, comprising:
the acquisition structure tree module is configured to acquire a target component structure tree corresponding to the target object;
an input model module configured to input the target component structure tree into the component classification model in the method of any one of claims 1-9 for processing, so as to obtain component type information corresponding to each target component node in the target component structure tree;
And the classifying component module is configured to classify the target components contained in the target object according to the component type information and execute component processing tasks according to classification results.
13. A computing device comprising a memory and a processor; the memory is configured to store computer executable instructions and the processor is configured to execute the computer executable instructions to implement the steps of the method of any one of claims 1 to 10.
14. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 10.