CN114896400A - Graph neural network text classification method based on regular constraint - Google Patents
Graph neural network text classification method based on regular constraint Download PDFInfo
- Publication number
- CN114896400A (application CN202210532864.2A)
- Authority
- CN
- China
- Prior art keywords
- edge
- attention
- text
- neural network
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention relates to a graph neural network text classification method based on regularization constraints, which belongs to the field of natural language processing and comprises the following steps. Graph construction: build the text graph with the TextING construction method, add semantic edges and syntactic edges, define the different edge types, initialize the edge-type features Ec, and input them into the graph neural network for training. Word interaction based on the graph neural network: a GAT with diverse regularization constraints assigns different attention weights to neighborhood nodes to filter edge-noise information and guides the attention score distributions to reduce overlap. Text representation: aggregate the word-node features into a document representation through max pooling and average pooling, obtain the classification result of the text from the document representation, and define a loss function to constrain the node-feature update process. The invention enriches the syntactic and semantic relevance among words, improves long-distance and non-consecutive word interaction, and strengthens the expressive power of the model.
Description
Technical Field
The invention belongs to the field of natural language processing and relates to a graph neural network text classification method based on regularization constraints.
Background
Text classification is the fundamental technical support for most natural language processing tasks. Against the background of information explosion, the workload of manually managing and classifying textual resources is enormous; text classification with deep learning enables efficient, rapid management of massive text information and improves information retrieval efficiency.
The key to text classification is mining contextual information to obtain an accurate semantic representation. Neural networks represented by TextCNN and TextRNN, although able to mine text semantics quickly and efficiently, lack long-distance and non-consecutive word interactions. Recently, graph neural networks have been proposed to solve this problem: Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs) follow the neighborhood-aggregation paradigm, can model the sequential and syntactic structure of text, and flexibly capture the relationships among words, sentences, and documents. For example, TextGCN builds a corpus-level text graph and uses a GCN to cast text classification as a semi-supervised node classification task; on this basis, Text-Level GNN introduces a message-passing mechanism to reduce TextGCN's memory consumption. However, these transductive-learning methods are computationally inefficient. TextING and HyperGAT instead construct a separate text graph for each document and use a GNN to capture higher-order contextual information of words, so both support effective inductive learning. Later, DADGNN enlarged the receptive field of nodes through a diffusion mechanism and decoupled the GNN propagation process.
The existing text classification methods have the following defects. (1) The edge type is single: words update their semantic representations only from adjacency neighbors, lacking syntactic and semantic information about the text. Moreover, although different edge types carry rich information, most current models do not exploit them, and the missing edge information can strongly affect the overall tendency of a text. (2) The interference of noise from nodes and edges in the graph structure is ignored; in addition, as the number of graph iterations increases, noise information multiplies and classification performance drops sharply.
Disclosure of Invention
In view of the above, the present invention provides a graph neural network text classification method based on regularization constraints, which addresses the insufficient classification performance caused by single edge types and noise interference in graph-based text classification models.
In order to achieve the purpose, the invention provides the following technical scheme:
A graph neural network text classification method based on regularization constraints includes the following steps:
Graph construction: build the text graph with the TextING construction method, add semantic edges and syntactic edges, define the different edge types, initialize the edge-type features Ec, and input them into the graph neural network for training;
Word interaction based on the graph neural network: a GAT with diverse regularization constraints assigns different attention weights to neighborhood nodes to filter edge-noise information and guides the attention score distributions to reduce overlap;
Text representation: aggregate the word-node features into a document representation through max pooling and average pooling, obtain the classification result of the text from the document representation, and define a loss function to constrain the node-feature update process.
Further, the graph construction specifically includes:
S11: add semantic edges on top of the text graph G = (V, E) built from adjacency alone, to capture higher-order correlations between words and topic words. First, a topic model (LDA) mines latent topics T from the text; each topic T_i = (θ_1, …, θ_v) is represented by a probability distribution over words, where v is the vocabulary size. The top-N highest-probability words in a text sample are connected to the corresponding topic T_i, yielding topic-related edges;
S12: model the syntactic relations among words in the text sequence with SpaCy; if a syntactic relation exists between two words, a syntactic edge is built between them;
S13: define the different edge types: adjacency, semantic, adjacency-semantic, syntactic, adjacency-syntactic, semantic-syntactic, and adjacency-semantic-syntactic, numbered edge 1 through edge 7 respectively, and initialize them as seven distinct edge-type features Ec.
Further, the word interaction based on the graph neural network specifically includes: assigning different attention weights to neighborhood nodes via multi-head attention with diverse regularization terms to filter noise information. For each input text, the node features are h = {h_1, h_2, …, h_|V|}, h_i ∈ R^d, where d is the feature dimension of each node; a shared linear transformation W_1 ∈ R^{d×d′} and an attention mechanism are applied to each node, and the attention coefficient e_ij is obtained by the following formula:
where the weight vector a ∈ R^{2d′}, T denotes transposition, and ‖ denotes vector concatenation;
the coefficients are normalized with a Softmax function, giving the attention score α_ij:
the attention scores are linearly combined with the node features to give the final output feature of each node;
a single attention head is expanded into multi-head attention, and the outputs of K attention heads are concatenated as the multi-head attention output:
after merging the edge types, the multi-head attention formula is updated as follows:
diverse regularization terms are applied across the attention heads to encourage the attention score distributions to reduce overlap, so that the heads capture more diverse information, where the diverse regularization term is:
where ‖·‖_2 denotes the L2 norm.
Further, the text representation specifically includes:
After the word-node information is fully updated, average pooling and max pooling are applied to the node features, which are aggregated into the text representation to produce the final prediction:
The pooled features are passed into a Softmax layer to predict the text label:
Finally, the node-feature update process is constrained by minimizing the objective loss function:
where λ is the regularization coefficient.
The beneficial effects of the invention are as follows. In the text-graph construction stage, syntactic and semantic edges are added on top of the adjacency matrix, enriching the syntactic and semantic relevance among words and improving long-distance and non-consecutive word interaction. In addition, the different edge types are modeled and encoded as distinct features that are fed into the GAT when computing inter-node attention coefficients, improving the model's expressive power. After the text graph is constructed, a GAT performs word interaction; since a plain GAT lacks effective control over its different attention heads, a GAT with diverse regularization terms is introduced to assign different attention weights to neighborhood nodes to filter noise information, encouraging the attention score distributions to reduce overlap so that the heads capture more diverse information.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a diagram of the graph neural network text classification method based on regularization constraints;
FIG. 2 shows an example syntactic parsing result.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are provided to illustrate the invention only and are not intended to limit it; to better illustrate the embodiments, some parts of the drawings may be omitted, enlarged, or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.
Referring to fig. 1, the present invention provides a graph neural network text classification method based on regular constraint, which includes three parts:
(1) Graph construction: the invention adopts the TextING construction method, using a sliding window to build a text graph G = (V, E) for each individual text, where V is the vertex set representing the words in the text and E is the edge set representing the adjacency relations between words.
Specifically, the invention adds semantic edges on top of the text graph G = (V, E) built from adjacency alone, to capture higher-order correlations between words and topic words. First, a topic model (LDA) mines latent topics T from the text; each topic T_i = (θ_1, …, θ_v) is represented by a probability distribution over words, where v is the vocabulary size, and the top-N highest-probability words in a text sample are connected to the corresponding topic T_i. These topic-related edges enrich the higher-order contextual semantics of the words in each text.
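As a concrete sketch of this step (illustrative only: the patent gives no code, the input format mimics what an LDA implementation such as gensim would yield, and linking a topic's top words pairwise is one interpretation of "connecting the top-N words to the corresponding topic T_i"):

```python
def topic_edges(topic_word_probs, doc_words, top_n=3):
    """Build semantic (topic-related) edges for one document.

    topic_word_probs: {topic_id: {word: probability}} -- assumed format,
        e.g. from an LDA model's per-topic word distribution theta.
    doc_words: the words of one text sample.
    Returns a set of undirected (word, word) semantic edges linking the
    top-N topic words that occur in this document.
    """
    edges = set()
    present = set(doc_words)
    for t_id, dist in topic_word_probs.items():
        # top-N highest-probability words of topic t_id, kept only if present
        top = [w for w, _ in sorted(dist.items(), key=lambda kv: -kv[1])][:top_n]
        top = [w for w in top if w in present]
        # connect the topic's top words pairwise via their shared topic
        for i in range(len(top)):
            for j in range(i + 1, len(top)):
                edges.add(tuple(sorted((top[i], top[j]))))
    return edges

# toy topic distribution echoing the "animal" example in the description
topics = {0: {"animal": 0.4, "tiger": 0.3, "state": 0.1, "name": 0.05}}
doc = ["animal", "name", "tiger", "state"]
sem = topic_edges(topics, doc, top_n=3)
```

In the toy run, "animal", "tiger", and "state" are the topic's top-3 words, so the three pairwise semantic edges among them are created.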
In addition, the invention uses SpaCy to model the syntactic relations among words in the text sequence, including subject-verb (SBV), verb-object (VOB), and indirect-object (IOB) relations; if a syntactic relation exists between two words, a syntactic edge is built between them. The syntactic structure effectively reveals the syntactic dependencies among words in the text, and enriching these dependencies improves long-distance word interaction.
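A minimal sketch of the syntactic-edge step (spaCy itself is not invoked here; the hand-written dependency triples stand in for what a dependency parser such as spaCy's would produce):

```python
def syntax_edges(dep_triples):
    """Build syntactic edges from dependency-parse output.

    dep_triples: (head_word, relation, dependent_word) tuples, as a
    dependency parser would yield; any syntactic relation between two
    words creates one undirected edge between them.
    """
    return {tuple(sorted((h, d))) for h, _rel, d in dep_triples}

# hypothetical parse fragment (illustrative triples, not real spaCy output)
triples = [("know", "nsubj", "filmmakers"),
           ("filmmakers", "det", "the"),
           ("know", "xcomp", "please")]
edges = syntax_edges(triples)
```

Each relation type could also be kept on the edge if downstream modeling needs it; here only the existence of a syntactic link matters, matching step S12.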
Thus there are seven edge types: adjacency, semantic, adjacency-semantic, syntactic, adjacency-syntactic, semantic-syntactic, and adjacency-semantic-syntactic, defined as edge 1 through edge 7 respectively and initialized as seven distinct edge-type features Ec; Ec is trained during the network update process.
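The seven edge types follow the binary combinations of the three base relations, so the type index can be computed as a 3-bit flag. A sketch (the feature dimension and random initialization of Ec are illustrative stand-ins for trainable embeddings):

```python
import random

def edge_type(adjacent, semantic, syntactic):
    """Map the three base relations to the patent's edge types 1-7:
    adjacency=1, semantic=2, adjacency-semantic=3, syntactic=4,
    adjacency-syntactic=5, semantic-syntactic=6, all three=7."""
    t = (1 if adjacent else 0) | (2 if semantic else 0) | (4 if syntactic else 0)
    assert t > 0, "an edge must carry at least one relation"
    return t

# initialize seven edge-type feature vectors Ec (dimension 8 assumed);
# in the actual model these would be trainable parameters
random.seed(0)
Ec = {t: [random.uniform(-0.1, 0.1) for _ in range(8)] for t in range(1, 8)}
```

The bit encoding reproduces the numbering in S13 exactly, e.g. an edge that is both adjacency and syntactic gets type 1 | 4 = 5.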
For example, for the sentence "animal name given tiger louisiana state unity?", the word adjacency relations are shown in Table 1:
TABLE 1
LDA is used for topic-word extraction; "animal" is one of the topic words of the corpus, so semantic edges are established between "animal" and all words in the example. The example is parsed with SpaCy, and the results are shown in FIG. 2.
Therefore, the relationships between words after adding the syntactic and semantic edges on top of the adjacency relations are shown in Table 2:
TABLE 2
(2) GNN-based word interaction. Noise from edges and nodes in the text-graph structure degrades the classification performance of the model. For example, in the sentence "The filmmakers know to please the eye, but it is not always able to ...", TextING constructs an "eye"-"but" adjacency edge, so "eye" updates its feature from its neighbor "but"; yet the correlation between the central node and this neighbor is low, and further aggregation over such noisy edges can damage model performance. The invention therefore uses GAT to assign different attention weights to neighborhood nodes to filter edge-noise information. In practice, however, the features extracted by multiple attention heads turn out to be quite similar. To constrain different heads to capture the features of different representation subspaces, the invention uses a GAT with diverse regularization constraints to guide the attention score distributions to reduce overlap, which both filters noise effectively and improves the model's ability to learn contextual semantic representations of word nodes.
Specifically, the invention filters noise information by using multi-head attention with diverse regularization terms to assign different attention weights to neighborhood nodes. For each input text, the node features are h = {h_1, h_2, …, h_|V|}, h_i ∈ R^d, where d is the feature dimension of each node; a shared linear transformation W_1 ∈ R^{d×d′} and an attention mechanism are applied to each node, and the attention coefficient e_ij is obtained by equation (1):
In equation (1), the weight vector a ∈ R^{2d′}, T denotes transposition, and ‖ denotes vector concatenation.
To make the coefficients comparable across different nodes, they are normalized with the Softmax function, so the attention score α_ij is obtained from equation (2):
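Equations (1) and (2) are not reproduced in this text (they appear as images in the original patent). Assuming the patent follows the standard GAT formulation, a plausible reconstruction consistent with the symbols defined above (W_1, a, transposition, concatenation, Softmax) is:

```latex
e_{ij} = \mathrm{LeakyReLU}\!\left( a^{T} \left[ W_1 h_i \,\|\, W_1 h_j \right] \right) \tag{1}

\alpha_{ij} = \operatorname{softmax}_j\!\left(e_{ij}\right)
            = \frac{\exp\!\left(e_{ij}\right)}{\sum_{k \in \mathcal{N}_i} \exp\!\left(e_{ik}\right)} \tag{2}
```

where N_i denotes the neighborhood of node i.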
the attention score is then linearly combined with the node features as the final output feature for each node. In order to learn richer features and stabilize the learning process of attention, a single attention head is expanded to multi-head attention, and the outputs of K attention heads are spliced together to be used as the output of multi-head attention:
after merging the edge types, the multi-head attention formula (3) is updated to formula (4):
although GAT assigns trainable, fine-grained weights to each node neighbor that can filter noise information to some extent, current research shows: the ability to capture different features simply using multiple heads of attention is difficult to guarantee. In specific practice it will be found that: the features extracted by a plurality of attention heads are relatively consistent. In order to restrict different attention heads from capturing the characteristics of different characterization subspace information, the invention uses various regular terms among the attention heads to encourage the attention score distribution to reduce the overlapping, so that the attention heads capture more different information. The multiple regularization term is shown in equation (5):
in the formula (5), | · non-woven phosphor 2 Representing the L2 norm.
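Equation (5) itself is absent from this text. A common head-diversity penalty consistent with the description and the stated L2-norm notation — again an assumption, not the patent's exact formula — is an orthogonality regularizer over the matrix A that stacks the K per-head attention score vectors:

```latex
\Omega = \left\| A A^{T} - I \right\|_{2}, \qquad
A = \begin{bmatrix} \alpha^{(1)} \\ \vdots \\ \alpha^{(K)} \end{bmatrix} \tag{5}
```

Minimizing Ω pushes the score distributions of different heads toward orthogonality, i.e. toward less overlap.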
(3) Text representation. The word-node features are aggregated into a document representation through max pooling and average pooling; the classification result of the text is obtained from this representation, and a new loss function is defined to constrain the node-feature update process, fully accounting for the diverse regularization constraints on the GAT.
After the word-node information is fully updated, average pooling and max pooling are applied to the node features, which are aggregated into the text representation to produce the final prediction:
The pooled features are passed into a Softmax layer to predict the text label:
finally, the update process of the node features is constrained by minimizing the objective loss function, as shown in equation (8):
in equation (8), λ is a regular term coefficient.
The datasets of this embodiment are: 1) the sentiment classification datasets MR, SST1, and SST2; 2) the Reuters news classification datasets R8 and R52; 3) the TREC question classification dataset. The statistics of the datasets are shown in Table 3. The training set is randomly split 9:1 into training data and validation data for the experiments.
TABLE 3
The evaluation metric adopted in this embodiment is accuracy, computed as in equation (10):
Accuracy = (T_p + T_n) / (T_p + T_n + F_p + F_n)
In equation (10), T_p, F_p, T_n, and F_n denote the numbers of true positives, false positives, true negatives, and false negatives, respectively.
Experimental parameter settings: word vectors are initialized with 300-dimensional GloVe embeddings. In preprocessing, stop words and punctuation in the text are filtered out, and the 10% of words with the lowest TF-IDF values are removed to eliminate the noise influence of common words. Specific parameter settings are shown in Table 4.
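A sketch of the TF-IDF filtering step (illustrative, not the patent's exact code: the smoothed IDF formula and corpus-level TF scoring are assumptions):

```python
import math
from collections import Counter

def tfidf_filter(docs, drop_ratio=0.10):
    """Compute a corpus-level TF-IDF score per vocabulary word and drop
    the lowest `drop_ratio` fraction, removing noisy common words."""
    n_docs = len(docs)
    tf = Counter()  # total term frequency across the corpus
    df = Counter()  # document frequency
    for doc in docs:
        tf.update(doc)
        df.update(set(doc))
    scores = {w: tf[w] * math.log((1 + n_docs) / (1 + df[w])) for w in tf}
    ranked = sorted(scores, key=scores.get)  # lowest TF-IDF first
    dropped = set(ranked[:int(len(ranked) * drop_ratio)])
    return [[w for w in doc if w not in dropped] for doc in docs]

docs = [["the", "cat", "sat"], ["the", "dog", "ran"],
        ["the", "cat", "ran"], ["a", "bird", "flew"]]
clean = tfidf_filter(docs, drop_ratio=0.25)
```

On the toy corpus, the ubiquitous word "the" gets the lowest TF-IDF score and is removed from every document.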
TABLE 4
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.
Claims (4)
1. A graph neural network text classification method based on regularization constraints, characterized by comprising the following steps:
graph construction: build the text graph with the TextING construction method, add semantic edges and syntactic edges, define the different edge types, initialize the edge-type features Ec, and input them into the graph neural network for training;
word interaction based on the graph neural network: a GAT with diverse regularization constraints assigns different attention weights to neighborhood nodes to filter edge-noise information and guides the attention score distributions to reduce overlap;
text representation: aggregate the word-node features into a document representation through max pooling and average pooling, obtain the classification result of the text from the document representation, and define a loss function to constrain the node-feature update process.
2. The graph neural network text classification method based on regularization constraints according to claim 1, characterized in that the graph construction specifically includes:
S11: add semantic edges on top of the text graph G = (V, E) built from adjacency alone, to capture higher-order correlations between words and topic words; first, a topic model (LDA) mines latent topics T from the text, and each topic T_i = (θ_1, …, θ_v) is represented by a probability distribution over words, where v is the vocabulary size; the top-N highest-probability words in the text sample are connected to the corresponding topic T_i, yielding topic-related edges;
S12: model the syntactic relations among words in the text sequence with SpaCy; if a syntactic relation exists between two words, build a syntactic edge between them;
S13: define the different edge types: adjacency, semantic, adjacency-semantic, syntactic, adjacency-syntactic, semantic-syntactic, and adjacency-semantic-syntactic, numbered edge 1 through edge 7 respectively, and initialize them as seven distinct edge-type features Ec.
3. The graph neural network text classification method based on regularization constraints according to claim 1, characterized in that the word interaction based on the graph neural network specifically includes: assigning different attention weights to neighborhood nodes via multi-head attention with diverse regularization terms to filter noise information; for each input text, the node features are h = {h_1, h_2, …, h_|V|}, h_i ∈ R^d, where d is the feature dimension of each node; a shared linear transformation W_1 ∈ R^{d×d′} and an attention mechanism are applied to each node, and the attention coefficient e_ij is obtained by the following formula:
where the weight vector a ∈ R^{2d′}, T denotes transposition, and ‖ denotes vector concatenation;
the coefficients are normalized with a Softmax function, giving the attention score α_ij:
the attention scores are linearly combined with the node features as each node's final output feature;
the single attention head is expanded into multi-head attention, and the outputs of K attention heads are concatenated as the multi-head attention output:
after merging the edge types, the multi-head attention formula is updated as follows:
diverse regularization terms are applied across the attention heads to encourage the attention score distributions to reduce overlap, so that the heads capture more diverse information, where the diverse regularization term is:
where ‖·‖_2 denotes the L2 norm.
4. The graph neural network text classification method based on regularization constraints according to claim 1, characterized in that the text representation specifically includes:
after the word-node information is fully updated, average pooling and max pooling are applied to the node features, which are aggregated into the text representation to produce the final prediction:
the pooled features are passed into a Softmax layer to predict the text label:
finally, the node-feature update process is constrained by minimizing the objective loss function:
where λ is the regularization coefficient.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210532864.2A CN114896400B (en) | 2022-05-11 | 2022-05-11 | Graph neural network text classification method based on regular constraint |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114896400A true CN114896400A (en) | 2022-08-12 |
CN114896400B CN114896400B (en) | 2024-06-21 |
Family
ID=82723321
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210532864.2A Active CN114896400B (en) | 2022-05-11 | 2022-05-11 | Graph neural network text classification method based on regular constraint |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114896400B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA3061717A1 (en) * | 2018-11-16 | 2020-05-16 | Royal Bank Of Canada | System and method for a convolutional neural network for multi-label classification with partial annotations |
CN112269874A (en) * | 2020-10-10 | 2021-01-26 | 北京物资学院 | Text classification method and system |
CN112667818A (en) * | 2021-01-04 | 2021-04-16 | 福州大学 | GCN and multi-granularity attention fused user comment sentiment analysis method and system |
CN112711953A (en) * | 2021-01-19 | 2021-04-27 | 湖南大学 | Text multi-label classification method and system based on attention mechanism and GCN |
CN113254648A (en) * | 2021-06-22 | 2021-08-13 | 暨南大学 | Text emotion analysis method based on multilevel graph pooling |
US20210400059A1 (en) * | 2020-06-22 | 2021-12-23 | Wangsu Science & Technology Co., Ltd. | Network attack detection method, system and device based on graph neural network |
CN114186062A (en) * | 2021-12-13 | 2022-03-15 | 安徽大学 | Text classification method based on graph neural network topic model |
2022-05-11: Priority to CN202210532864.2A; patent CN114896400B (Active)
Non-Patent Citations (3)
Title |
---|
ANKIT PAL et al.: "Multi-label text classification using attention-based graph neural network", arXiv preprint arXiv:2003.11644, 22 March 2020 (2020-03-22), pages 1 - 12 * |
GAN Ling et al.: "Hierarchical affine graph neural network text classification model based on regularization constraints", Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), vol. 35, no. 04, 15 August 2023 (2023-08-15), pages 715 - 721 * |
YUAN Ziyong et al.: "Few-shot short text classification method based on heterogeneous graph convolutional networks", Computer Engineering, vol. 47, no. 12, 16 December 2020 (2020-12-16), pages 87 - 94 * |
Also Published As
Publication number | Publication date |
---|---|
CN114896400B (en) | 2024-06-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108573411B (en) | Mixed recommendation method based on deep emotion analysis and multi-source recommendation view fusion of user comments | |
CN108255805B (en) | Public opinion analysis method and device, storage medium and electronic equipment | |
Zhang et al. | Convolutional multi-head self-attention on memory for aspect sentiment classification | |
CN107766324B (en) | Text consistency analysis method based on deep neural network | |
US10740678B2 (en) | Concept hierarchies | |
US10831796B2 (en) | Tone optimization for digital content | |
CN108681557B (en) | Short text topic discovery method and system based on self-expansion representation and similar bidirectional constraint | |
CN108038492A (en) | A kind of perceptual term vector and sensibility classification method based on deep learning | |
US20170169355A1 (en) | Ground Truth Improvement Via Machine Learned Similar Passage Detection | |
Nagamanjula et al. | A novel framework based on bi-objective optimization and LAN2FIS for Twitter sentiment analysis | |
CN111274790A (en) | Chapter-level event embedding method and device based on syntactic dependency graph | |
JP6729095B2 (en) | Information processing device and program | |
Lu | Semi-supervised microblog sentiment analysis using social relation and text similarity | |
CN111460158B (en) | Microblog topic public emotion prediction method based on emotion analysis | |
CN113392651A (en) | Training word weight model, and method, device, equipment and medium for extracting core words | |
CN105956158A (en) | Automatic extraction method of network neologism on the basis of mass microblog texts and use information | |
CN114742071A (en) | Chinese cross-language viewpoint object recognition and analysis method based on graph neural network | |
Kathuria et al. | AOH-Senti: aspect-oriented hybrid approach to sentiment analysis of students’ feedback | |
WO2022073341A1 (en) | Disease entity matching method and apparatus based on voice semantics, and computer device | |
CN104484437A (en) | Network brief comment sentiment mining method | |
CN116402166B (en) | Training method and device of prediction model, electronic equipment and storage medium | |
Zhu et al. | Causality extraction model based on two-stage GCN | |
CN111382333A (en) | Case element extraction method in news text sentence based on case correlation joint learning and graph convolution | |
CN114896400A (en) | Graph neural network text classification method based on regular constraint | |
CN115293479A (en) | Public opinion analysis workflow system and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||