CN116304061A - Text classification method, device and medium based on hierarchical text graph structure learning - Google Patents

Text classification method, device and medium based on hierarchical text graph structure learning

Info

Publication number
CN116304061A
CN116304061A (application CN202310551919.9A)
Authority
CN
China
Prior art keywords
text, graph, representing, edge, graph structure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310551919.9A
Other languages
Chinese (zh)
Other versions
CN116304061B (en)
Inventor
龙军
王子冬
杨柳
陈庭轩
黄金彩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202310551919.9A priority Critical patent/CN116304061B/en
Publication of CN116304061A publication Critical patent/CN116304061A/en
Application granted granted Critical
Publication of CN116304061B publication Critical patent/CN116304061B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a text classification method, device and medium based on hierarchical text graph structure learning. The method comprises the following steps: step S1: preprocessing the training set text according to three linguistic features to obtain three graph structure matrices; step S2: performing edge-level graph structure learning to obtain three edge vectors; step S3: removing redundancy to obtain three text edge vectors; step S4: performing weighted summation to obtain a text graph structure representation; step S5: processing the representation with a graph convolutional neural network and generating a graph-level text representation through a graph pooling layer; step S6: performing softmax classification, where the class with the highest probability is the final classification result. The method preprocesses the training set text with three linguistic features, converting the text classification problem into a graph classification problem; through multi-granularity graph structure learning, different graph structures are integrated, which prevents the loss of graph structure semantics in the subsequent learning process.

Description

Text classification method, device and medium based on hierarchical text graph structure learning
Technical Field
The invention relates to the field of natural language processing, in particular to a text classification method, device and medium based on hierarchical text graph structure learning.
Background
Text classification is a basic technology in the field of natural language processing and is widely applied in real-world scenarios such as knowledge question answering and sentiment analysis. With the development of deep learning, graph neural networks have made remarkable progress in text classification. However, how to represent text as a graph remains a difficulty. Existing methods that represent text with graphs do not take into account the accuracy and completeness of the original text graph. In the graph construction stage, because of errors in the algorithms, text graphs constructed with methods such as entity/relation extraction are likely to contain mistakes, making the erroneous edges in the graph sparse or redundant and degrading the performance of the subsequent text classification task. Moreover, limited by human prior knowledge, a predefined graph structure carries only part of the information of the system, which prevents understanding of the underlying mechanism of how edges in the graph affect subsequent tasks and thereby limits the application of graph methods to text classification.
In view of the foregoing, there is an urgent need for a text classification method, apparatus, and medium based on hierarchical text graph structure learning to solve the problems in the prior art.
Disclosure of Invention
The invention aims to overcome the shortcomings of existing text classification technology and provides a text classification method, device and medium based on hierarchical text graph structure learning, thereby realizing updating and error correction of the text graph structure and improving the accuracy and robustness of text graph classification. The method learns the graph structure hierarchically from a local-to-global perspective, which enriches the structural representation of the text graph, reduces the error introduced by the initial graph structure, and models the relationships among nodes at fine granularity. The specific technical scheme is as follows:
the text classification method based on hierarchical text graph structure learning comprises the following steps:
step S1: inputting and preprocessing the training set text to be classified according to three different linguistic features to obtain the node sets and edge sets of three text graphs, namely three graph structure matrices; the three text graphs, corresponding to the three linguistic features, are a text co-occurrence graph, a text grammar graph and a text semantic graph respectively;
step S2: processing the graph structures formed by the three node sets and edge sets with a feature representation model based on edge-level graph structure learning to obtain three edge vectors;
step S3: removing redundancy of the three types of edge vectors according to the measurement standard of mutual information to obtain three types of text edge vectors;
step S4: carrying out weighted summation on the three text edge vectors to obtain text graph structural representation;
step S5: processing the text graph structural representation obtained in the step S4 and text semantic features corresponding to the text graph structural representation by adopting a graph convolution neural network, and generating graph-level text representation through a graph pooling layer;
step S6: and (5) carrying out softmax classification on the graph-level text representation obtained in the step (S5), and taking the category with the highest probability as a final classification result.
Preferably, in step S1, the text co-occurrence graph is constructed as follows: each word w_i in the text t is expressed as a node v_i of the text co-occurrence graph G_co, and the edge weight between any two word nodes in the graph adopts the point-wise mutual information PMI(i, j) of the word nodes. The edge weight expression of the text co-occurrence graph is as follows:

A_co(i, j) = PMI(i, j)

where A_co(i, j) represents the edge weight of the text co-occurrence graph, and PMI(i, j) represents the point-wise mutual information of word node v_i and word node v_j.
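For illustration, the following is a minimal sketch of how such PMI-based edge weights could be computed with a sliding window over tokenized sentences; the window size and the choice to keep only positive PMI values are assumptions made for the example, not details taken from the patent.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_cooccurrence_graph(sentences, window=5):
    """Build a word co-occurrence graph whose edge weights are point-wise
    mutual information (PMI) scores estimated from sliding windows."""
    word_count = Counter()
    pair_count = Counter()
    n_windows = 0
    for tokens in sentences:                      # each sentence is a list of words
        for start in range(max(1, len(tokens) - window + 1)):
            win = set(tokens[start:start + window])
            n_windows += 1
            word_count.update(win)
            pair_count.update({tuple(sorted(p)) for p in combinations(win, 2)})
    edges = {}
    for (wi, wj), c_ij in pair_count.items():
        p_ij = c_ij / n_windows
        p_i = word_count[wi] / n_windows
        p_j = word_count[wj] / n_windows
        pmi = math.log(p_ij / (p_i * p_j))
        if pmi > 0:                               # keep only positive PMI (assumption)
            edges[(wi, wj)] = pmi
    return edges

# usage sketch: edges = pmi_cooccurrence_graph([["take", "care", "of", "my", "cat"]])
```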
Preferably, in step S1, the text grammar graph is constructed as follows: a syntactic parsing tool is used to extract the syntactic dependency relations between words w_i and w_j in the text t, generating relation triples (w_i, dep, w_j); the words w_i and w_j are used as nodes of the text grammar graph, the dependency relations are used as edges between the nodes, and the edge weights are expressed by the frequency of the dependency relations in the data set. The edge weight expression of the text grammar graph is as follows:

A_syn(i, j) = N_dep(i, j) / N_sent(i, j)

where A_syn(i, j) represents the edge weight of the text grammar graph, N_dep(i, j) represents the number of times the two words have a syntactic dependency in all sentences of the corpus, and N_sent(i, j) represents the number of times the two words appear in the same sentence in all sentences of the corpus.
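For illustration only, a sketch of this construction is given below; it assumes spaCy with the en_core_web_sm pipeline as the parsing tool, which the patent does not specify, and the helper name syntactic_edge_weights is hypothetical.

```python
from collections import Counter
from itertools import combinations
import spacy

nlp = spacy.load("en_core_web_sm")   # assumed parser; the patent names no specific tool

def syntactic_edge_weights(corpus_sentences):
    """Edge weight A_syn(i, j) = N_dep(i, j) / N_sent(i, j): dependency count
    over same-sentence co-occurrence count, following the textual definition."""
    dep_count, cooc_count = Counter(), Counter()
    for sent in corpus_sentences:
        doc = nlp(sent)
        words = {t.text.lower() for t in doc if t.is_alpha}
        for pair in {tuple(sorted(p)) for p in combinations(words, 2)}:
            cooc_count[pair] += 1
        dep_pairs = {tuple(sorted((t.text.lower(), t.head.text.lower())))
                     for t in doc if t.head is not t and t.is_alpha and t.head.is_alpha}
        for pair in dep_pairs:
            dep_count[pair] += 1
    return {p: dep_count[p] / cooc_count[p] for p in dep_count if cooc_count.get(p)}

# relation triples (w_i, dep, w_j) can be read off as (t.text, t.dep_, t.head.text)
```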
Preferably, in step S1, the text semantic graph is constructed as follows: a BERT model is used to encode each word w_i in the text t, obtaining a feature vector h_i; cosine similarity is used to calculate the semantic similarity between feature vectors, and if the semantic similarity is greater than a set threshold, the two words are considered to have a semantic relation. The edge weight expression of the text semantic graph is as follows:

A_sem(i, j) = N_sem(i, j) / N_sent(i, j)

where A_sem(i, j) represents the edge weight of the text semantic graph, N_sem(i, j) represents the number of times the two words have a semantic relation in all sentences of the corpus, and N_sent(i, j) represents the number of times the two words appear in the same sentence in all sentences of the corpus.
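A minimal sketch of this step follows; the bert-base-uncased checkpoint, the 0.8 threshold and the word-piece-level handling are assumptions for the example, since the patent leaves them unspecified.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint
bert = AutoModel.from_pretrained("bert-base-uncased")

def semantic_edges(sentence, threshold=0.8):
    """Connect two tokens when the cosine similarity of their BERT vectors
    exceeds a threshold (the threshold value here is an assumption)."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]          # (seq_len, hidden)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    vecs = torch.nn.functional.normalize(hidden, dim=-1)
    sim = vecs @ vecs.T                                    # cosine similarity matrix
    edges = []
    for i in range(1, len(tokens) - 1):                    # skip [CLS] and [SEP]
        for j in range(i + 1, len(tokens) - 1):
            if sim[i, j] > threshold:
                edges.append((tokens[i], tokens[j], sim[i, j].item()))
    return edges
```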
Preferably, in step S2, the graph structure learning process is specifically: assigning confidences to the graph structure matrix and optimizing the graph structure matrix based on the confidences; using Laplacian regularization to constrain the node features and using it as the likelihood function of a Bayesian estimation; setting a prior function to constrain the learning process of the adjacency matrix; and combining the likelihood function and the prior function to constrain the adjacency matrix of the learned graph through a Bayesian estimation framework;
the above optimization and constraints are applied to the three text graphs respectively, and the final loss function expression is as follows:

L_edge = Ω(A_sem) + Ω(A_syn) + Ω(A_co)

where L_edge represents the loss function of edge-level graph structure learning, Ω(A) represents the constraint function used to constrain the adjacency matrix A, A_sem represents the learned text semantic graph structure, A_syn represents the learned text dependency graph structure, and A_co represents the learned text co-occurrence graph structure.
Preferably, in step S3, the redundancy removal process is specifically: a graph convolutional neural network is used to perform feature mapping on the three text graph structures generated in edge-level graph structure learning, obtaining mapped feature vectors; the mutual information of different nodes within the same text graph is maximized and the mutual information of nodes across different text graphs is minimized; the three text graphs are estimated based on mutual information, and the optimization objective function is as follows:

L_red = I(Z_sem, Z_syn) + I(Z_sem, Z_co) + I(Z_syn, Z_co)

where L_red represents the optimization objective function, I(·, ·) represents the mutual information estimate between two edge vectors, Z_sem represents the edge vector of the text semantic graph, Z_syn represents the edge vector of the text dependency graph, and Z_co represents the edge vector of the text co-occurrence graph.
Preferably, in step S4, the edge vectors of the three text graphs are weighted and summed to obtain the final optimized graph structure A*, and the expression is as follows:

A* = w_sem · Z_sem + w_syn · Z_syn + w_co · Z_co

where w_sem, w_syn and w_co denote the weights assigned to the edge vectors of the three text graphs.
Preferably, in step S5, the process of generating the graph-level text representation is specifically: a graph convolutional neural network is used to process the final optimized text graph structure A* obtained in step S4 and its features, updating the text semantic features H; global pooling is then applied to H to obtain the graph-level text representation, and the expression is as follows:

h_G = Pool({h_v : v ∈ V})

where h_G represents the graph-level text representation, h_v is the feature of node v, Pool(·) represents global pooling, and V represents the node set of the text graph.
In addition, the invention also discloses a text classification device based on hierarchical text graph structure learning, which comprises:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory is used for storing a computer program;
the processor is configured to implement the text classification method as described above when executing the computer program.
In addition, the invention also discloses a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the text classification method when being executed by a processor.
The technical scheme of the invention has the following beneficial effects:
(1) The invention adopts three different methods to extract three linguistic features of the text, describing the characteristics of the text from different aspects; the three linguistic features are preprocessed, the text classification problem is converted into a graph classification problem, words are converted into nodes in the graph, and the relationships between words are converted into edges in the graph.
(2) The invention adopts edge-level and graph-level graph structure learning to learn the graph structures of the three text graphs at different granularities: the edge level and the graph level capture the structural characteristics of the graph at fine and coarse granularity respectively, and a confidence is assigned to each edge so that the feature relationships among nodes are learned at fine granularity. At the graph level, graph structures from multiple sources are adaptively integrated to obtain an optimal combination of the graph structures from different sources; through multi-granularity graph structure learning, graph structures with different semantic features are integrated after repeated information is removed, which prevents the loss of graph structure semantics in the subsequent learning process and further improves model performance.
In addition to the objects, features and advantages described above, the present invention has other objects, features and advantages. The present invention will be described in further detail with reference to the drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of the steps of a text classification method in a preferred embodiment of the invention;
FIG. 2 is a text co-occurrence diagram of a simulation experiment in a preferred embodiment of the present invention;
FIG. 3 is a text semantic graph of a simulation experiment in a preferred embodiment of the present invention;
FIG. 4 is a text grammar diagram of a simulation experiment in a preferred embodiment of the present invention;
fig. 5 is a final text diagram of a simulation experiment in a preferred embodiment of the present invention.
Detailed Description
Embodiments of the invention are described in detail below with reference to the attached drawings, but the invention can be implemented in a number of different ways, which are defined and covered by the claims.
Examples:
referring to fig. 1, the embodiment discloses a text classification method based on hierarchical text graph structure learning, which comprises the following steps:
step S1: inputting and preprocessing the training set text to be classified according to three different linguistic features to obtain the node sets and edge sets of the training set text, namely three graph structure matrices; the three corresponding text graphs are a text co-occurrence graph, a text grammar graph and a text semantic graph respectively;
step S2: processing the graph structures formed by the three node sets and edge sets with a feature representation model based on edge-level graph structure learning to obtain three edge vectors containing semantic information;
step S3: removing redundancy from the three edge vectors according to the measurement standard of mutual information to obtain three independent text edge vectors;
step S4: carrying out weighted summation on the three text edge vectors subjected to redundancy elimination to obtain a text graph structural representation integrating the global relationship;
step S5: processing the text graph structure obtained in the step S4 and the text semantic features corresponding to the text graph structure by adopting a graph convolution neural network, and generating graph-level text representation through a graph pooling layer;
step S6: and (5) carrying out softmax classification on the graph-level text representation obtained in the step (S5), and taking the category with the highest probability as a final classification result.
The above method is applied to a text data set D = {(t_i, y_i)} for text classification, where (t_i, y_i) represents the i-th sample, t_i represents the i-th text unit and y_i represents the corresponding label. For each text t_i, three graph construction methods are used to build a text co-occurrence graph G_co, a text grammar graph G_syn and a text semantic graph G_sem, where each text graph G = (V, A) has V representing the nodes of the text graph and A representing the adjacency matrix of the text graph, i.e. the topology of the text graph. The i-th text t is selected from the data set D as an example.
Specifically, in step S1, the text co-occurrence graph is constructed as follows: each word w_i in the text t is expressed as a node v_i of the text co-occurrence graph G_co, and the edge weight between any two word nodes in the graph adopts the point-wise mutual information PMI(i, j) of the word nodes. The edge weight expression of the text co-occurrence graph is as follows:

A_co(i, j) = PMI(i, j)

where A_co(i, j) represents the edge weight of the text co-occurrence graph, and PMI(i, j) represents the point-wise mutual information of word node v_i and word node v_j.
Specifically, in step S1, the text grammar graph is constructed as follows: a syntactic parsing tool is used to extract the syntactic dependency relations between words w_i and w_j in the text t, generating relation triples (w_i, dep, w_j); the words w_i and w_j are used as nodes of the text grammar graph, the dependency relations are used as edges between the nodes, and the edge weights are expressed by the frequency of the dependency relations in the data set. The edge weight expression of the text grammar graph is as follows:

A_syn(i, j) = N_dep(i, j) / N_sent(i, j)

where A_syn(i, j) represents the edge weight of the text grammar graph, N_dep(i, j) represents the number of times the two words have a syntactic dependency in all sentences of the corpus, and N_sent(i, j) represents the number of times the two words appear in the same sentence in all sentences of the corpus.
Specifically, in step S1, the text semantic graph is constructed as follows: a pre-trained BERT model is used to process each word w_i in the text t, obtaining a feature vector h_i; cosine similarity is used to calculate the semantic similarity between feature vectors, and if the semantic similarity is greater than a set threshold, the two words are considered to have a semantic relation. The edge weight expression of the text semantic graph is as follows:

A_sem(i, j) = N_sem(i, j) / N_sent(i, j)

where A_sem(i, j) represents the edge weight of the text semantic graph, N_sem(i, j) represents the number of times the two words have a semantic relation in all sentences of the corpus, and N_sent(i, j) represents the number of times the two words appear in the same sentence in all sentences of the corpus.
Further, text graph structure learning is performed on the three constructed text graphs, including graph structure learning at the edge level and at the graph level. The edge level and the graph level capture the structural characteristics of the graph at fine and coarse granularity respectively, and a confidence is assigned to each edge so that the feature relationships among nodes are learned at fine granularity. At the graph level, graph structures from multiple sources are adaptively integrated to obtain an optimal combination of the graph structures from different sources. Modeling feature relationships at the fine granularity of edges allows more precise control over the flow of information during learning from the bottom layer, whereas a single-source graph structure only describes the graph data from one perspective, which may bias the classification results.
Specifically, in step S2, the edge-level graph structure learning process is as follows:
because errors exist in the composition process, the original graph structure may be wrong and incomplete, firstly, the graph structure at the edge level is learned to endow confidence to the original graph structure matrix, the graph structure matrix is optimized based on the confidence, and the confidence optimization expression is as follows:
Figure SMS_99
wherein,,
Figure SMS_100
for confidence matrix, ++>
Figure SMS_101
For the original graph matrix, ">
Figure SMS_102
For linear mapping +.>
Figure SMS_103
Matrix with all elements 1, +.>
Figure SMS_104
Representing the optimized graph structure.
In this embodiment, the confidence matrix is defined under the assumption that the features of adjacent nodes are similar, and the relationship between nodes is modeled with global graph attention. Each element of the confidence matrix (Figure SMS_106) is defined by the relation between node v_i and node v_j passed through an activation function. The element is substituted into the confidence optimization expression to obtain the final adjacency matrix at each iteration.
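The attention and blending formulas here are given only as figures in the patent, so the sketch below is one plausible reading under that caveat: node pairs are scored with dot-product attention over projected features, the scores are squashed with a sigmoid into a confidence matrix, and the original adjacency is re-weighted with it.

```python
import torch
import torch.nn as nn

class EdgeConfidence(nn.Module):
    """One plausible reading of the edge-level learner: global attention over
    node features yields a confidence matrix that re-weights the original A."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, hid_dim)       # linear mapping of node features

    def forward(self, x, a_orig):
        h = self.proj(x)                             # (n, hid) projected node features
        scores = h @ h.t() / h.size(-1) ** 0.5       # pairwise relation between nodes
        s = torch.sigmoid(scores)                    # confidence matrix with entries in (0, 1)
        a_opt = s * a_orig                           # confidence-weighted adjacency; the patent's
                                                     # exact rule (which also involves an all-ones
                                                     # matrix) is only given in the figure and is
                                                     # not reproduced here
        return a_opt
```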
In order to further enhance the smoothness of the graph nodes, this embodiment uses Laplacian regularization to constrain the node features and uses it as the likelihood function of the Bayesian estimation, with the expression:

p(X | A) ∝ exp(−λ · tr(X^T L_norm X))

where X represents the features of the graph nodes, L_norm represents the normalized Laplacian matrix, λ represents a predefined parameter, and the smaller the value of tr(X^T L_norm X), the smaller the difference between neighboring nodes of the graph, indicating that the two nodes are more similar.
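As a sketch of this smoothness term, assuming the usual symmetric normalization of the Laplacian, the quantity tr(X^T L_norm X) can be computed as follows.

```python
import torch

def laplacian_smoothness(x, a, eps=1e-8):
    """tr(X^T L_norm X): small when features of adjacent nodes are similar."""
    deg = a.sum(dim=1)
    d_inv_sqrt = torch.diag((deg + eps).pow(-0.5))
    lap_norm = torch.eye(a.size(0)) - d_inv_sqrt @ a @ d_inv_sqrt   # normalized Laplacian
    return torch.trace(x.t() @ lap_norm @ x)

# likelihood-style use: loss_smooth = lambda_pre * laplacian_smoothness(x, a_opt)
```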
In order to further give the learned graph the properties of symmetry and simplicity, the learning process of the adjacency matrix needs to be constrained by a prior function, defined as follows:

Ω_prior(A) = μ1 · ||A − A^T|| + μ2 · ||A||

where ||A − A^T|| is the constraint on the symmetry of the graph and ||A|| is the constraint on the simplicity of the graph, so that a graph satisfying both constraints has symmetry and simplicity; μ1 and μ2 are manually adjusted hyperparameters, A^T is the transpose matrix of the learned graph structure A, and ||A|| represents the norm of the graph structure matrix A.
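A short sketch of such a prior follows; squared Frobenius norms are used for the symmetry and simplicity terms as one natural choice, since the exact norms appear only in the figure.

```python
import torch

def graph_prior(a, mu_sym=1.0, mu_simple=0.1):
    """Symmetry term ||A - A^T||_F^2 plus simplicity (sparsity) term ||A||_F^2,
    weighted by manually tuned hyper-parameters; the exact norms are assumptions."""
    sym = torch.norm(a - a.t(), p="fro") ** 2
    simple = torch.norm(a, p="fro") ** 2
    return mu_sym * sym + mu_simple * simple
```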
The likelihood function and the prior function are combined, and the adjacency matrix of the learned graph is constrained through a Bayesian estimation framework; in the resulting expression (Figure SMS_127), Ω(A) represents the constraint function used to constrain the adjacency matrix A, the weighting coefficient is a manually adjusted hyperparameter, and exp(·) denotes the exponential function with the natural constant e as its base.
The above optimization and constraints are applied to the three text graphs respectively, and the final loss function expression is as follows:

L_edge = Ω(A_sem) + Ω(A_syn) + Ω(A_co)

where L_edge represents the loss function of edge-level graph structure learning, A_sem represents the learned text semantic graph structure, A_syn represents the learned text dependency graph structure, and A_co represents the learned text co-occurrence graph structure.
By learning the graph structures at the edge level for the three text graphs, three optimized graph structures are obtained, each of which contains some unique information. Integration of these three graph structures is required to produce the final graph structure. Since the three text graphs may contain repeated redundant information, the redundant information needs to be removed, so that the independence of the text graphs is improved. The present invention uses mutual information as a measure of text graph independence.
If the correlation between two text graphs is high, the mutual information between them is also large, and vice versa. However, in practical applications, it is difficult to directly calculate the mutual information of the graph, and thus the InfoNCE method is used to estimate the lower bound of the mutual information.
Specifically, in step S3, the redundancy removal process is as follows: the graph convolutional neural network GCN is used to perform feature mapping on the three text graph structures generated in edge-level graph structure learning, obtaining the mapped feature vectors; taking the text semantic graph G_sem as an example, the mapped feature vector is denoted Z_sem, and the other two text graphs are mapped in the same manner, which is not repeated here.
After the feature vectors of the three text graphs are obtained, InfoNCE is used to constrain the relationships between the text graphs: the mutual information of different nodes within the same text graph is maximized, and the mutual information of nodes across different text graphs is minimized. Taking the relationship between the text semantic graph and the text dependency graph as an example, the expression (Figure SMS_140) is an InfoNCE term in which sim(z_i, z_j) represents the similarity between node v_i and node v_j and τ is the temperature coefficient of InfoNCE.
The three text graphs are estimated based on mutual information, and the optimization objective function is as follows:

L_red = I(Z_sem, Z_syn) + I(Z_sem, Z_co) + I(Z_syn, Z_co)

where L_red represents the optimization objective function, I(·, ·) represents the mutual information estimate between two edge vectors, Z_sem represents the edge vector of the text semantic graph, Z_syn represents the edge vector of the text dependency graph, and Z_co represents the edge vector of the text co-occurrence graph.
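The InfoNCE term itself appears only as a figure, so the following sketch is one plausible reading of the described objective, in which node pairs inside the same graph act as positives and pairs across two graphs act as negatives; the variable names are placeholders.

```python
import torch
import torch.nn.functional as F

def info_nce(z_a, z_b, tau=0.5):
    """InfoNCE-style lower-bound estimate used to limit redundancy between the
    node embeddings z_a and z_b (n x d) of two text graphs; tau is the temperature."""
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    pos = torch.exp(z_a @ z_a.t() / tau)        # similarities within the same graph
    neg = torch.exp(z_a @ z_b.t() / tau)        # similarities across the two graphs
    pos = pos - torch.diag(torch.diag(pos))     # drop trivial self-similarity
    loss = -torch.log(pos.sum(dim=1) / (pos.sum(dim=1) + neg.sum(dim=1)))
    return loss.mean()

# redundancy objective over the three graphs (to be minimized):
# l_red = info_nce(z_sem, z_syn) + info_nce(z_sem, z_co) + info_nce(z_syn, z_co)
```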
Specifically, in step S4, the edge vectors of the three text graphs are weighted and summed to obtain the final optimized graph structure A*, and the expression is as follows:

A* = w_sem · Z_sem + w_syn · Z_syn + w_co · Z_co

where w_sem, w_syn and w_co denote the weights assigned to the edge vectors of the three text graphs.
specifically, in step S5, the process of generating the graph-level text representation specifically includes: processing the final optimized graph structure in step S4 using a graph convolution neural network
Figure SMS_156
And its features->
Figure SMS_157
Update semantic feature of text->
Figure SMS_158
For->
Figure SMS_159
And carrying out global pooling processing to obtain a graph-level text representation, wherein the expression is as follows:
Figure SMS_160
wherein,,
Figure SMS_161
representation of the text at the representation level->
Figure SMS_162
For node->
Figure SMS_163
Is characterized by->
Figure SMS_164
Representing a global pooling process.
Specifically, in step S6, the expression of the softmax classification is as follows:

y_pred = softmax(W · h_G + b)

where y_pred represents the final classification result, W is a learnable mapping matrix, b is a learnable bias, and softmax(·) represents the softmax function.
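Putting steps S5 and S6 together, the sketch below applies one graph convolution layer to the fused graph, mean-pools the node features into a graph-level representation, and classifies it with softmax; the layer sizes and the simple row normalization of A* are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextGraphClassifier(nn.Module):
    """Single GCN layer over the fused adjacency A*, global mean pooling,
    then a softmax read-out; dimensions are illustrative assumptions."""
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.w_gcn = nn.Linear(in_dim, hid_dim)
        self.w_out = nn.Linear(hid_dim, n_classes)

    def forward(self, x, a_star):
        deg = a_star.sum(dim=1, keepdim=True).clamp(min=1e-8)
        a_norm = a_star / deg                          # simple row-normalized propagation
        h = F.relu(self.w_gcn(a_norm @ x))             # updated node (word) features
        h_graph = h.mean(dim=0)                        # global mean pooling over nodes
        return F.softmax(self.w_out(h_graph), dim=-1)  # class with highest probability wins

# usage sketch: probs = TextGraphClassifier(768, 128, 2)(x, a_star); pred = probs.argmax()
```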
According to an embodiment of the present invention, step S1 is in effect a preprocessing stage for the labelled training set text: the text classification problem is converted into a graph classification problem, words are converted into nodes in the graph, the relationships between words are converted into edges in the graph, and the three different linguistic features are converted into three text graphs. Steps S2, S3 and S4 learn the graph structure of the input text graphs at different granularities and integrate the graph structure representations obtained under different views. Compared with conventional methods, these steps have two advantages: (1) when the text graph is generated, various linguistic rules can be used to extract graph nodes and edges, which effectively improves the accuracy of text classification; (2) through multi-granularity graph structure learning, graph structures with different semantic features are integrated after repeated information is removed, which prevents the loss of graph structure semantics in the subsequent learning process and further improves model performance.
Simulation experiment:
the present example performed a simulation experiment on the public dataset MR. MR is a movie rating dataset that contains user reviews of movies and corresponding categories. These categories are classified into positive and negative evaluations. In the embodiment, a simulation experiment is performed by randomly extracting one comment sample in the MR data set, so as to evaluate whether the method disclosed by the invention achieves the effect of relearning the graph structure of the sample. Three text graphs, namely a text co-occurrence graph shown in fig. 2, a text grammar graph shown in fig. 3 and a text semantic graph shown in fig. 4, are constructed according to the step S1 of the invention for the randomly extracted sentence "Take Care of My Cat offers a refreshingly different slice of Asian cinema". From these three text graphs, the final text graph, i.e., the graph-level text representation, as shown in fig. 5 is obtained and text-classified. The abscissa in fig. 2, 3, 4, 5 represents the unique word in this comment sample, the left ordinate also represents the unique word in this comment sample, and the right ordinate represents the strength of the relationship between the words. Color blocks in the matrix are larger than 0 to indicate that the relation between the corresponding words is positive, and the larger the numerical value is, the stronger the positive relation between the words is, namely the positive effect on the classification result is, the smaller the color blocks are, the smaller the numerical value is, the relation between the corresponding words is negative, and the stronger the negative relation between the words is, namely the stronger the negative effect on the classification result is. Color bars equal to 0 indicate that the relationship between the two words in the comment sample has no effect on the classification result. In summary, relearned FIG. 5 facilitates the method disclosed in this example in evaluating whether this comment sample belongs to a positive or negative rating.
The randomly extracted review sample "Take Care of My Cat offers a refreshingly different slice of Asian cinema" states that the film "Take Care of My Cat" offers a refreshingly different slice of Asian cinema, and the true classification of this sample is a positive evaluation. As can be seen from the finally learned text graph in FIG. 5, the method disclosed in this embodiment relearns the text graph and discards some erroneous relations between words in the original text graphs, such as (take, care) in the text co-occurrence graph of FIG. 2 and the text grammar graph of FIG. 3; these relations come from the movie title "Take Care of My Cat" and have no positive effect on correctly judging this review text as a positive evaluation. In addition, the learned final text graph in FIG. 5 also adds some new relationships, such as (references), which helps the method disclosed in this example classify this text into the positive evaluation category.
In addition, the embodiment also discloses a text classification device based on hierarchical text graph structure learning, which comprises:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory is used for storing a computer program;
the processor is configured to implement the text classification method as described above when executing the computer program.
In addition, the embodiment also discloses a computer readable storage medium, and the computer readable storage medium stores a computer program, and the computer program realizes the text classification method when being executed by a processor.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. The text classification method based on hierarchical text graph structure learning is characterized by comprising the following steps:
step S1: inputting and preprocessing the training set text to be classified according to three different linguistic features to obtain the node sets and edge sets of three text graphs, namely three graph structure matrices; the three text graphs, corresponding to the three linguistic features, are a text co-occurrence graph, a text grammar graph and a text semantic graph respectively;
step S2: processing the graph structures formed by the three node sets and edge sets with a feature representation model based on edge-level graph structure learning to obtain three edge vectors;
step S3: removing redundancy of the three types of edge vectors according to the measurement standard of mutual information to obtain three types of text edge vectors;
step S4: carrying out weighted summation on the three text edge vectors to obtain text graph structural representation;
step S5: processing the text graph structural representation obtained in the step S4 and the text semantic features corresponding to the text structural representation by adopting a graph convolution neural network, and generating a graph-level text representation through a graph pooling layer;
step S6: and (5) carrying out softmax classification on the graph-level text representation obtained in the step (S5), and taking the category with the highest probability as a final classification result.
2. The text classification method according to claim 1, wherein in step S1, the text co-occurrence graph is constructed as follows: each word w_i in the text t is expressed as a node v_i of the text co-occurrence graph G_co, and the edge weight between any two word nodes in the graph adopts the point-wise mutual information PMI(i, j) of the word nodes; the edge weight expression of the text co-occurrence graph is as follows:

A_co(i, j) = PMI(i, j)

where A_co(i, j) represents the edge weight of the text co-occurrence graph, and PMI(i, j) represents the point-wise mutual information of word node v_i and word node v_j.
3. The text classification method according to claim 1, wherein in step S1, the text grammar graph is constructed as follows: a syntactic parsing tool is used to extract the syntactic dependency relations between words w_i and w_j in the text t, generating relation triples (w_i, dep, w_j); the words w_i and w_j are used as nodes of the text grammar graph, the dependency relations are used as edges between the nodes, and the edge weights are expressed by the frequency of the dependency relations in the data set; the edge weight expression of the text grammar graph is as follows:

A_syn(i, j) = N_dep(i, j) / N_sent(i, j)

where A_syn(i, j) represents the edge weight of the text grammar graph, N_dep(i, j) represents the number of times the two words have a syntactic dependency in all sentences of the corpus, and N_sent(i, j) represents the number of times the two words appear in the same sentence in all sentences of the corpus.
4. The text classification method according to claim 1, wherein in step S1, the text semantic graph is constructed as follows: a BERT model is used to encode each word w_i in the text t, obtaining a feature vector h_i; cosine similarity is used to calculate the semantic similarity between feature vectors, and if the semantic similarity is greater than a set threshold, the two words are considered to have a semantic relation; the edge weight expression of the text semantic graph is as follows:

A_sem(i, j) = N_sem(i, j) / N_sent(i, j)

where A_sem(i, j) represents the edge weight of the text semantic graph, N_sem(i, j) represents the number of times the two words have a semantic relation in all sentences of the corpus, and N_sent(i, j) represents the number of times the two words appear in the same sentence in all sentences of the corpus.
5. The text classification method according to claim 1, wherein in step S2, the graph structure learning process is specifically: assigning confidences to the graph structure matrix and optimizing the graph structure matrix based on the confidences; using Laplacian regularization to constrain the node features and using it as the likelihood function of a Bayesian estimation; setting a prior function to constrain the learning process of the adjacency matrix; and combining the likelihood function and the prior function to constrain the adjacency matrix of the learned graph through a Bayesian estimation framework;
the above optimization and constraints are applied to the three text graphs respectively, and the final loss function expression is as follows:

L_edge = Ω(A_sem) + Ω(A_syn) + Ω(A_co)

where L_edge represents the loss function of edge-level graph structure learning, Ω(A) represents the constraint function used to constrain the adjacency matrix A, A_sem represents the learned text semantic graph structure, A_syn represents the learned text dependency graph structure, and A_co represents the learned text co-occurrence graph structure.
6. The text classification method according to claim 5, wherein in step S3, the redundancy removal process is specifically: a graph convolutional neural network is used to perform feature mapping on the three text graph structures generated in edge-level graph structure learning, obtaining mapped feature vectors; the mutual information of different nodes within the same text graph is maximized and the mutual information of nodes across different text graphs is minimized; the three text graphs are estimated based on mutual information, and the optimization objective function is as follows:

L_red = I(Z_sem, Z_syn) + I(Z_sem, Z_co) + I(Z_syn, Z_co)

where L_red represents the optimization objective function, I(·, ·) represents the mutual information estimate between two edge vectors, Z_sem represents the edge vector of the text semantic graph, Z_syn represents the edge vector of the text dependency graph, and Z_co represents the edge vector of the text co-occurrence graph.
7. The text classification method according to claim 6, wherein in step S4, the edge vectors of the three text graphs are weighted and summed to obtain the final optimized graph structure A*, and the expression is as follows:

A* = w_sem · Z_sem + w_syn · Z_syn + w_co · Z_co

where w_sem, w_syn and w_co denote the weights assigned to the edge vectors of the three text graphs.
8. The text classification method according to claim 7, characterized in that in step S5, the process of generating the graph-level text representation is specifically: a graph convolutional neural network is used to process the final optimized text graph structure A* obtained in step S4 and its features, updating the text semantic features H; global pooling is then applied to H to obtain the graph-level text representation, and the expression is as follows:

h_G = Pool({h_v : v ∈ V})

where h_G represents the graph-level text representation, h_v is the feature of node v, Pool(·) represents global pooling, and V represents the node set of the text graph.
9. Text classification device based on hierarchical text graph structure study, characterized by comprising:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory is used for storing a computer program;
the processor is configured to implement the text classification method according to any of claims 1 to 8 when executing the computer program.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the text classification method according to any of claims 1 to 8.
CN202310551919.9A 2023-05-17 2023-05-17 Text classification method, device and medium based on hierarchical text graph structure learning Active CN116304061B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310551919.9A CN116304061B (en) 2023-05-17 2023-05-17 Text classification method, device and medium based on hierarchical text graph structure learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310551919.9A CN116304061B (en) 2023-05-17 2023-05-17 Text classification method, device and medium based on hierarchical text graph structure learning

Publications (2)

Publication Number Publication Date
CN116304061A 2023-06-23
CN116304061B 2023-07-21

Family

ID=86794469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310551919.9A Active CN116304061B (en) 2023-05-17 2023-05-17 Text classification method, device and medium based on hierarchical text graph structure learning

Country Status (1)

Country Link
CN (1) CN116304061B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200364409A1 (en) * 2019-05-17 2020-11-19 Naver Corporation Implicit discourse relation classification with contextualized word representation
US20210248425A1 (en) * 2020-02-12 2021-08-12 Nec Laboratories America, Inc. Reinforced text representation learning
WO2022001333A1 (en) * 2020-06-30 2022-01-06 首都师范大学 Hyperbolic space representation and label text interaction-based fine-grained entity recognition method
CN114186063A (en) * 2021-12-14 2022-03-15 合肥工业大学 Training method and classification method of cross-domain text emotion classification model
CN114528374A (en) * 2022-01-19 2022-05-24 浙江工业大学 Movie comment emotion classification method and device based on graph neural network
CN114548099A (en) * 2022-02-25 2022-05-27 桂林电子科技大学 Method for jointly extracting and detecting aspect words and aspect categories based on multitask framework
CN115858725A (en) * 2022-11-22 2023-03-28 广西壮族自治区通信产业服务有限公司技术服务分公司 Method and system for screening text noise based on unsupervised graph neural network
CN115878800A (en) * 2022-12-12 2023-03-31 上海理工大学 Double-graph neural network fusing co-occurrence graph and dependency graph and construction method thereof
CN115858788A (en) * 2022-12-19 2023-03-28 福州大学 Visual angle level text emotion classification system based on double-graph convolutional neural network

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BINGXIN XUE et al.: "The Study on the Text Classification Based on Graph Convolutional Network and BiLSTM", ICCAI '22: Proceedings of the 8th International Conference on Computing and Artificial Intelligence *
吴思竹; 张智雄; 钱庆: "Research on text representation models based on language networks" (基于语言网络的文本表示模型研究), 情报科学 (Information Science), no. 12 *
李纲; 毛进: "Text graph representation models and their applications in text mining" (文本图表示模型及其在文本挖掘中的应用), 情报学报 (Journal of the China Society for Scientific and Technical Information), no. 12 *
陈科文 et al.: "Research on entropy-based term weight calculation methods in text classification" (文本分类中基于熵的词权重计算方法研究), 计算机科学与探索 (Journal of Frontiers of Computer Science and Technology), vol. 10, no. 9 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116805059A (en) * 2023-06-26 2023-09-26 重庆邮电大学 Patent classification method based on big data
CN116805059B (en) * 2023-06-26 2024-04-09 重庆邮电大学 Patent classification method based on big data
CN117435747A (en) * 2023-12-18 2024-01-23 中南大学 Few-sample link prediction drug recycling method based on multilevel refinement network
CN117435747B (en) * 2023-12-18 2024-03-29 中南大学 Few-sample link prediction drug recycling method based on multilevel refinement network

Also Published As

Publication number Publication date
CN116304061B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
CN116304061B (en) Text classification method, device and medium based on hierarchical text graph structure learning
CN109214006B (en) Natural language reasoning method for image enhanced hierarchical semantic representation
CN111666758B (en) Chinese word segmentation method, training device and computer readable storage medium
CN110046262A (en) A kind of Context Reasoning method based on law expert&#39;s knowledge base
CN113051399B (en) Small sample fine-grained entity classification method based on relational graph convolutional network
CN115099219A (en) Aspect level emotion analysis method based on enhancement graph convolutional neural network
CN111274790A (en) Chapter-level event embedding method and device based on syntactic dependency graph
CN113254581B (en) Financial text formula extraction method and device based on neural semantic analysis
CN112836051B (en) Online self-learning court electronic file text classification method
CN114722820A (en) Chinese entity relation extraction method based on gating mechanism and graph attention network
CN113705196A (en) Chinese open information extraction method and device based on graph neural network
CN116521882A (en) Domain length text classification method and system based on knowledge graph
CN111881256A (en) Text entity relation extraction method and device and computer readable storage medium equipment
CN112632253A (en) Answer extraction method and device based on graph convolution network and related components
CN114861636A (en) Training method and device of text error correction model and text error correction method and device
CN114722833A (en) Semantic classification method and device
CN113191150A (en) Multi-feature fusion Chinese medical text named entity identification method
CN117251522A (en) Entity and relationship joint extraction model method based on latent layer relationship enhancement
CN112100342A (en) Knowledge graph question-answering method based on knowledge representation learning technology
CN117034916A (en) Method, device and equipment for constructing word vector representation model and word vector representation
CN117009213A (en) Metamorphic testing method and system for logic reasoning function of intelligent question-answering system
Sekiyama et al. Automated proof synthesis for propositional logic with deep neural networks
Wei Recommended methods for teaching resources in public English MOOC based on data chunking
Sekiyama et al. Automated proof synthesis for the minimal propositional logic with deep neural networks
JP6586055B2 (en) Deep case analysis device, deep case learning device, deep case estimation device, method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant