CN110781304A - Sentence coding method using word information clustering - Google Patents

Sentence coding method using word information clustering

Info

Publication number
CN110781304A
CN110781304A (application CN201911039124.XA)
Authority
CN
China
Prior art keywords
capsule
target
word
layer
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911039124.XA
Other languages
Chinese (zh)
Other versions
CN110781304B (en)
Inventor
曹杰
郭翔
王有权
申冬琴
李秀怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunjing Business Intelligence Research Institute Nanjing Co Ltd
Nanjing University of Finance and Economics
Original Assignee
Yunjing Business Intelligence Research Institute Nanjing Co Ltd
Nanjing University of Finance and Economics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunjing Business Intelligence Research Institute Nanjing Co Ltd and Nanjing University of Finance and Economics
Priority to CN201911039124.XA (CN110781304B)
Publication of CN110781304A
Application granted
Publication of CN110781304B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3347Query execution using vector based model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a sentence coding method using word information clustering. In one embodiment, each word in a sentence sequence of a specific length is mapped into a word vector space to obtain the word vector of each word; a coding vector is obtained for each word vector, and each coding vector is non-linearly squashed to obtain a capsule; the resulting capsules form an original capsule layer, and a capsule protocol algorithm extracts the semantic information of words with specific semantic features from the original capsule layer to form a first target capsule layer; the capsule protocol algorithm then performs information conversion on the first target capsules in the first target capsule layer to form a second target capsule layer whose number of capsules equals the number of classes. Because the capsule protocol algorithm transfers information according to the different requirements that the target capsules place on the original capsules, longer sentence features can be obtained and the accuracy of sentence classification is effectively improved.

Description

Sentence coding method using word information clustering
Technical Field
The invention relates to the technical field of information clustering, in particular to a sentence coding method utilizing word information clustering.
Background
Deep learning has made major breakthroughs in the natural language field by performing deep semantic modeling of text; however, learning to express high-quality features remains a great challenge. Approaches range from extracting local sequence features of a sentence with n-gram convolutions and extracting the important features of each local region with a max-pooling layer, to modeling the text sequence with an RNN. Compared with an RNN, convolution focuses more on extracting local sequence features, but, constrained by the n-gram width, it does not easily capture longer sentence features; an RNN can capture longer sentence features, but its extraction of sentence features is not as strong as convolution's.
Disclosure of Invention
In view of the above, embodiments of the present application provide a sentence encoding method using word information clustering.
In a first aspect, the present invention provides a sentence encoding method using word information clustering, including:
mapping each word in the sentence sequence with the specific length into a word vector space, and acquiring a word vector of each word;
acquiring a coding vector of each word vector and performing non-linear squashing on each coding vector to obtain a capsule;
obtaining a plurality of capsules to form an original capsule layer, extracting semantic information of words with specific semantic features from the original capsule layer by utilizing a capsule protocol algorithm, and forming a first target capsule layer;
and performing information conversion on the first target capsules in the first target capsule layer by utilizing a capsule protocol algorithm to form a second target capsule layer whose number of capsules equals the number of classes.
Optionally, obtaining the coding vector of each word in the word vector space includes: inputting the word vector of each word into a bi-directional LSTM (BiLSTM) model to obtain, respectively, the forward-propagated sentence sequence information h_i^f and the backward-propagated sequence information h_i^b, and then concatenating the two vectors to form the required coding vector h_i:

h_i^f = LSTM_f(x_i, h_{i-1}^f)
h_i^b = LSTM_b(x_i, h_{i+1}^b)
h_i = [h_i^f ; h_i^b]

Thus, the vector output formed by the BiLSTM encoding is:

H = [h_1, h_2, …, h_L].
Optionally, obtaining a plurality of capsules to form the original capsule layer comprises:

P = [p_1, p_2, …, p_L]
s_i = σ(w_s p_i + b_s)
k_i = tanh(w_k p_i + b_k)
u_i = s_i · k_i

where P denotes the set of original capsules formed by the coding layer, p_i denotes the i-th capsule of the original capsule layer, w_s denotes the contribution matrix parameter, b_s the bias parameter, and σ the sigmoid activation function; s_i = σ(w_s p_i + b_s) forms the supply gate of the original capsule i. w_k denotes the effective-value matrix of the original capsule and b_k the corresponding bias; k_i = tanh(w_k p_i + b_k) obtains the effective value of the original capsule i, and u_i = s_i · k_i forms the value that capsule i can contribute.
Optionally, the first target capsule comprises:

Y = [y_1, y_2, …, y_m]
n_j = σ(w_n y_j + b_n)
c_j = tanh(w_c y_j + b_c)
v_j = n_j · c_j

where Y denotes the set of first target capsules, y_j denotes the j-th capsule in the first target capsule layer, w_n denotes the demand matrix parameter, b_n the bias parameter, and σ the sigmoid activation function; n_j = σ(w_n y_j + b_n) forms the demand gate of the first target capsule j. w_c denotes the state matrix parameter and b_c the bias parameter; c_j = tanh(w_c y_j + b_c) forms the current state value of the first target capsule j, and v_j = n_j · c_j forms the content value required by capsule j in its current state.
Optionally, extracting the semantic information of words with specific semantic features from the original capsule layer by using the capsule protocol algorithm to form the first target capsule layer includes:

f_ij = u_i · v_j
F_ij = Σ_d f_ij[d]
a_ij = softmax(F_ij)
y'_j = c_j + Σ_i a_ij · u_i

where f_ij = u_i · v_j represents the similarity relationship between the information the original capsule i can provide and the information the first target capsule j requires; F_ij represents the magnitude of the calculated similarity; F_ij is normalized by a softmax function to form a_ij, which represents the amount of information converted from the original capsule i to the first target capsule j; and the initialized state value c_j of the first target capsule j and the values absorbed from each capsule in the original capsule layer are added to form the new state value of the first target capsule.
Optionally, the method further comprises:
representing the probability of the content characterized by each second target capsule by the vector length of the second target capsule; calculating the vector length of each second target capsule by using the L2 norm; and determining the final class of each second target capsule according to the vector length of each second target capsule.
Optionally, the method further comprises: calculating the text loss of the capsules in the classification layer using a margin loss function.
Optionally, the calculating the textual loss of the classification layer capsule using the interval loss function includes:
L e=T emax(0,m +-||v e||) 2+λ(1-T e)max(0,||v e||-m -) 2
L eis the loss value, T, of the e-th capsule of the classification layer eFor indicating the function, the value is 1 or 0, when in class e, T eIs 1, otherwise is 0, m +=0.9,λ=0.5,m -=0.1,m +Is an upper bound, m -Is the lower bound. With total loss of individual capsules in separate layersThe sum of the losses.
In one embodiment, the sentence coding method using word information clustering encodes the words with a BiLSTM network in a sequence-based manner, and uses the capsule protocol algorithm to transfer information according to the different requirements that the target capsules place on the original capsules; that is, the features of the high-level sentence capsules are formed from the features that each word capsule can provide. The sentence classification accuracy is thereby effectively improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a flow chart of a sentence encoding method using word information clustering according to the present invention.
Detailed Description
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Inspired by the capsule network, a high-level capsule network is proposed together with an algorithm that guides the information transfer of the low-level capsule network: a BiLSTM network first encodes the words in a sequence-based manner, and the proposed capsule protocol algorithm then transfers information according to the different requirements that the high-level capsules place on the low-level capsules; that is, the features of each high-level capsule are formed from the features that each word capsule can provide.
FIG. 1 is a flow chart of the sentence encoding method using word information clustering according to the present invention. As shown in FIG. 1, the method comprises the following steps:
step S101: mapping each word in the sentence sequence with the specific length into a word vector space, and acquiring a word vector of each word;
In the word vector embedding layer, for a given sentence sequence of a specific length S = [w_1, w_2, w_3, …, w_L], each w_i is a symbol in one-hot representation, so the direct relationships between words cannot be computed and the representation cannot be applied directly to a neural network model. Therefore, the first step is to map each word into a d-dimensional word vector space, so that the relationships between words become available and the words can serve as the input of the neural network model:

X = [x_1, x_2, x_3, …, x_L]    (1)
the word vectors of the word vector space are generated by random initialization.
Step S102: in the word vector space, obtaining the coding vector of each word vector and performing non-linear squashing on each coding vector to obtain a capsule;
In the word vector space, each x_i is independent of the other words in the sentence X. In general, semantic understanding of a sentence requires the dependency relationships presented by each word in the sentence. To obtain the dependency relationships between the words, a bi-directional LSTM (BiLSTM) is adopted: each word x_i in the sentence X is input, and its forward-propagated sentence sequence information h_i^f and backward-propagated sequence information h_i^b are obtained respectively:

h_i^f = LSTM_f(x_i, h_{i-1}^f)    (2)
h_i^b = LSTM_b(x_i, h_{i+1}^b)    (3)

The two vectors are then concatenated in the coding layer to form the required coding vector h_i:

h_i = [h_i^f ; h_i^b]    (4)

Thus, the vector output formed by the BiLSTM encoding is:

H = [h_1, h_2, …, h_L]    (5)
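The BiLSTM encoding of equations (2)–(5) might be sketched as below; the hidden size and the use of PyTorch's nn.LSTM with bidirectional=True are assumptions made for illustration.

    import torch
    import torch.nn as nn

    embed_dim, hidden_dim, L = 128, 64, 23               # assumed sizes
    X = torch.randn(1, L, embed_dim)                     # word vectors from the embedding layer

    # bidirectional=True runs a forward and a backward LSTM and concatenates their
    # hidden states at each position, i.e. h_i = [h_i^f ; h_i^b] as in equation (4).
    bilstm = nn.LSTM(input_size=embed_dim, hidden_size=hidden_dim,
                     batch_first=True, bidirectional=True)

    H, _ = bilstm(X)                                     # (1, L, 2*hidden_dim): H = [h_1, ..., h_L]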
step S103: the method comprises the steps of obtaining a plurality of capsules to form an original capsule layer, extracting semantic information of words with specific semantic features from the original capsule layer by utilizing a capsule protocol algorithm, and forming a first target capsule layer.
Each h_i formed by the coding layer is passed through the non-linear squashing function of equation (6) (reproduced as an image in the original publication) to form a capsule p_i, yielding the original capsule layer P = [p_1, p_2, …, p_L]. The coding layer gives every word in the sentence a direct or indirect dependency relationship, and thus each p_i carries certain semantic information.
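A sketch of the squashing step, assuming the standard capsule-network squash non-linearity (the exact form of equation (6) is not reproduced in the text, so this formula is an assumption):

    import torch

    def squash(h, dim=-1, eps=1e-8):
        # Non-linearly squash each vector so its length lies in [0, 1)
        # while preserving its direction (assumed standard capsule squash).
        sq_norm = (h ** 2).sum(dim=dim, keepdim=True)
        return (sq_norm / (1.0 + sq_norm)) * h / torch.sqrt(sq_norm + eps)

    H = torch.randn(1, 23, 128)   # coding vectors h_1..h_L from the BiLSTM
    P = squash(H)                 # original capsule layer P = [p_1, ..., p_L]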
Sentences often contain words that are not useful for the final task, and these words form useless semantic information. Because such useless semantic information weakens the information of the important words, it needs to be removed in the low-level capsule layer. Specifically, the semantic information of words that are useless for the final task is removed from the original capsule layer; that is, the semantic information of the important words that contribute to the final task is extracted from the original capsule layer to form the target capsule layer.
The semantic information of words with specific semantic features is extracted from the original capsule layer with the capsule protocol algorithm to form the first target capsule layer. In particular, the coding layer forms the original set of capsules P:

P = [p_1, p_2, …, p_L]    (7)

Then the supply gate s_i of the original capsule i is formed:

s_i = σ(w_s p_i + b_s)    (8)

where p_i denotes the i-th capsule of the original capsule layer, w_s denotes the contribution matrix parameter, b_s the bias parameter, and σ the sigmoid activation function.

Further, the effective-value gate k_i of the original capsule i is formed:

k_i = tanh(w_k p_i + b_k)    (9)

where w_k denotes the effective-value matrix of the original capsule and b_k the bias value.

The supply gate s_i of the original capsule i and the effective-value gate k_i of the original capsule i are multiplied element-wise to form the contributable value u_i of capsule i:

u_i = s_i · k_i    (10)
For the first target capsules, a set of first target capsules Y is randomly generated in the initial state:

Y = [y_1, y_2, …, y_m]    (11)

Then the demand gate n_j of the first target capsule j is formed:

n_j = σ(w_n y_j + b_n)    (12)

where y_j denotes the j-th capsule in the first target capsule layer, w_n denotes the demand matrix parameter, b_n the bias parameter, and σ the sigmoid activation function.

Further, the current state value c_j of the first target capsule j is obtained:

c_j = tanh(w_c y_j + b_c)    (13)

where w_c denotes the state matrix parameter and b_c the bias parameter.

The demand gate n_j of the first target capsule j and its current state value c_j are multiplied element-wise to obtain the content value v_j required by the first target capsule j in its current state:

v_j = n_j · c_j    (14)
Multiplying formula (10) by formula (14) yields the information conversion function from the original capsule to the first target capsule:

f_ij = u_i · v_j    (15)

In formula (15), the contributable value of the original capsule is multiplied element-wise by the content value of the first target capsule, representing the similarity relationship between the information the original capsule i can provide and the information the first target capsule j requires.

Further, the magnitude of the similarity in formula (15) is calculated:

F_ij = Σ_d f_ij[d]    (16)

F_ij is normalized by a softmax function to form a_ij, which indicates the amount of information transferred from the original capsule i to the first target capsule j:

a_ij = softmax(F_ij)    (17)

Finally, the initialized state value c_j of the first target capsule j and the values absorbed from each capsule in the original capsule layer are added to form the new state value of the first target capsule:

y'_j = c_j + Σ_i a_ij · u_i    (18)
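Putting equations (7)–(18) together, one pass of the capsule protocol algorithm could be sketched as follows. This is a best-effort reading of the text: the exact forms of equations (16)–(18), the softmax axis, the parameter shapes and all names are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CapsuleProtocol(nn.Module):
        # One protocol step from L original capsules to m target capsules (sketch).
        def __init__(self, dim, num_targets):
            super().__init__()
            self.w_s = nn.Linear(dim, dim)   # contribution matrix + bias, eq. (8)
            self.w_k = nn.Linear(dim, dim)   # effective-value matrix + bias, eq. (9)
            self.w_n = nn.Linear(dim, dim)   # demand matrix + bias, eq. (12)
            self.w_c = nn.Linear(dim, dim)   # state matrix + bias, eq. (13)
            self.y = nn.Parameter(torch.randn(num_targets, dim))  # initial targets, eq. (11)

        def forward(self, P):                                          # P: (L, dim)
            u = torch.sigmoid(self.w_s(P)) * torch.tanh(self.w_k(P))   # eqs. (8)-(10)
            c = torch.tanh(self.w_c(self.y))                           # eq. (13)
            v = torch.sigmoid(self.w_n(self.y)) * c                    # eqs. (12), (14)
            f = u.unsqueeze(1) * v.unsqueeze(0)                        # eq. (15): (L, m, dim)
            sim = f.sum(dim=-1)                                        # eq. (16), assumed sum over dims
            a = F.softmax(sim, dim=0)                                  # eq. (17), assumed softmax over i
            return c + torch.einsum('lm,ld->md', a, u)                 # eq. (18), assumed weighted sum

    layer = CapsuleProtocol(dim=128, num_targets=16)
    Y_new = layer(torch.randn(23, 128))    # 16 first target capsules of dimension 128

Applying a second CapsuleProtocol whose num_targets equals the number of classes would then give the second target capsule layer described in step S104 below.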
Step S104: performing information conversion on the first target capsules in the first target capsule layer by utilizing the capsule protocol algorithm to form a second target capsule layer whose number of capsules equals the number of classes;
In one possible embodiment, the capsule protocol algorithm is first applied to the m original capsules to form n first target capsules; the capsule protocol algorithm is then applied to the n first target capsules to form a second target capsule layer with L classes, where m, n and L are natural numbers greater than or equal to 1.
In one possible embodiment, the number of classifications L is set in advance.
In the classification layer, the vector length of each second target capsule is calculated, and this length represents the probability of the content characterized by that capsule. The class is determined from the vector lengths of the second target capsules: specifically, the vector length of each second target capsule is computed with the L2 norm, and the input is assigned to the class whose capsule has the largest vector length.
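For example, the class decision from the second target capsule layer might be computed as in the following sketch (the tensor layout is assumed):

    import torch

    # Second target capsule layer: one capsule vector per class, e.g. 4 classes of dimension 128.
    second_target = torch.randn(4, 128)

    lengths = second_target.norm(p=2, dim=-1)    # L2 norm (vector length) of each class capsule
    predicted_class = lengths.argmax().item()    # class whose capsule vector is longest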
In one possible embodiment, the text loss of each capsule in the classification layer is calculated using a margin loss function:

L_e = T_e · max(0, m+ − ||v_e||)² + λ(1 − T_e) · max(0, ||v_e|| − m−)²    (19)

where L_e is the loss value of the e-th capsule of the classification layer; T_e is an indicator that takes the value 1 when the sample belongs to class e and 0 otherwise; m+ = 0.9, λ = 0.5 and m− = 0.1, with m+ the upper bound and m− the lower bound. The total loss is the sum of the losses of the individual capsules of the classification layer.
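Equation (19) can be sketched in code as below, using the stated constants m+ = 0.9, m− = 0.1 and λ = 0.5; the batching and tensor shapes are assumptions.

    import torch

    def margin_loss(lengths, targets, m_pos=0.9, m_neg=0.1, lam=0.5):
        # lengths: (batch, num_classes) capsule vector lengths; targets: (batch,) class indices.
        T = torch.zeros_like(lengths).scatter_(1, targets.unsqueeze(1), 1.0)   # indicator T_e
        pos = T * torch.clamp(m_pos - lengths, min=0) ** 2
        neg = lam * (1 - T) * torch.clamp(lengths - m_neg, min=0) ** 2
        return (pos + neg).sum(dim=1).mean()     # sum over class capsules, average over the batch

    loss = margin_loss(torch.rand(8, 4), torch.randint(0, 4, (8,)))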
In one possible embodiment, the effect of the model proposed in the present application was evaluated by experiments on three common data sets.
The 3 common data sets include:
Subj: a subjectivity data set; the task is to classify a sentence as subjective or objective.
TREC: the TREC question data set; the task is to classify a question into 6 categories (about persons, locations, numerical information, etc.).
AG's news: news topic classification.
The model proposed in the present application was trained on the Subj, TREC and AG's news data sets as described in Table 1:
TABLE 1
Data set    C (number of classes)    l (sentence length)    train      test
Subj        2                        23                     9000       1000
TREC        6                        10                     5452       500
AG's        4                        233                    120000     7600
Using the trained model, the three public data sets (Subj, TREC, AG's news) were tested and the results shown in Table 2 were obtained:
TABLE 2
Data set    Accuracy
TREC        98.1%
AG's        91.4%
Subj        91.43%
Experiments were also performed on the three public data sets (Subj, TREC, AG's news) using other models, with results as shown in Table 3:
TABLE 3
[Table 3, reproduced as an image in the original publication, lists the classification accuracy of other models on the same data sets.]
Comparing the classification accuracy on the same common data sets with that of the models in Table 3 shows that the model provided by the present application achieves higher accuracy.
It should be noted that, in the experiments of the embodiments, the model proposed in the present application is not limited to classifying the three common data sets (Subj, TREC, AG's news); these three data sets are only one specific implementation in the embodiments of the application.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above-mentioned embodiments, objects, technical solutions and advantages of the present invention are further described in detail, it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made on the basis of the technical solutions of the present invention should be included in the scope of the present invention.

Claims (8)

1. A sentence encoding method using word information clustering, comprising:
mapping each word in the sentence sequence with the specific length into a word vector space, and acquiring a word vector of each word;
acquiring a coding vector of each word vector and performing non-linear squashing on each coding vector to obtain a capsule;
obtaining a plurality of capsules to form an original capsule layer, and extracting semantic information of words with specific semantic features from the original capsule layer by using a capsule protocol algorithm to form a first target capsule layer;
and performing information conversion on the first target capsules in the first target capsule layer by utilizing a capsule protocol algorithm to form a second target capsule layer whose number of capsules equals the number of classes.
2. The method of claim 1, wherein obtaining the coding vector of each word in the word vector space comprises: inputting the word vector of each word into a bi-directional LSTM (BiLSTM) model to obtain, respectively, the forward-propagated sentence sequence information h_i^f and the backward-propagated sequence information h_i^b, and then concatenating the two vectors to form the required coding vector h_i:

h_i^f = LSTM_f(x_i, h_{i-1}^f)
h_i^b = LSTM_b(x_i, h_{i+1}^b)
h_i = [h_i^f ; h_i^b]

Thus, the vector output formed by the BiLSTM encoding is:

H = [h_1, h_2, …, h_L].
3. The method of claim 1, wherein obtaining a plurality of capsules to form the original capsule layer comprises:

P = [p_1, p_2, …, p_L]
s_i = σ(w_s p_i + b_s)
k_i = tanh(w_k p_i + b_k)
u_i = s_i · k_i

where P denotes the set of original capsules formed by the coding layer, p_i denotes the i-th capsule of the original capsule layer, w_s denotes the contribution matrix parameter, b_s the bias parameter, and σ the sigmoid activation function; s_i = σ(w_s p_i + b_s) forms the supply gate of the original capsule i. w_k denotes the effective-value matrix of the original capsule and b_k the corresponding bias; k_i = tanh(w_k p_i + b_k) obtains the effective value of the original capsule i, and u_i = s_i · k_i forms the value that capsule i can contribute.
4. The method of claim 1, wherein the first target capsule comprises:

Y = [y_1, y_2, …, y_m]
n_j = σ(w_n y_j + b_n)
c_j = tanh(w_c y_j + b_c)
v_j = n_j · c_j

where Y denotes the set of first target capsules, y_j denotes the j-th capsule in the first target capsule layer, w_n denotes the demand matrix parameter, b_n the bias parameter, and σ the sigmoid activation function; n_j = σ(w_n y_j + b_n) forms the demand gate of the first target capsule j. w_c denotes the state matrix parameter and b_c the bias parameter; c_j = tanh(w_c y_j + b_c) forms the current state value of the first target capsule j, and v_j = n_j · c_j forms the content value required by capsule j in its current state.
5. The method of claim 1, wherein extracting semantic information of words with specific semantic features from the original capsule layer using a capsule protocol algorithm to form a first target capsule layer comprises:

f_ij = u_i · v_j
F_ij = Σ_d f_ij[d]
a_ij = softmax(F_ij)
y'_j = c_j + Σ_i a_ij · u_i

where f_ij = u_i · v_j represents the similarity relationship between the information the original capsule i can provide and the information the first target capsule j requires; F_ij represents the magnitude of the calculated similarity; F_ij is normalized by a softmax function to form a_ij, which represents the amount of information converted from the original capsule i to the first target capsule j; and the initialized state value c_j of the first target capsule j and the values absorbed from each capsule in the original capsule layer are added to form the new state value of the first target capsule.
6. The method of claim 1, further comprising:
representing the probability of the content characterized by each second target capsule by the vector length of the second target capsule; calculating the vector length of each second target capsule by using the L2 norm; and determining the final class of each second target capsule according to the vector length of each second target capsule.
7. The method of claim 1, further comprising: calculating the text loss of the capsules in the classification layer using a margin loss function.
8. The method of claim 7, wherein calculating the text loss of the classification-layer capsules using the margin loss function comprises:

L_e = T_e · max(0, m+ − ||v_e||)² + λ(1 − T_e) · max(0, ||v_e|| − m−)²

where L_e is the loss value of the e-th capsule of the classification layer; T_e is an indicator that takes the value 1 when the sample belongs to class e and 0 otherwise; m+ = 0.9, λ = 0.5 and m− = 0.1, with m+ the upper bound and m− the lower bound. The total loss is the sum of the losses of the individual capsules of the classification layer.
CN201911039124.XA 2019-10-29 2019-10-29 Sentence coding method using word information clustering Active CN110781304B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911039124.XA CN110781304B (en) 2019-10-29 2019-10-29 Sentence coding method using word information clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911039124.XA CN110781304B (en) 2019-10-29 2019-10-29 Sentence coding method using word information clustering

Publications (2)

Publication Number Publication Date
CN110781304A true CN110781304A (en) 2020-02-11
CN110781304B CN110781304B (en) 2023-09-26

Family

ID=69387404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911039124.XA Active CN110781304B (en) 2019-10-29 2019-10-29 Sentence coding method using word information clustering

Country Status (1)

Country Link
CN (1) CN110781304B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109241283A (en) * 2018-08-08 2019-01-18 广东工业大学 A kind of file classification method based on multi-angle capsule network
CN109410917A (en) * 2018-09-26 2019-03-01 河海大学常州校区 Voice data classification method based on modified capsule network
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A kind of text implication relation recognition methods for merging more granular informations
CN109410575A (en) * 2018-10-29 2019-03-01 北京航空航天大学 A kind of road network trend prediction method based on capsule network and the long Memory Neural Networks in short-term of nested type
CN110046249A (en) * 2019-03-11 2019-07-23 中国科学院深圳先进技术研究院 Training method, classification method, system, equipment and the storage medium of capsule network
CN110046671A (en) * 2019-04-24 2019-07-23 吉林大学 A kind of file classification method based on capsule network
CN110188195A (en) * 2019-04-29 2019-08-30 苏宁易购集团股份有限公司 A kind of text intension recognizing method, device and equipment based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GEOFFREY E. HINTON: "Dynamic Routing Between Capsules" *
YUSHI YAO: "Bi-directional LSTM Recurrent Neural Network for Chinese Word Segmentation" *

Also Published As

Publication number Publication date
CN110781304B (en) 2023-09-26

Similar Documents

Publication Publication Date Title
CN111368996B (en) Retraining projection network capable of transmitting natural language representation
US10824949B2 (en) Method and system for extracting information from graphs
CN110347835B (en) Text clustering method, electronic device and storage medium
US10824653B2 (en) Method and system for extracting information from graphs
EP3180742B1 (en) Generating and using a knowledge-enhanced model
CN107704625B (en) Method and device for field matching
CN109885824B (en) Hierarchical Chinese named entity recognition method, hierarchical Chinese named entity recognition device and readable storage medium
CN109471946B (en) Chinese text classification method and system
CN111221944B (en) Text intention recognition method, device, equipment and storage medium
CN110969020A (en) CNN and attention mechanism-based Chinese named entity identification method, system and medium
CN108108354B (en) Microblog user gender prediction method based on deep learning
CN105139237A (en) Information push method and apparatus
CN111602128A (en) Computer-implemented method and system for determining
CN112328742A (en) Training method and device based on artificial intelligence, computer equipment and storage medium
JP6738769B2 (en) Sentence pair classification device, sentence pair classification learning device, method, and program
CN112988963B (en) User intention prediction method, device, equipment and medium based on multi-flow nodes
CN111651986B (en) Event keyword extraction method, device, equipment and medium
CN111191457A (en) Natural language semantic recognition method and device, computer equipment and storage medium
CN113360654B (en) Text classification method, apparatus, electronic device and readable storage medium
CN110276396B (en) Image description generation method based on object saliency and cross-modal fusion features
WO2014073206A1 (en) Information-processing device and information-processing method
CN111145913A (en) Classification method, device and equipment based on multiple attention models
CN113705196A (en) Chinese open information extraction method and device based on graph neural network
CN111241843B (en) Semantic relation inference system and method based on composite neural network
CN113239668B (en) Keyword intelligent extraction method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant