CN112732907B - Financial public opinion analysis method based on multi-scale recurrent neural network - Google Patents

Financial public opinion analysis method based on a multi-scale recurrent neural network

Info

Publication number
CN112732907B
CN112732907B (application CN202011578594.6A)
Authority
CN
China
Prior art keywords
text
scale
neural network
financial
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011578594.6A
Other languages
Chinese (zh)
Other versions
CN112732907A (en)
Inventor
马千里 (Qianli Ma)
林镇溪 (Zhenxi Lin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology (SCUT)
Priority to CN202011578594.6A
Publication of CN112732907A
Application granted
Publication of CN112732907B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a financial public opinion analysis method based on a multi-scale recurrent neural network, which comprises the following steps: acquiring financial text data and preprocessing the data; sampling the preprocessed financial text data with a sliding window to obtain a subsequence at each time step, inputting the subsequence into a group recurrent neural network to extract local feature representations of the text sequence, and obtaining a salient feature representation of the text sequence through a max pooling operation; extracting different salient feature representations of the text sequence with several sliding windows of different scales, and concatenating them to obtain the multi-scale feature representation of the sequence; and inputting the multi-scale feature representation into a fully connected layer and a softmax layer for classification. The method samples text subsequences with sliding windows of different scales, models local phrase features of different scales with a group recurrent neural network, and fuses the features of different scales to obtain the semantic features of the text, thereby further improving the accuracy of financial public opinion analysis.

Description

Financial public opinion analysis method based on a multi-scale recurrent neural network
Technical Field
The invention relates to the technical field of financial public opinion analysis, and in particular to a financial public opinion analysis method based on a multi-scale recurrent neural network.
Background
With the rapid development of Internet technology, a vast amount of information is generated every day, and sifting and extracting this information is very important. In the financial field in particular, various financial texts reflect the sentiment of investors, and investor sentiment determines investor behavior, which in turn influences the trend of the whole market. By performing public opinion analysis on financial texts, the development trend of the financial market can be understood, which facilitates monitoring of the financial market and handling of abnormal stock prices. Therefore, public opinion analysis of financial texts is of great significance.
Traditional financial public opinion analysis methods are mainly based on sentiment dictionaries and machine learning. Sentiment dictionaries infer the sentiment polarity of a financial text from the numbers of positive and negative sentiment words it contains, while machine learning methods include bag-of-words models, naive Bayes, logistic regression and the like. However, traditional methods rely on hand-crafted features, are costly, and cannot fully model the semantic and multi-scale information of financial texts. Because neural networks can automatically extract features from text, many neural network-based methods are now applied to financial public opinion analysis, among which convolutional neural networks and long short-term memory networks are the most common and effective. A convolutional neural network can capture local contiguous phrase information in a financial text, but because the convolution operation is linear, it cannot sufficiently model discontinuous phrase structures in the text, such as expressions with emotional transitions. A long short-term memory network can effectively model the sequential information of a financial text; however, it is a biased model that favors information at the end of the text and cannot model the multi-scale information in the text. Moreover, because labeled financial public opinion datasets are limited while current models have relatively many parameters, overfitting and feature redundancy easily occur, which reduces the accuracy of public opinion analysis.
In general, financial public opinion is timely, subjective and widely spread. To sift out the core information, the key to financial public opinion analysis is to extract the key phrases in a text in order to understand the semantic information and emotional tendency it contains, and such phrases generally occur at different scales. To better model the semantic and multi-scale information of text, a more effective and timely financial public opinion analysis method is urgently needed.
Disclosure of Invention
The purpose of the invention is to overcome the defects in the prior art and provide a financial public opinion analysis method based on a multi-scale recurrent neural network.
The purpose of the invention can be achieved by adopting the following technical scheme:
a financial public opinion analysis method based on a multi-scale recurrent neural network comprises the following steps:
S1, acquiring financial text data, and preprocessing the financial text data to obtain a text sequence;
S2, sampling the text sequence obtained in step S1 with a sliding window to obtain a subsequence at each time step, inputting the subsequence into a group recurrent neural network GRNN to extract the local feature representation of the text sequence, and then obtaining the salient feature representation of the text sequence through a max pooling operation;
S3, extracting different salient feature representations of the text sequence with several sliding windows of different scales, and finally obtaining the multi-scale feature representation of the text sequence through a concatenation operation;
S4, inputting the multi-scale feature representation obtained in step S3 into a fully connected layer and a softmax layer for classification.
Further, the salient feature representation of the text sequence in step S2 is computed as follows:
S2.1, the semantic information of a text is generally conveyed by a few keywords or phrases. A conventional CNN can capture local phrases, but because convolution is a linear operation it has difficulty modeling discontinuous dependencies in the text; an RNN can model discontinuous dependencies, but as a biased model it tends to neglect the earlier context of the text. To better model the semantic features of local phrases in a sequence, a sliding window of size s is used to sample text subsequences, and a GRNN extracts a local feature representation at each position; this combines the local modeling ability of a CNN with the discontinuous-dependency modeling ability of an RNN.
Specifically, given an input text sequence $X = \{x_1, x_2, \dots, x_t, \dots, x_T\}$, where $T$ is the length of the text sequence, $x_t \in \mathbb{R}^{d_0}$ is the word input at time step $t$ ($t = 1, 2, \dots, T$), and $d_0$ is the input dimension of each word, the words of the s time steps ending at time step $t$ are sampled to form a text subsequence $X_t = \{x_{t-s+1}, \dots, x_t\}$. The subsequence $X_t$ is fed into a group recurrent neural network GRNN, whose recurrent structure captures the discontinuous dependencies within the subsequence. The GRNN is a recurrent neural network composed of K differently initialized long short-term memory (LSTM) networks; each LSTM network models different semantic features of the sequence, which helps resolve word ambiguity.
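For concreteness, the following is a minimal PyTorch sketch of the sliding-window sampling that produces the subsequence X_t at every time step. The tensor names are illustrative, and the zero left-padding of the first s-1 positions is an assumption, since the patent does not specify how boundary positions are handled.

```python
import torch
import torch.nn.functional as F

def sample_subsequences(x: torch.Tensor, s: int) -> torch.Tensor:
    """Sample the length-s subsequence X_t ending at every time step t.

    x: (batch, T, d0) word vectors of the text sequence.
    Returns: (batch, T, s, d0), where out[:, t] corresponds to {x_{t-s+1}, ..., x_t}.
    The first s-1 positions are left-padded with zero vectors (an assumption).
    """
    x_padded = F.pad(x, (0, 0, s - 1, 0))   # pad s-1 zero steps at the front of the time axis
    windows = x_padded.unfold(1, s, 1)      # (batch, T, d0, s)
    return windows.permute(0, 1, 3, 2).contiguous()
```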
The subsequence is input into the GRNN, and the hidden state output at its last time step is taken as the local feature representation $h_t$ of the t-th time step of the text sequence.
S2.2, the local feature representation of the t-th time step of the group recurrent neural network GRNN is obtained by concatenating the hidden state representations of the K long short-term memory networks:

$$h_t = [h_t^1; h_t^2; \dots; h_t^K] \in \mathbb{R}^{Kd}$$

where $h_t^k \in \mathbb{R}^d$ denotes the hidden state of the t-th time step produced by the k-th long short-term memory network, $k = 1, 2, \dots, K$, and $d$ is the dimension of each hidden state. $h_t^k$ is computed as follows:

$$i_t^k = \sigma(W_i^k x_t + U_i^k h_{t-1}^k + b_i^k)$$
$$f_t^k = \sigma(W_f^k x_t + U_f^k h_{t-1}^k + b_f^k)$$
$$o_t^k = \sigma(W_o^k x_t + U_o^k h_{t-1}^k + b_o^k)$$
$$g_t^k = \tanh(W_g^k x_t + U_g^k h_{t-1}^k + b_g^k)$$
$$c_t^k = f_t^k \odot c_{t-1}^k + i_t^k \odot g_t^k$$
$$h_t^k = o_t^k \odot \tanh(c_t^k)$$

where $i_t^k$, $f_t^k$ and $o_t^k$ are respectively the input gate, forget gate and output gate of the k-th long short-term memory network, $g_t^k$ and $c_t^k$ are respectively the candidate information currently added and the memory cell of the k-th network, $\sigma$ and $\tanh$ are nonlinear activation functions, $\odot$ denotes element-wise multiplication, and $W_\ast^k$, $U_\ast^k$ and $b_\ast^k$ are trainable parameters of the k-th long short-term memory network.
The local feature representations $h_t$ of all time steps obtained from the above formulas are concatenated to form a feature matrix $H \in \mathbb{R}^{T \times Kd}$:

$$H = [h_1; h_2; \dots; h_T]$$
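A minimal PyTorch sketch of the GRNN described above: K independently initialized LSTMs run over the same subsequence, and their final hidden states are concatenated into h_t. Module and variable names are illustrative, not taken from the patent.

```python
import torch
import torch.nn as nn

class GRNN(nn.Module):
    """Group recurrent neural network: K differently initialized LSTMs over one subsequence."""

    def __init__(self, d0: int, d: int, K: int):
        super().__init__()
        self.lstms = nn.ModuleList(
            nn.LSTM(input_size=d0, hidden_size=d, batch_first=True) for _ in range(K)
        )

    def forward(self, subseq: torch.Tensor) -> torch.Tensor:
        """subseq: (batch, s, d0) -> local feature h_t: (batch, K*d)."""
        finals = []
        for lstm in self.lstms:              # the K LSTMs are independent and could run in parallel
            _, (h_last, _) = lstm(subseq)    # h_last: (1, batch, d), hidden state of the last step
            finals.append(h_last.squeeze(0))
        return torch.cat(finals, dim=-1)     # concatenation of the K hidden states
```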
S2.3, max pooling is applied to the feature matrix $H$ along the time dimension to obtain the salient feature representation $F \in \mathbb{R}^{Kd}$ of the sequence:

$$F_i = \max(H_{1,i}, H_{2,i}, \dots, H_{T,i})$$

where $F_i$ denotes the value of the i-th dimension of the vector $F$, $H_{t,i}$ denotes the value of the i-th dimension of $h_t$, and $\max$ denotes the maximum operation. The salient feature $F$ represents the discriminative features that contribute most to classification, such as phrases or keywords carrying emotional expressions, while unimportant information is filtered out.
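Continuing the hypothetical helpers above, a short sketch of stacking the per-position local features into H and max-pooling over time to obtain the salient feature F.

```python
def salient_feature(grnn: "GRNN", windows: torch.Tensor) -> torch.Tensor:
    """windows: (batch, T, s, d0) from sample_subsequences -> F: (batch, K*d)."""
    T = windows.shape[1]
    # Local feature at every time step, stacked into the feature matrix H.
    H = torch.stack([grnn(windows[:, t]) for t in range(T)], dim=1)  # (batch, T, K*d)
    return H.max(dim=1).values                                       # max pooling over time
```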
Furthermore, the K long short-term memory networks in the group recurrent neural network GRNN can be computed in parallel, which reduces the running time.
Further, the multi-scale feature representation of the text sequence in step S3 is computed as follows:
Text naturally contains multi-scale information, for example phrases of different lengths. To extract the multi-scale information of a text sequence, M sliding windows of different scales are used to extract different salient feature representations of the sequence, where the scale of the m-th sliding window is $s_m$. The operation of step S2 is repeated, the salient feature obtained with the sliding window of the m-th scale is denoted $F^m$, $m = 1, 2, \dots, M$, and the multi-scale feature representation of the sequence is obtained by a concatenation operation:

$$\hat{F} = [F^1; F^2; \dots; F^M] \in \mathbb{R}^{M \cdot Kd}$$
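A sketch of the multi-scale combination, reusing the hypothetical GRNN, sample_subsequences and salient_feature defined in the earlier sketches: one GRNN branch per window scale, with the resulting salient features concatenated.

```python
class MultiScaleRNN(nn.Module):
    """One GRNN branch per sliding-window scale; outputs the concatenated salient features."""

    def __init__(self, d0: int, d: int, K: int, scales: list[int]):
        super().__init__()
        self.scales = scales
        self.branches = nn.ModuleList(GRNN(d0, d, K) for _ in scales)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """x: (batch, T, d0) -> multi-scale feature F_hat: (batch, M*K*d)."""
        feats = []
        for s_m, grnn in zip(self.scales, self.branches):
            windows = sample_subsequences(x, s_m)          # (batch, T, s_m, d0)
            feats.append(salient_feature(grnn, windows))   # (batch, K*d)
        return torch.cat(feats, dim=-1)                    # concatenate the M scales
```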
Further, the classification process in step S4 is as follows:
The multi-scale feature representation $\hat{F}$ obtained in step S3 is input into a fully connected layer and a softmax layer for classification:

$$\hat{y} = \operatorname{softmax}(\operatorname{ReLU}(W_f \hat{F} + b_f))$$

where $W_f$ is an affine transformation matrix composed of trainable parameters, $b_f$ is the bias term, ReLU and softmax are nonlinear activation functions, and $\hat{y}$ is the predicted distribution.
Training takes the loss $L$ as its objective, where $L$ is expressed as follows:

$$L = -\frac{1}{N}\sum_{n=1}^{N}\sum_{j=1}^{J} y_j^{(n)} \log \hat{y}_j^{(n)} + \lambda \sum_{m=1}^{M} L_p^{s_m}$$

where $N$ is the number of samples in the dataset, $J$ is the number of classes in the dataset, $M$ is the number of sliding windows of different scales, $y$ is the true distribution of a sample, $\hat{y}$ is the predicted distribution of the sample, $y_j^{(n)}$ is the j-th dimension of the distribution $y$ of the n-th sample, $\hat{y}_j^{(n)}$ is the j-th dimension of the distribution $\hat{y}$ of the n-th sample, $L_p^{s_m}$ is the penalty loss of scale $s_m$, and $\lambda$ is a hyper-parameter used to weigh the classification loss against the penalty loss.
Further, without additional constraints, different long short-term memory networks in the GRNN may learn similar feature representations, which causes feature redundancy. To increase the diversity of the features and avoid feature redundancy, the penalty loss $L_p^{s_m}$ of scale $s_m$ is:

$$L_p^{s_m} = \sum_{i=1}^{K}\sum_{\substack{j=1 \\ j \neq i}}^{K} \left\| (W_i^{s_m})^{\top} W_j^{s_m} - I \right\|_2$$

where $I$ is the identity matrix, $W_i^{s_m}$ and $W_j^{s_m}$ are respectively the trainable parameters of the i-th and j-th long short-term memory networks of the GRNN of scale $s_m$, and $\|\cdot\|_2$ denotes the 2-norm.
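A sketch of one plausible reading of the penalty term. The exact matrix expression is reconstructed from the components the patent names (the identity matrix, the parameter matrices of two different LSTMs, and a 2-norm), so both the formula and the choice of the input-to-hidden weight as "the trainable parameters" should be treated as assumptions.

```python
def grnn_penalty(lstms: nn.ModuleList) -> torch.Tensor:
    """Pairwise redundancy penalty over the input-to-hidden weight matrices of the K LSTMs."""
    weights = [lstm.weight_ih_l0 for lstm in lstms]   # one trainable matrix per LSTM (an assumption)
    eye = torch.eye(weights[0].shape[1], device=weights[0].device)
    penalty = weights[0].new_zeros(())
    for i in range(len(weights)):
        for j in range(len(weights)):
            if i != j:
                penalty = penalty + torch.norm(weights[i].T @ weights[j] - eye)
    return penalty
```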
Compared with the prior art, the invention has the following advantages and effects:
(1) The invention provides a novel multi-scale recurrent neural network for financial public opinion analysis, which combines the local phrase feature modeling ability of CNNs with the discontinuous-dependency modeling ability of RNNs. Compared with CNNs and RNNs, the method achieves better accuracy in financial text analysis.
(2) To better model the semantic and multi-scale information of financial texts, the method samples text subsequences with sliding windows of different scales, models local phrase features of different scales with the group recurrent neural network GRNN, and fuses the features of different scales to obtain the semantic features of the text.
Drawings
Fig. 1 is a flowchart of a method for financial public opinion analysis based on a multi-scale recurrent neural network disclosed in an embodiment of the present invention;
fig. 2 is a network structure diagram of the group recurrent neural network GRNN disclosed in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
As shown in fig. 1, this embodiment discloses a financial public opinion analysis method based on a multi-scale recurrent neural network, which includes the following steps:
S1, financial text data are acquired and preprocessed to obtain a text sequence. In practice, the data used are derived from the "SmoothNLP" public dataset, which contains approximately 20,000 financial news texts.
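A minimal sketch of the kind of preprocessing step S1 implies (tokenization, vocabulary lookup, padding to a fixed length). The tokenizer, vocabulary keys and padding length are assumptions, since the patent does not detail the preprocessing.

```python
import torch
import jieba  # a common Chinese tokenizer; its use here is an assumption, not stated in the patent

def preprocess(texts: list[str], vocab: dict[str, int], max_len: int = 100) -> torch.Tensor:
    """Tokenize each financial text and map it to a fixed-length index sequence."""
    rows = []
    for text in texts:
        ids = [vocab.get(tok, vocab["<unk>"]) for tok in jieba.lcut(text)]
        ids = ids[:max_len] + [vocab["<pad>"]] * max(0, max_len - len(ids))
        rows.append(ids)
    return torch.tensor(rows, dtype=torch.long)  # later mapped to word vectors by an embedding layer
```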
S2, the text sequence obtained in step S1 is sampled with a sliding window to obtain a subsequence at each time step; the subsequence is input into a group recurrent neural network GRNN to extract the local feature representation of the text sequence, and a max pooling operation then yields the salient feature representation of the text sequence. The specific process is as follows:
S2.1, the semantic information of a text is generally conveyed by a few keywords or phrases. A conventional CNN can capture local phrases, but because convolution is a linear operation it has difficulty modeling discontinuous dependencies in the text; an RNN can model discontinuous dependencies, but as a biased model it tends to neglect the earlier context of the text. To better model the semantic features of local phrases in a sequence, as shown in fig. 2, a sliding window of size 3 is used to sample text subsequences, and a GRNN extracts the local feature representation at each position; this combines the local modeling ability of a CNN with the discontinuous-dependency modeling ability of an RNN.
Specifically, as shown in fig. 2, given the input sequence $X = \{x_1, x_2, x_3, x_4\}$ for the example sentence "I happy", where $x_t$ is the 300-dimensional word vector input at time step $t$, the words of the 3 time steps ending at time step $t$ are sampled to form a subsequence $X_t = \{x_{t-2}, x_{t-1}, x_t\}$. The subsequence $X_t$ is fed into a group recurrent neural network GRNN, whose recurrent structure captures the discontinuous dependencies within the subsequence. The GRNN is a recurrent neural network composed of 4 differently initialized long short-term memory networks; each network models different semantic features of the sequence, which helps resolve word ambiguity.
The subsequence is input into the GRNN, and the hidden state output at its last time step is taken as the local feature representation $h_t$ of the t-th time step of the text sequence.
S2.2, the local feature representation of the t-th time step of the group recurrent neural network GRNN is obtained by concatenating the hidden state representations of the 4 long short-term memory networks:

$$h_t = [h_t^1; h_t^2; h_t^3; h_t^4] \in \mathbb{R}^{200}$$

where $h_t^k$ denotes the 50-dimensional hidden state of the t-th time step produced by the k-th long short-term memory network, $k = 1, 2, 3, 4$. $h_t^k$ is computed as follows:

$$i_t^k = \sigma(W_i^k x_t + U_i^k h_{t-1}^k + b_i^k)$$
$$f_t^k = \sigma(W_f^k x_t + U_f^k h_{t-1}^k + b_f^k)$$
$$o_t^k = \sigma(W_o^k x_t + U_o^k h_{t-1}^k + b_o^k)$$
$$g_t^k = \tanh(W_g^k x_t + U_g^k h_{t-1}^k + b_g^k)$$
$$c_t^k = f_t^k \odot c_{t-1}^k + i_t^k \odot g_t^k$$
$$h_t^k = o_t^k \odot \tanh(c_t^k)$$

where $i_t^k$, $f_t^k$ and $o_t^k$ are respectively the input gate, forget gate and output gate of the k-th long short-term memory network, $g_t^k$ and $c_t^k$ are respectively the candidate information currently added and the memory cell of the k-th network, $\sigma$ and $\tanh$ are nonlinear activation functions, $\odot$ denotes element-wise multiplication, and $W_\ast^k$, $U_\ast^k$ and $b_\ast^k$ are trainable parameters of the k-th long short-term memory network.
The local feature representations of the 4 time steps obtained from the above formulas are concatenated to form a feature matrix $H \in \mathbb{R}^{4 \times 200}$:

$$H = [h_1; h_2; h_3; h_4]$$
S2.3, max pooling is applied to the feature matrix $H$ along the time dimension to obtain the salient feature representation $F \in \mathbb{R}^{200}$ of the sequence:

$$F_i = \max(H_{1,i}, H_{2,i}, H_{3,i}, H_{4,i})$$

where $F_i$ denotes the value of the i-th dimension of the vector $F$, $H_{t,i}$ denotes the value of the i-th dimension of $h_t$, and $\max$ denotes the maximum operation. The salient feature $F$ represents the discriminative features that contribute most to classification, such as phrases or keywords carrying emotional expressions, while unimportant information is filtered out. For example, the emotional word "happy" in the sentence "I happy" is the most important information for classification.
The 4 long short-term memory networks in the group recurrent neural network GRNN can be computed in parallel, which shortens the running time.
S3, different salient feature representations of the text sequence are extracted with sliding windows of several different scales, and the multi-scale feature representation of the text sequence is finally obtained by concatenation. The specific process is as follows:
Text naturally contains multi-scale information, for example phrases of different lengths. To extract the multi-scale information of the text sequence, as shown in fig. 2, 2 sliding windows of different scales, with sizes 3 and 2 respectively, are used to extract different salient feature representations of the sequence. The operation of step S2 is repeated, the salient feature obtained with the sliding window of the m-th scale is denoted $F^m$, $m = 1, 2$, and the multi-scale feature representation of the sequence is obtained by a concatenation operation:

$$\hat{F} = [F^1; F^2] \in \mathbb{R}^{400}$$
S4, the multi-scale feature representation obtained in step S3 is input into a fully connected layer and a softmax layer for classification. The specific classification process is as follows:
The multi-scale feature representation $\hat{F}$ obtained in step S3 is input into a fully connected layer and a softmax layer for classification:

$$\hat{y} = \operatorname{softmax}(\operatorname{ReLU}(W_f \hat{F} + b_f))$$

where $W_f$ is an affine transformation matrix composed of trainable parameters, $b_f$ is the bias term, ReLU and softmax are nonlinear activation functions, and $\hat{y}$ is the predicted distribution.
Training takes the loss $L$ as its objective, where $L$ is expressed as follows:

$$L = -\frac{1}{N}\sum_{n=1}^{N}\sum_{j=1}^{J} y_j^{(n)} \log \hat{y}_j^{(n)} + \lambda \sum_{m=1}^{M} L_p^{s_m}$$

where $N$ is the number of samples in the dataset, $J$ is the number of classes in the dataset, $M = 2$ is the number of sliding windows of different scales, $y$ is the true distribution of a sample, $\hat{y}$ is the predicted distribution of the sample, $y_j^{(n)}$ is the j-th dimension of the distribution $y$ of the n-th sample, $\hat{y}_j^{(n)}$ is the j-th dimension of the distribution $\hat{y}$ of the n-th sample, $L_p^{s_m}$ is the penalty loss of scale $s_m$, and $\lambda = 0.001$ is a hyper-parameter used to weigh the classification loss against the penalty loss.
Without additional constraints, different long short-term memory networks in the GRNN may learn similar feature representations, which causes feature redundancy. To increase the diversity of the features and avoid feature redundancy, the penalty loss $L_p^{s_m}$ of scale $s_m$ is:

$$L_p^{s_m} = \sum_{i=1}^{K}\sum_{\substack{j=1 \\ j \neq i}}^{K} \left\| (W_i^{s_m})^{\top} W_j^{s_m} - I \right\|_2$$

where $I$ is the identity matrix, $W_i^{s_m}$ and $W_j^{s_m}$ are respectively the trainable parameters of the i-th and j-th long short-term memory networks of the GRNN of scale $s_m$, and $\|\cdot\|_2$ denotes the 2-norm.
In summary, this embodiment combines the advantages of CNNs and RNNs: subsequences are sampled with sliding windows, the group recurrent neural network GRNN encodes the subsequence features as local semantic representations of the text, and sliding windows of different scales extract multi-scale local phrase features, which helps the model understand the semantic representation of the text. Compared with CNNs and RNNs, the method achieves better accuracy in financial text analysis, which facilitates the screening and filtering of information and the analysis of future trends in financial markets.
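Putting the pieces together, the following is a sketch of how the embodiment's configuration (300-dimensional word vectors, 4 LSTMs of hidden size 50 per GRNN, window sizes 3 and 2, λ = 0.001) might be assembled from the hypothetical modules defined in the earlier sketches; the optimizer choice and the number of classes are assumptions.

```python
import torch

# Assemble the model with the embodiment's hyper-parameters (an illustrative sketch that reuses
# the hypothetical MultiScaleRNN, Classifier, grnn_penalty and total_loss defined earlier).
scales = [3, 2]                                   # the two window sizes of the embodiment
encoder = MultiScaleRNN(d0=300, d=50, K=4, scales=scales)
classifier = Classifier(feat_dim=len(scales) * 4 * 50, num_classes=3)  # 3 classes is an assumption
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()))

def train_step(x_embed: torch.Tensor, y_onehot: torch.Tensor, lam: float = 0.001) -> float:
    """x_embed: (batch, T, 300) word vectors; y_onehot: (batch, num_classes) true distribution."""
    y_hat = classifier(encoder(x_embed))
    penalties = [grnn_penalty(branch.lstms) for branch in encoder.branches]
    loss = total_loss(y_hat, y_onehot, penalties, lam)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```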
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (2)

1. A financial public opinion analysis method based on a multi-scale recurrent neural network, characterized by comprising the following steps:
S1, acquiring financial text data, and preprocessing the financial text data to obtain a text sequence;
S2, sampling the text sequence obtained in step S1 with a sliding window to obtain a subsequence at each time step, inputting the subsequence into a group recurrent neural network GRNN to extract the local feature representation of the text sequence, and then obtaining the salient feature representation of the text sequence through a max pooling operation;
wherein the salient feature representation of the text sequence in step S2 is computed as follows:
S2.1, given an input text sequence $X = \{x_1, x_2, \dots, x_t, \dots, x_T\}$, where $T$ is the length of the text sequence, $x_t \in \mathbb{R}^{d_0}$ is the word input at time step $t$ ($t = 1, 2, \dots, T$), and $d_0$ is the input dimension of each word, a sliding window of size s is used to extract the local feature representation at each position; the words of the s time steps ending at time step $t$ are sampled to form a subsequence $X_t = \{x_{t-s+1}, \dots, x_t\}$, and the subsequence $X_t$ is input into a group recurrent neural network GRNN, wherein the group recurrent neural network GRNN is a recurrent neural network composed of K differently initialized long short-term memory networks, and the hidden state output at the last time step of the subsequence is taken as the local feature representation $h_t$ of the t-th time step of the text sequence;
S2.2, the local feature representation of the t-th time step of the group recurrent neural network GRNN is obtained by concatenating the hidden state representations of the K long short-term memory networks:

$$h_t = [h_t^1; h_t^2; \dots; h_t^K] \in \mathbb{R}^{Kd}$$

wherein $h_t^k \in \mathbb{R}^d$ denotes the hidden state of the t-th time step produced by the k-th long short-term memory network, $k = 1, 2, \dots, K$, $d$ is the dimension of each hidden state, and $h_t^k$ is computed as follows:

$$i_t^k = \sigma(W_i^k x_t + U_i^k h_{t-1}^k + b_i^k)$$
$$f_t^k = \sigma(W_f^k x_t + U_f^k h_{t-1}^k + b_f^k)$$
$$o_t^k = \sigma(W_o^k x_t + U_o^k h_{t-1}^k + b_o^k)$$
$$g_t^k = \tanh(W_g^k x_t + U_g^k h_{t-1}^k + b_g^k)$$
$$c_t^k = f_t^k \odot c_{t-1}^k + i_t^k \odot g_t^k$$
$$h_t^k = o_t^k \odot \tanh(c_t^k)$$

wherein $i_t^k$, $f_t^k$ and $o_t^k$ are respectively the input gate, forget gate and output gate of the k-th long short-term memory network, $g_t^k$ and $c_t^k$ are respectively the candidate information currently added and the memory cell of the k-th network, $\sigma$ and $\tanh$ are nonlinear activation functions, $\odot$ denotes element-wise multiplication, and $W_\ast^k$, $U_\ast^k$ and $b_\ast^k$ are trainable parameters of the k-th long short-term memory network;
the local feature representations $h_t$ of all time steps obtained from the above formulas are concatenated to form a feature matrix $H \in \mathbb{R}^{T \times Kd}$:

$$H = [h_1; h_2; \dots; h_T]$$

S2.3, max pooling is applied to the feature matrix $H$ along the time dimension to obtain the salient feature representation $F \in \mathbb{R}^{Kd}$ of the sequence:

$$F_i = \max(H_{1,i}, H_{2,i}, \dots, H_{T,i})$$

wherein $F_i$ denotes the value of the i-th dimension of the vector $F$, $H_{t,i}$ denotes the value of the i-th dimension of $h_t$, and $\max$ denotes the maximum operation;
S3, extracting different salient feature representations of the text sequence with a plurality of sliding windows of different scales, and finally obtaining the multi-scale feature representation of the text sequence through a concatenation operation;
wherein the multi-scale feature representation of the text sequence in step S3 is computed as follows:
different salient feature representations of the sequence are extracted with M sliding windows of different scales, wherein the scale of the m-th sliding window is $s_m$; the operation of step S2 is repeated, the salient feature obtained with the sliding window of the m-th scale is denoted $F^m$, $m = 1, 2, \dots, M$, and the multi-scale feature representation of the sequence is obtained by a concatenation operation:

$$\hat{F} = [F^1; F^2; \dots; F^M]$$
S4, inputting the multi-scale feature representation obtained in step S3 into a fully connected layer and a softmax layer for classification, wherein the classification process in step S4 is as follows:
the multi-scale feature representation $\hat{F}$ obtained in step S3 is input into a fully connected layer and a softmax layer for classification:

$$\hat{y} = \operatorname{softmax}(\operatorname{ReLU}(W_f \hat{F} + b_f))$$

wherein $W_f$ is an affine transformation matrix composed of trainable parameters, $b_f$ is the bias term, ReLU and softmax are nonlinear activation functions, and $\hat{y}$ is the predicted distribution;
training takes the loss $L$ as its objective, wherein $L$ is expressed as follows:

$$L = -\frac{1}{N}\sum_{n=1}^{N}\sum_{j=1}^{J} y_j^{(n)} \log \hat{y}_j^{(n)} + \lambda \sum_{m=1}^{M} L_p^{s_m}$$

wherein $N$ is the number of samples in the dataset, $J$ is the number of classes in the dataset, $M$ is the number of sliding windows of different scales, $y$ is the true distribution of a sample, $\hat{y}$ is the predicted distribution of the sample, $y_j^{(n)}$ is the j-th dimension of the distribution $y$ of the n-th sample, $\hat{y}_j^{(n)}$ is the j-th dimension of the distribution $\hat{y}$ of the n-th sample, $L_p^{s_m}$ is the penalty loss of scale $s_m$, and $\lambda$ is a hyper-parameter used to weigh the classification loss against the penalty loss;
the penalty loss $L_p^{s_m}$ is expressed as follows:

$$L_p^{s_m} = \sum_{i=1}^{K}\sum_{\substack{j=1 \\ j \neq i}}^{K} \left\| (W_i^{s_m})^{\top} W_j^{s_m} - I \right\|_2$$

wherein $I$ is the identity matrix, $W_i^{s_m}$ and $W_j^{s_m}$ are respectively the trainable parameters of the i-th and j-th long short-term memory networks of the GRNN of scale $s_m$, and $\|\cdot\|_2$ denotes the 2-norm.
2. The method of claim 1, wherein the K long short-term memory networks in the GRNN are computed in parallel to reduce the running time.
CN202011578594.6A 2020-12-28 2020-12-28 Financial public opinion analysis method based on multi-scale recurrent neural network Active CN112732907B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011578594.6A CN112732907B (en) Financial public opinion analysis method based on multi-scale recurrent neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011578594.6A CN112732907B (en) Financial public opinion analysis method based on multi-scale recurrent neural network

Publications (2)

Publication Number Publication Date
CN112732907A CN112732907A (en) 2021-04-30
CN112732907B true CN112732907B (en) 2022-06-10

Family

ID=75606453

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011578594.6A Active CN112732907B (en) Financial public opinion analysis method based on multi-scale recurrent neural network

Country Status (1)

Country Link
CN (1) CN112732907B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830295A (en) * 2018-05-10 2018-11-16 South China University of Technology Multivariate time series classification method based on multi-time-scale echo state network
CN108717856A (en) * 2018-06-16 2018-10-30 Taizhou University Speech emotion recognition method based on multi-scale deep convolutional recurrent neural network
CN110083700A (en) * 2019-03-19 2019-08-02 Beijing Zhongxingtong Network Technology Co., Ltd. Enterprise public opinion sentiment classification method and system based on convolutional neural network
CN110189800A (en) * 2019-05-06 2019-08-30 Zhejiang University Soft-sensor modeling method for furnace oxygen content based on multi-granularity cascaded recurrent neural network
CN110705692A (en) * 2019-09-25 2020-01-17 Central South University Method for predicting product quality of an industrial nonlinear dynamic process using a long short-term memory network based on spatial and temporal attention

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Qianli Ma et al. Temporal Pyramid Recurrent Neural Network. Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020). 2020. *
Chen Enhuan. Multi-scale Recurrent Neural Networks for Sequence Data Modeling. China Master's Theses Full-text Database, 2020, (1): I140-335. *

Also Published As

Publication number Publication date
CN112732907A (en) 2021-04-30

Similar Documents

Publication Publication Date Title
CN109902293B (en) Text classification method based on local and global mutual attention mechanism
CN108628823B (en) Named entity recognition method combining attention mechanism and multi-task collaborative training
CN111401061A (en) Method for identifying news opinion involved in case based on BERT and BiLSTM-Attention
CN110263325B (en) Chinese word segmentation system
CN110472042B (en) Fine-grained emotion classification method
Zhang et al. Sentiment Classification Based on Piecewise Pooling Convolutional Neural Network.
CN110287323B (en) Target-oriented emotion classification method
CN110929034A (en) Commodity comment fine-grained emotion classification method based on improved LSTM
CN113591483A (en) Document-level event argument extraction method based on sequence labeling
CN109214006B (en) Natural language reasoning method for image enhanced hierarchical semantic representation
CN111382565A (en) Multi-label-based emotion-reason pair extraction method and system
CN111984791B (en) Attention mechanism-based long text classification method
CN112231478B (en) Aspect-level emotion classification method based on BERT and multi-layer attention mechanism
CN115906816A (en) Text emotion analysis method of two-channel Attention model based on Bert
CN114357167B (en) Bi-LSTM-GCN-based multi-label text classification method and system
CN113239694B (en) Argument role identification method based on argument phrase
CN114648029A (en) Electric power field named entity identification method based on BiLSTM-CRF model
CN111723572B (en) Chinese short text correlation measurement method based on CNN convolutional layer and BiLSTM
Liu et al. Research on advertising content recognition based on convolutional neural network and recurrent neural network
CN117216265A (en) Improved graph annotation meaning network news topic classification method
CN112732907B (en) Financial public opinion analysis method based on multi-scale recurrent neural network
Wakchaure et al. A scheme of answer selection in community question answering using machine learning techniques
CN115510230A (en) Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism
Zhu et al. Attention based BiLSTM-MCNN for sentiment analysis
CN114357166A (en) Text classification method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant