CN111639252A

CN111639252A - False news identification method based on news-comment relevance analysis

Info

Publication number: CN111639252A
Application number: CN202010420460.5A
Authority: CN
Inventors: 李玉华; 张文杰; 李瑞轩; 辜希武
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2020-05-18
Filing date: 2020-05-18
Publication date: 2020-09-08

Abstract

The invention belongs to the field of news detection and in particular relates to a false news identification method based on news-comment relevance analysis. The method comprises: constructing a two-dimensional news feature matrix from the content of each text clause in the news, constructing a one-dimensional feature vector for each comment from its content, and constructing a plurality of comment trees with each initial comment as a root node and each reply comment as a child node; combining the feature vector of each node in each comment tree with the context-associated feature vector of its parent node, computing the context-associated feature vectors of all leaf nodes in the comment tree, and weighting them to obtain the comment tree feature vector, the feature vectors of all comment trees forming a two-dimensional comment feature matrix; and matching the relevance between the news feature matrix and the comment feature matrix to obtain a news feature vector and a comment feature vector, from which the authenticity of the news is judged. The method makes full use of both the news text and the information generated as the news spreads, achieves high accuracy, and is suitable for large-scale social networks.

Description

False news identification method based on news-comment relevance analysis

Technical Field

The invention belongs to the field of news detection, and particularly relates to a false news identification method based on news-comment relevance analysis.

Background

The rapid development of network technology has steadily lowered the cost of acquiring information, and its ubiquity has laid the foundation for the rise of social networks. Users can easily obtain and publish information on social networks, and this convenience lowers the threshold for producing and spreading false news. When information is not disclosed in time, false news can spread rapidly through social networks and cause severe public-opinion pressure and social panic. Because false news seriously harms the social network environment and creates collective anxiety, effectively identifying false news in social networks is a problem that needs to be solved.

The identification of false news has primarily been directed at the news text and takes two main forms: (1) extracting the knowledge stated in the news and comparing it with a knowledge base; and (2) parsing the syntax of the sentences and judging whether the wording frequently contains uncertain descriptions. With the rise of social networks, how to make reasonable use of social network information to improve news authenticity identification has become a central question. Recent methods have therefore begun to emphasize the propagation process or the comment text: (1) the propagation process is analyzed at the macroscopic and microscopic levels, and news authenticity is inferred from the propagation scale; (2) the quality of users in the propagation network is rated according to the credibility of the users along the propagation path, and the authenticity of the news is judged from that rating; (3) the authenticity of the news is analyzed according to the degree of conflict among the opinions in the comments. When intense discussion is triggered, conflicting opinions give people sufficient reason to doubt the truth of the information, and simulating the way humans come to understand the information in this manner achieves a certain effect.

However, existing methods focus either only on the news text or only on the propagation process. Methods that rely too heavily on news content adapt poorly to entirely new domains for which little knowledge is currently available. Social bots interfere with the construction of the propagation network, and the exposure they add amplifies the sharing behavior of users across the whole network, so methods that discard the news text and attend only to the propagation process also have limitations.

Disclosure of Invention

The invention provides a false news identification method based on news-comment relevance analysis, which solves the technical problem of low identification precision in existing false news identification caused by focusing one-sidedly on either the news text or network propagation.

The technical scheme for solving the technical problems is as follows: a false news identification method based on news-comment relevance analysis comprises the following steps:

S1, constructing a news feature matrix based on the content of the news to be identified, and constructing a feature vector of each comment based on the content of each comment on the news to be identified; meanwhile, according to the reply relations among the comments, constructing a plurality of comment trees by taking each initial comment as a root node and each reply comment as a child node;

S2, combining the feature vector of each node in each comment tree with the context-associated feature vector of its parent node, obtaining the context-associated feature vectors of all leaf nodes of the comment tree through recursive calculation, and performing a weighted calculation over them to obtain the feature vector of the comment tree;

S3, matching the relevance between the news feature matrix and the feature vectors of all comment trees to obtain comment-aware attention weights over the news clauses, with which the vectors of all text clauses in the news feature matrix are weighted to obtain a news feature vector, and news-aware attention weights over the comment trees, with which the feature vectors of all comment trees are weighted to obtain a comment feature vector; and judging the authenticity of the news based on the news feature vector and the comment feature vector.

The invention has the beneficial effects that: the method uses the discussion-provoking content of the news, together with the comment information, as the key content for identifying news authenticity, and infers the authenticity of the news text from the degree to which the core viewpoints of the news match the comment information. A comment tree is constructed for each initial comment, with the initial comment as the root node and each reply comment as a child node. Because each comment depends on the context information contained in its parent node, the feature vector of each node in a comment tree is combined with the context-associated feature vector of its parent node to compute the context-associated feature vector of that node. Because each leaf node represents the end of a discussion, a weighted calculation is performed over the context-associated feature vectors of all leaf nodes in each comment tree to obtain a one-dimensional feature vector for the comment tree (that is, for each initial comment). The feature vector obtained in this way fully fuses the key information of the discussion, so information utilization is high and the accuracy of the news judgment is supported. In addition, the method matches the relevance between the news feature matrix and the feature vectors of all comment trees, generating comment-aware attention weights over the news clauses and news-aware attention weights over the comment trees, so that the resulting news feature vector and comment feature vector can be used effectively for news identification.

The method overcomes the one-sided focus of the prior art on either the news text or network propagation, can incorporate the key information in the comments, in particular the additional key information introduced during reply discussions, achieves high news judgment accuracy, and can be applied to false news identification in large-scale social networks.

On the basis of the technical scheme, the invention can be further improved as follows.

Further, the method for constructing the news feature matrix specifically comprises the following steps:

acquiring the text content of the news to be identified, splitting it into clauses, segmenting each clause into words, and converting the segmented words into word vectors; converting all the word vectors into hidden state vectors that carry associated context information by using a recurrent neural network; and weighting all the hidden state vectors of each clause by using an attention mechanism so that the clause is represented as a one-dimensional feature vector, the feature vectors of all the clauses forming the two-dimensional news feature matrix of the news to be identified.

The invention has the further beneficial effects that: the recurrent neural network retains context information iteratively, so that words can be associated with one another. In semantic understanding, different pieces of information in a text sequence have different degrees of influence. The attention mechanism can examine a long text sequence from different angles, find the most critical information in it, and assign that information a higher weight, so that it plays a larger role in the subsequent representation vectors. Using the recurrent neural network together with the attention mechanism therefore captures the information expressed in the text more accurately and improves the prediction performance of the model.
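The attention weighting of hidden states described above can be illustrated with a minimal sketch. It assumes that each hidden state is scored by its dot product with a query vector (`context` below, a hypothetical stand-in for a learned parameter) and that the scores are normalized with a softmax; the source does not specify the scoring function, so this is one common choice rather than the patented formula. The toy hidden states stand in for the outputs of the recurrent neural network.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, context):
    """Weight hidden states by their similarity to a context vector and sum them.

    hidden_states: list of equal-length vectors, one per word.
    context: hypothetical learned query vector of the same length (an assumption).
    """
    scores = [sum(h_k * c_k for h_k, c_k in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(context)
    pooled = [sum(w * h[k] for w, h in zip(weights, hidden_states)) for k in range(dim)]
    return weights, pooled

# Three toy word-level hidden states for one clause.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, clause_vector = attention_pool(hidden, context=[2.0, 0.0])
```

Words whose hidden states align with the query receive larger weights, so they contribute more to the one-dimensional clause vector. The same pooling applies to a comment treated as a single sentence.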

Further, the constructing of the one-dimensional feature vector of each comment based on the content of each comment of the news to be identified specifically includes:

acquiring text content of each comment, segmenting the text content into words, and performing word vector conversion on the segmented words; converting all the word vectors into hidden state vectors of associated context information by adopting a recurrent neural network; all the hidden state vectors are weighted by an attention mechanism, and the comment is expressed as a one-dimensional feature vector.

The invention has the further beneficial effects that: because the comment information is short relative to the news text, sentence-level splitting is not performed any more, and the comments are directly regarded as a sentence, so that the comment text is converted into vector representation for subsequent association of news and comments.

Further, in S1, all the recurrent neural networks are bidirectional long-short term memory networks.

The invention has the further beneficial effects that: the bidirectional long short-term memory network acquires context information effectively, can selectively remember and selectively forget, and better retains key context information over long distances. When the model is trained on long input text sequences, the long short-term memory network mitigates the vanishing gradient problem and achieves a better training effect, so that the method is suitable for false news identification in large-scale social networks.

Further, in S2, a gated recurrent unit (GRU) is used to obtain the context-associated feature vectors of all leaf nodes through recursive computation.

The invention has the further beneficial effects that: compared with other recurrent neural network methods, the gated recurrent unit uses a reset gate and an update gate to mitigate the vanishing gradient problem during model training when the tree structure is deep, that is, when the volume of discussion is large, so that the method is suitable for false news identification in large-scale social networks. Meanwhile, using only two gates captures the useful discussion information in the comment tree while reducing the number of model parameters and improving training speed.

Further, in S2, the feature vector construction method of each comment tree is as follows:

based on a gated recurrent unit, combining, from top to bottom in each comment tree, the feature vector of the current node with the hidden state vector of its parent node, calculating for the node a reset gate used to retain part of the hidden state information of the parent node and an update gate used to adjust the proportion of the parent node's hidden state information that is retained, and calculating the hidden state vectors of all the nodes in the comment tree through recursive processing; and processing the hidden state vectors of all leaf nodes of the comment tree by using a pooling method to obtain the feature vector of the comment tree.

The invention has the further beneficial effects that: the reset gate retains part of the hidden state information of the parent node, and the update gate adjusts the proportion of that information which is retained. These two gates control the degree to which each node is fused with its parent node, so that a more reasonable and accurate context-associated feature vector is calculated for each node. Performing the normalized weighting with a pooling method keeps the method simple and convenient.

Further, the reset gate $r_i$ is calculated as $r_i = \sigma(W_r c^i + U_r h^{p(i)})$, and the update gate $z_i$ is calculated as $z_i = \sigma(W_z c^i + U_z h^{p(i)})$, where $c^i$ is the feature vector of the i-th node, $W_r$ and $W_z$ are parameter matrices, $U_r$ and $U_z$ are parameter vectors, $\sigma$ is the activation function, and $h^{p(i)}$ is the hidden state vector of the parent node of the i-th node.

Further, S3 includes:

matching the relevance between the news characteristic matrix and the comment characteristic matrix by adopting a collaborative attention network to construct a similarity matrix, wherein the comment characteristic matrix is formed by characteristic vectors of all comment trees;

using a similarity matrix to correlate the news characteristic matrix with the comment characteristic matrix so as to update the news characteristic matrix and the comment characteristic matrix, and obtaining a new news characteristic matrix fused with comment information and a new comment characteristic matrix fused with news information;

calculating to obtain a collaborative attention weight among news clauses based on the new news characteristic matrix, and calculating to obtain a collaborative attention weight among comment trees based on the new comment characteristic matrix;

weighting vectors corresponding to all text clauses in the news characteristic matrix before updating by adopting the cooperative attention weight among news clauses to obtain a news characteristic vector, and weighting the characteristic vectors of all comment trees in the comment characteristic matrix before updating by adopting the cooperative attention weight among comment trees to obtain a comment characteristic vector;

and fully connecting the news characteristic vector with the comment characteristic vector to judge the authenticity of the news.

The invention has the further beneficial effects that: a collaborative attention network is used to correlate the two matrices and thereby calculate the collaborative attention weights over the news clauses fused with the comments and over the comment trees fused with the news, so that reliability is high.

Further, the news feature matrix is updated as $H^s = \tanh(W_s S + (W_c C) F)$, and the comment feature matrix is updated as $H^c = \tanh(W_c C + (W_s S) F^T)$, where $H^s$ is the new news feature matrix after updating, $H^c$ is the new comment feature matrix after updating, $S$ is the news feature matrix before updating, $C$ is the comment feature matrix before updating, $F$ is the similarity matrix, and $W_c$ and $W_s$ are parameter matrices.
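The two update formulas can be traced with a small numerical sketch. Several pieces below are assumptions that the text does not specify: the similarity matrix is taken as F = tanh(C^T W_l S) with a hypothetical parameter matrix `W_l`, the attention weights are taken as a softmax over a projection of each updated matrix by hypothetical vectors `w_hs` and `w_hc`, and S and C store one column per news clause and per comment tree. Only the two tanh updates follow the formulas stated above.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def tanh(m):
    return [[math.tanh(x) for x in row] for row in m]

def softmax(v):
    peak = max(v)
    exps = [math.exp(x - peak) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def co_attention(S, C, W_s, W_c, W_l, w_hs, w_hc):
    """S is d x n (one column per clause); C is d x p (one column per comment tree)."""
    F = tanh(matmul(matmul(transpose(C), W_l), S))  # p x n, assumed similarity form
    H_s = tanh(add(matmul(W_s, S), matmul(matmul(W_c, C), F)))             # k x n
    H_c = tanh(add(matmul(W_c, C), matmul(matmul(W_s, S), transpose(F))))  # k x p
    a_s = softmax(matmul([w_hs], H_s)[0])  # weights over clauses (assumed scoring)
    a_c = softmax(matmul([w_hc], H_c)[0])  # weights over comment trees (assumed scoring)
    # The weights are applied to the matrices as they were before updating.
    news_vec = [sum(w * x for w, x in zip(a_s, row)) for row in S]
    comment_vec = [sum(w * x for w, x in zip(a_c, row)) for row in C]
    return a_s, a_c, news_vec, comment_vec

identity = [[1.0, 0.0], [0.0, 1.0]]
S = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # d = 2 features, n = 3 clauses
C = [[1.0, 0.0], [0.0, 1.0]]            # d = 2 features, p = 2 comment trees
a_s, a_c, news_vec, comment_vec = co_attention(
    S, C, identity, identity, identity, w_hs=[1.0, 0.0], w_hc=[0.0, 1.0])
```

The resulting `news_vec` and `comment_vec` are the two vectors that would be concatenated and passed to the fully connected layer for the authenticity judgment.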

The present invention also provides a machine-readable storage medium having stored thereon machine-executable instructions that, when invoked and executed by a processor, cause the processor to implement any of the above-described false news identification methods based on news-comment relevance analysis.

Drawings

Fig. 1 is a flowchart of a false news identification method based on news-comment relevance analysis according to an embodiment of the present invention;

fig. 2 is a schematic diagram of false news identification based on news-comment relevance analysis according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

Example one

A false news identification method 100 based on news-comment relevance analysis, as shown in fig. 1, includes:

step 110, constructing a news characteristic matrix based on the content of news to be identified, and constructing a characteristic vector of each comment based on the content of each comment of the news to be identified; meanwhile, according to the reply relation among the comments, constructing a plurality of comment trees by taking each initial comment as a root node and each reply comment as a child node;

step 120, combining the feature vector of each node in each comment tree with the context-associated feature vector of its parent node, obtaining the context-associated feature vectors of all leaf nodes of the comment tree through recursive calculation, and performing a weighted calculation over them to obtain the feature vector of the comment tree;

and step 130, matching the relevance between the news characteristic matrix and the characteristic vectors of all the comment trees to obtain attention weights between the news clauses considering comments, weighting the vectors corresponding to all the text clauses in the news characteristic matrix to obtain news characteristic vectors, obtaining attention weights between the comment trees considering news, weighting the characteristic vectors of all the comment trees to obtain comment characteristic vectors, and judging the authenticity of news based on the news characteristic vectors and the comment characteristic vectors.

In the method, a comment tree is constructed for each initial comment, with the initial comment as the root node and each reply comment as a child node. Because each comment depends on the context information contained in its parent node, the feature vector of each node in a comment tree is combined with the context-associated feature vector of its parent node to compute the context-associated feature vector of that node. Because each leaf node represents the end of a discussion, a weighted calculation is performed over the context-associated feature vectors of all leaf nodes in each comment tree to obtain a one-dimensional feature vector for the comment tree (that is, for each initial comment). The feature vector obtained in this way fully fuses the key information of the discussion, so information utilization is high and the accuracy of the news judgment is supported. In addition, the method matches the relevance between the news feature matrix and the feature vectors of all comment trees, generating comment-aware attention weights over the news clauses and news-aware attention weights over the comment trees, so that the resulting news feature vector and comment feature vector can be used effectively for news identification.

The method is therefore a novel false news identification method for social networks and comprises five processes: data collection and processing, news text processing, comment text processing, news-comment collaborative processing, and relevance result analysis. It judges authenticity from a new angle, namely the similarity between the key content of the news and that of the comments, makes full use of both the news text and the information generated as the news spreads through the social network, and overcomes the one-sided focus of the prior art on either the news text or network propagation. The method alleviates the one-sidedness caused by relying too heavily on the news text, supports authenticity judgment with the key information in the comments, in particular the additional key information introduced during reply discussions, can be applied to false news identification in large-scale social networks, and addresses the difficulty of verifying news content automatically.

Preferably, in step 110, a recurrent neural network and an attention mechanism are used to construct a two-dimensional news feature matrix from the content of each text clause in the news to be identified and a one-dimensional feature vector of each comment from the content of that comment. In step 120, a recurrent neural network is used to combine the feature vector of the current node in each comment tree with the hidden state vector of its parent node to calculate the hidden state vector of the current node, and the hidden state vectors of all leaf nodes of the comment tree are pooled to obtain the feature vector of the comment tree. In step 130, a collaborative attention network is used to match the relevance between the news feature matrix and the comment feature matrix composed of the feature vectors of all comment trees, so as to obtain the collaborative attention weights over the news clauses and over the comment trees.

The method first obtains the news text content and a vector representation of the whole news text. Specifically, a recurrent neural network and an attention mechanism are first applied to the word-level vectors of the news text to obtain a feature representation of each sentence. A recurrent neural network is then applied to the sentence-level feature vectors, so that each sentence acquires context information from the surrounding sentences. By using this hierarchical attention model, the key feature information in the news text is converted into a feature vector representation of the text.

In addition, comment text content is obtained, and for feature vector representation of the comment text, a word-level recurrent neural network and an attention mechanism are used to obtain feature representation of each comment. The comments have relevance with each other, a tree-shaped comment structure (namely a comment tree) is constructed according to the reply relation of the comments, and reply information and replied information are related through the tree structure, so that the context information of each comment can be more fully understood.
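The construction of comment trees from reply relations can be sketched as follows. The record layout is an assumption for illustration: each comment is a hypothetical `(comment_id, parent_id)` pair, with `parent_id` set to `None` for an initial comment posted directly under the news item.

```python
from collections import defaultdict

def build_comment_trees(comments):
    """Group comments into trees rooted at each initial comment.

    comments: list of (comment_id, parent_id) pairs; parent_id is None for an
    initial comment. Returns (roots, children), where children maps a comment
    id to the ids of its direct replies.
    """
    children = defaultdict(list)
    roots = []
    for comment_id, parent_id in comments:
        if parent_id is None:
            roots.append(comment_id)
        else:
            children[parent_id].append(comment_id)
    return roots, dict(children)

def leaves(root, children):
    """Return the leaf comments of one tree; each marks the end of a discussion."""
    stack, found = [root], []
    while stack:
        node = stack.pop()
        replies = children.get(node, [])
        if replies:
            stack.extend(reversed(replies))  # visit replies in posting order
        else:
            found.append(node)
    return found

records = [("a", None), ("b", "a"), ("c", "a"), ("d", "b"), ("e", None)]
roots, children = build_comment_trees(records)
```

Here comments `a` and `e` each start a tree, and an initial comment without replies, such as `e`, is both root and leaf of its own tree.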

A tree neural network is used to obtain the vector representation of each comment tree. Specifically, the initial comment in each comment tree is taken as the root node, and the replies to each node are taken as child nodes of that node. Because the comments form a tree, each comment depends on the context information contained in its parent node, and each leaf node represents the end of one discussion. The information in the comment tree is processed top-down with a recurrent neural network: the hidden state vector $h^{p(i)}$ of the parent node is combined with the comment information $c^i$ of the current node (that is, the one-dimensional feature vector of the comment) to calculate the hidden state vector $h^i$ of the current node.

The feature vector representation of the news text and the vector representations of the comment trees are input into the collaborative attention network. The collaborative attention network combines the relevance between the comments and the news text to generate collaborative attention weights over the text sentences of the news, with which the news text is weighted. It likewise generates weights over the comment trees, with which the comment trees are weighted. A news text-comment tree guide vector is constructed from the correlation between the news text and the comment trees and is input into a fully connected layer to judge the authenticity label of the news.

Both the news text and the comment information mentioned above need to be represented as vectors. The text of the relevant domain is split into individual words by a word segmentation tool. The words are sorted by frequency of occurrence, and vocabulary-to-index and index-to-vocabulary mappings are constructed. A co-occurrence matrix is built from the words and their positions within a context window, and the word vector representation w is obtained through iterative training based on the similarity between the words and the co-occurrence matrix. This pre-training expresses the relevance and similarity among words in vector form, so that the word vectors capture certain semantic features and the vocabulary information can be used more conveniently through vector operations.
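The vocabulary mappings and co-occurrence counts described above can be sketched as follows. The input is assumed to be already segmented into words, the alphabetical tie-break and the symmetric window are illustrative choices, and the iterative training that turns the counts into word vectors is not shown.

```python
from collections import Counter

def build_vocab(tokenized_docs):
    """Sort words by descending frequency and build both index mappings."""
    counts = Counter(word for doc in tokenized_docs for word in doc)
    # Ties are broken alphabetically so that the mapping is deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    word_to_index = {word: i for i, word in enumerate(ordered)}
    index_to_word = {i: word for word, i in word_to_index.items()}
    return word_to_index, index_to_word

def cooccurrence(tokenized_docs, word_to_index, window=2):
    """Count how often two words appear within `window` positions of each other."""
    counts = Counter()
    for doc in tokenized_docs:
        ids = [word_to_index[w] for w in doc]
        for pos, center in enumerate(ids):
            for other in ids[max(0, pos - window):pos]:
                counts[(center, other)] += 1
                counts[(other, center)] += 1
    return counts

docs = [["news", "is", "fake"], ["news", "is", "real"]]
word_to_index, index_to_word = build_vocab(docs)
pairs = cooccurrence(docs, word_to_index, window=1)
```

The sparse `pairs` counter plays the role of the co-occurrence matrix: entry `(i, j)` holds the number of times words `i` and `j` fall inside the same context window.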

It should be noted that the news text information is mainly the content of the main body part of the news, and the hyperlinks mentioned in the text need to be replaced uniformly during processing. The news comment information is obtained by searching news titles in the social network to obtain text information of relevant social network comment content, and then obtaining a tree structure of comments through a mutual reply process among the comments, wherein the tree structure contains certain information of a propagation network.
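The uniform replacement of hyperlinks can be sketched with a regular expression. The placeholder token `<URL>` and the pattern, which matches only `http` and `https` links, are assumptions; the source does not say what the links are replaced with.

```python
import re

# Hypothetical placeholder token; the source does not name one.
URL_TOKEN = "<URL>"
# Assumed pattern: an http(s) scheme followed by any run of non-space characters.
URL_PATTERN = re.compile(r"https?://\S+")

def replace_hyperlinks(text):
    """Replace every http(s) link in the text with one uniform placeholder."""
    return URL_PATTERN.sub(URL_TOKEN, text)

cleaned = replace_hyperlinks(
    "Details at https://example.com/a?b=1 and http://t.co/xyz today")
```

Mapping every link to the same token keeps the vocabulary small, since individual URLs would otherwise each become a rare word.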

Preferably, the method for constructing the news feature matrix specifically comprises the following steps:

Specifically, as shown in fig. 2, the news text is split into clauses at punctuation marks, each clause is split into individual words by a word segmentation tool, and the segmented words are converted into word vectors. All the vectors are concatenated to obtain a vector matrix S. The news text S is composed of n clauses, and $s_i$ represents the i-th clause in the news text. Each clause $s_i$ is composed of m words, where $w_j$ represents the j-th word vector in the clause, so that the news text is converted into a three-dimensional vector representation.

The three-dimensional vector is input into a bidirectional long short-term memory network to obtain the hidden state of each word. The hidden state of the j-th word of the i-th clause is composed of the outputs of the forward and backward long short-term memory networks and represents the associated context information of that word within the clause. Word attention weights are calculated, and the hidden states of all words in a clause are combined with these weights to obtain the vector representation of the clause.

The vector representations of all clauses in the text are then input into a bidirectional long short-term memory network to obtain the hidden state of each clause. The hidden state $s^i$ of the i-th clause is likewise composed of the outputs of the forward and backward long short-term memory networks and represents the associated context information of that clause within the text.
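The clause and word splitting that precedes the word vector conversion can be sketched as follows. The delimiter set is an assumption, and whitespace splitting is only a stand-in for the word segmentation tool, which the source does not name.

```python
import re

# Assumed clause delimiters: common Chinese and Western sentence punctuation.
CLAUSE_DELIMITERS = re.compile(r"[。！？；!?.;]+")

def split_clauses(text):
    """Split a news text into non-empty clauses at punctuation marks."""
    return [part.strip() for part in CLAUSE_DELIMITERS.split(text) if part.strip()]

def tokenize(clause):
    """Stand-in for a word segmentation tool: split on whitespace."""
    return clause.split()

def to_nested_tokens(text):
    """Return a clause-by-word nested list, later filled in with word vectors."""
    return [tokenize(clause) for clause in split_clauses(text)]

nested = to_nested_tokens("The mayor resigned today. Is this true? Sources disagree!")
```

Replacing each word in `nested` with its word vector gives the clause-by-word-by-dimension layout that the text calls the three-dimensional vector representation. Splitting on every period also breaks decimals and URLs, which is one reason hyperlinks are replaced before this step.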

Preferably, the one-dimensional feature vector of each comment of the news to be identified is constructed according to the content of the comment, and specifically includes:

Because each comment and its replies form a comment tree, the tree structure associates each reply with the comment it replies to, so that the context information of each comment can be understood more fully. On this basis, each comment is split into individual words by a word segmentation tool, and the segmented words are converted into word vectors. A comment tree is written as $t = c_1 \odot c_2 \odot \dots \odot c_i \odot \dots \odot c_p$, where t denotes a comment tree composed of p comments, $\odot$ represents the association operation that builds the comment tree, and $c_i$ represents the i-th comment in the comment tree. Each comment $c_i$ is composed of q words, where $w_j$ represents the j-th word vector in the comment. Because a comment is short compared with the news text, it is not split at the sentence level, and the comment text is converted directly into a vector representation.

The comment vectors are input into a bidirectional long short-term memory network to obtain the hidden state of each word. The hidden state of the j-th word of the i-th comment is composed of the outputs of the forward and backward long short-term memory networks and represents the associated context information of that word within the comment. Word attention weights are calculated, and the hidden states of all words in the comment are combined with these weights to obtain the vector representation of the i-th comment.

Preferably, in step 110, all recurrent neural networks are bidirectional long short-term memory networks.

Preferably, the recurrent neural network in step 120 is a gated recurrent unit.

Preferably, in step 120, the feature vector construction method of each comment tree is as follows:

based on a gated recurrent unit, combining, from top to bottom in each comment tree, the feature vector of the current node with the hidden state vector of its parent node, calculating for the node a reset gate used to retain part of the hidden state information of the parent node and an update gate used to adjust the proportion of the parent node's hidden state information that is retained, and calculating the hidden state vectors of all the nodes in the comment tree through recursive processing; and processing the hidden state vectors of all leaf nodes of the comment tree by using a pooling method to obtain the feature vector of the comment tree.

Specifically, the initial comment in each comment tree is taken as the root node, and the replies to comments are taken as child nodes. A comment tree processing method based on the gated recurrent unit (GRU) is provided. The parent node of the $i$-th node is denoted $p(i)$. The reset gate $r_i = \sigma(W_r c^i + U_r h^{p(i)})$ is calculated first, followed by the update gate $z_i = \sigma(W_z c^i + U_z h^{p(i)})$. The reset gate retains part of the hidden state information of the parent node, and the update gate adjusts the proportion of the parent node information that is retained.

In the formulas, $W_*$ are parameter matrices, $U_*$ are parameter vectors, and $\sigma$ denotes the activation function. After the comment tree is processed recursively, the hidden states $h_i$ of all leaf nodes are obtained, and a pooling method is applied to them to obtain the feature representation $t^i$ of each comment tree.
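
The top-down recursion can be sketched as follows. The reset and update gates follow the two formulas given above. The candidate state and the final interpolation use the standard GRU form, the root is given a zero parent state, all `U_*` are treated as square matrices, and element-wise max pooling is used over the leaves; these four choices are assumptions made for illustration, as are all function and parameter names.

```python
import math

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_node(c, h_parent, P):
    """One top-down step: combine node feature c with the parent hidden state."""
    # r_i = sigma(W_r c^i + U_r h^{p(i)})
    r = [sigmoid(a + b) for a, b in zip(matvec(P["W_r"], c), matvec(P["U_r"], h_parent))]
    # z_i = sigma(W_z c^i + U_z h^{p(i)})
    z = [sigmoid(a + b) for a, b in zip(matvec(P["W_z"], c), matvec(P["U_z"], h_parent))]
    # Assumed standard GRU candidate: tanh(W_h c^i + U_h (r_i * h^{p(i)}))
    gated = [ri * hp for ri, hp in zip(r, h_parent)]
    h_tilde = [math.tanh(a + b)
               for a, b in zip(matvec(P["W_h"], c), matvec(P["U_h"], gated))]
    # Assumed interpolation: h_i = (1 - z_i) * h^{p(i)} + z_i * h_tilde
    return [(1.0 - zi) * hp + zi * ht for zi, hp, ht in zip(z, h_parent, h_tilde)]

def encode_tree(root, children, features, P, hidden_size):
    """Return the pooled feature vector of the tree rooted at `root`."""
    leaf_states = []
    stack = [(root, [0.0] * hidden_size)]   # assumed zero state above the root
    while stack:
        node, h_parent = stack.pop()
        h = gru_node(features[node], h_parent, P)
        kids = children.get(node, [])
        if not kids:
            leaf_states.append(h)           # only leaf states are pooled
        for kid in kids:
            stack.append((kid, h))          # children see this node's state
    # Element-wise max pooling over all leaf hidden states (assumed pooling choice)
    return [max(col) for col in zip(*leaf_states)]

# Toy example: root "a" with two replies, 2-dimensional features and hidden states
half_identity = [[0.5, 0.0], [0.0, 0.5]]
P = {name: half_identity for name in ("W_r", "U_r", "W_z", "U_z", "W_h", "U_h")}
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5]}
children = {"a": ["b", "c"]}
t_vec = encode_tree("a", children, features, P, hidden_size=2)
```

Because each leaf state already carries the states of its ancestors, pooling only the leaves still reflects every reply path in the tree.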

Preferably, step 130 includes:

matching the relevance between the news feature matrix and the comment feature matrix with a co-attention network to construct a similarity matrix; using the similarity matrix to associate the news feature matrix with the comment feature matrix, so as to update both and obtain a new news feature matrix fused with comment information and a new comment feature matrix fused with news information; calculating the co-attention weights among news clauses from the new news feature matrix, and the co-attention weights among comment trees from the new comment feature matrix; weighting the vectors of all text clauses in the news feature matrix before updating by the co-attention weights among news clauses to obtain a news feature vector, and weighting the feature vectors of all comment trees in the comment feature matrix before updating by the co-attention weights among comment trees to obtain a comment feature vector; and fully connecting the news feature vector with the comment feature vector to judge the authenticity of the news.

That is, the co-attention weights of the news are applied to the feature matrix of the news text to obtain a news representation, and the co-attention weights of the comments are applied to the comment feature matrix to obtain a comment representation; the news representation and the comment representation are then fully connected to predict the authenticity label of the news.

Specifically, a co-attention mechanism matches the relevance between the text vectors and the comment vectors of each piece of news, captures the matched key information, and constructs a similarity matrix. With the text $S = [s^1, \dots, s^N]$ and the comments $C = [t^1, \dots, t^P]$, the similarity matrix is $F = \tanh(C^T W_l S)$. The similarity matrix is used to associate the news text with the comments, giving the news information fused with comments and the comment information fused with news, respectively: $H^s = \tanh(W_s S + (W_c C) F)$ and $H^c = \tanh(W_c C + (W_s S) F^T)$. From $H^s$ and $H^c$, the co-attention weights of the news and the co-attention weights of the comments are finally obtained.

In the formulas, $W_*$ and $w_*$ are all parameter matrices. The co-attention weights of the news are applied to the news representation vectors obtained in S1.4 to obtain the news representation, and the co-attention weights of the comments are applied to the comment tree representation vectors obtained in S2.4 to obtain the comment representation.

The news representation and the comment representation are fully connected to obtain a vector of size 1 × 2, whose two values represent the probabilities predicted by the model that the news is true and that it is false, respectively.
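
The co-attention step can be sketched in plain Python. The similarity matrix and the two fused matrices follow the formulas for $F$, $H^s$ and $H^c$ given in this description. The attention weights are assumed here to be a softmax over a learned projection of $H^s$ and $H^c$; that form, the projection vectors `w_hs` and `w_hc`, and all function names are assumptions made for illustration. Matrices are stored as lists of rows, with one column per news clause in `S` and one column per comment tree in `C`.

```python
import math

def matmul(A, B):
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def tanh_add(A, B):
    return [[math.tanh(a + b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def co_attention(S, C, W_l, W_s, W_c, w_hs, w_hc):
    """S: d x N news matrix, C: d x P comment matrix (columns are items)."""
    # F = tanh(C^T W_l S), shape P x N
    F = [[math.tanh(x) for x in row]
         for row in matmul(matmul(transpose(C), W_l), S)]
    WsS = matmul(W_s, S)                              # k x N
    WcC = matmul(W_c, C)                              # k x P
    H_s = tanh_add(WsS, matmul(WcC, F))               # H^s, k x N
    H_c = tanh_add(WcC, matmul(WsS, transpose(F)))    # H^c, k x P
    # Assumed weight form: softmax of a learned projection of H^s / H^c
    a_s = softmax(matmul([w_hs], H_s)[0])             # one weight per clause
    a_c = softmax(matmul([w_hc], H_c)[0])             # one weight per comment tree
    # Weights are applied to the matrices before updating, as described
    s_hat = [sum(a * x for a, x in zip(a_s, row)) for row in S]
    c_hat = [sum(a * x for a, x in zip(a_c, row)) for row in C]
    return a_s, a_c, s_hat, c_hat

# Toy example: d = 2, N = 3 news clauses, P = 2 comment trees, k = 2
identity = [[1.0, 0.0], [0.0, 1.0]]
S = [[0.1, 0.4, 0.0], [0.2, -0.3, 0.5]]
C = [[0.3, -0.1], [0.0, 0.6]]
a_s, a_c, s_hat, c_hat = co_attention(S, C, identity, identity, identity,
                                      [1.0, 1.0], [1.0, -1.0])
```

The shapes are the main thing to check: $F$ is $P \times N$, so $(W_c C) F$ has one column per news clause and $(W_s S) F^T$ has one column per comment tree, which is what allows the two sums inside the tanh.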

One prediction can be obtained through the above steps. The weight matrices $W_*$ and the bias parameters $b_*$ are learned by the neural network: the network is initialized randomly, and it learns a reasonable parameter configuration through repeated training iterations over the training set. After normalization with the softmax function, the network's confidence in its authenticity judgment can be read more intuitively from the two output values.
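
The final prediction step can be sketched as a single fully connected layer followed by softmax. This is a minimal illustration under stated assumptions: the layer takes the concatenation of the news and comment representations, the names `W_f` and `b_f` are hypothetical, and the order of the two outputs (true first, false second) is assumed rather than specified in the description.

```python
import math

def predict(s_hat, c_hat, W_f, b_f):
    """Return two probabilities from the news and comment representations.

    W_f (2 x 2d) and b_f (2) are hypothetical learned parameters; in
    practice they would be obtained by training, not set by hand.
    """
    x = s_hat + c_hat                       # concatenate the two vectors
    logits = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(W_f, b_f)]
    # Numerically stable softmax over the two logits
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example with hand-picked parameters
s_hat = [0.2, -0.1]
c_hat = [0.4, 0.3]
W_f = [[1.0, 0.0, 0.5, 0.0],
       [0.0, 1.0, 0.0, 0.5]]
b_f = [0.0, 0.0]
probs = predict(s_hat, c_hat, W_f, b_f)     # [p_true, p_false] under the assumed order
```

With all-zero weights and biases the two outputs are equal, which is the state a freshly initialized classifier is close to before training.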

Example two

A machine-readable storage medium having stored thereon machine-executable instructions which, when invoked and executed by a processor, cause the processor to implement a false news identification method based on news-comment relevance analysis as described in embodiment one above.

The related technical solution is the same as in embodiment one and is not repeated here.

It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A false news identification method based on news-comment relevance analysis is characterized by comprising the following steps:

S1, constructing a news feature matrix based on news content to be identified, and constructing a feature vector of each comment based on the content of the comment of the news to be identified; meanwhile, according to the reply relation among the comments, constructing a plurality of comment trees by taking each initial comment as a root node and each reply comment as a child node;

2. The false news identification method based on news-comment relevance analysis according to claim 1, wherein the news feature matrix is specifically constructed by:

3. The false news identification method based on news-comment relevance analysis according to claim 2, wherein the one-dimensional feature vector of each comment of news to be identified is constructed according to the content of the comment, and specifically comprises:

4. The method for identifying false news based on news-comment relevance analysis according to claim 3, wherein all the recurrent neural networks in S1 are bidirectional long-short term memory networks.

5. A false news identification method based on news-comment relevance analysis according to any one of claims 1 to 4, wherein in S2, a gated recurrent unit is adopted to obtain the context association feature vectors of all leaf nodes through recursive computation.

6. The method for identifying false news based on news-comment relevance analysis according to claim 5, wherein in the step S2, the feature vector construction method of each comment tree is as follows:

7. The method for identifying false news based on news-comment relevance analysis according to claim 6, wherein the reset gate $r_i$ is calculated as $r_i = \sigma(W_r c^i + U_r h^{p(i)})$ and the update gate $z_i$ is calculated as $z_i = \sigma(W_z c^i + U_z h^{p(i)})$, where $W_r$ and $W_z$ are parameter matrices, $U_r$ and $U_z$ are parameter vectors, $\sigma$ is the activation function, and $h^{p(i)}$ is the hidden state vector of the parent node of the $i$-th node.

8. A false news identification method based on news-comment relevance analysis according to any one of claims 1 to 4, wherein the S3 includes:

9. The method for identifying false news based on news-comment relevance analysis according to claim 8, wherein the news feature matrix is updated according to the formula $H^s = \tanh(W_s S + (W_c C) F)$ and the comment feature matrix is updated according to the formula $H^c = \tanh(W_c C + (W_s S) F^T)$, where $H^s$ is the new news feature matrix after updating, $H^c$ is the new comment feature matrix after updating, $S$ is the news feature matrix before updating, $C$ is the comment feature matrix before updating, $F$ is the similarity matrix, and $W_c$ and $W_s$ are parameter matrices.

10. A machine-readable storage medium having stored thereon machine-executable instructions which, when invoked and executed by a processor, cause the processor to implement a method of false news identification based on news-comment relevance analysis as claimed in any one of claims 1 to 9.