CN116701569A - Multi-field false news detection method based on multi-view collaboration - Google Patents

Multi-field false news detection method based on multi-view collaboration

Info

Publication number
CN116701569A
Authority
CN
China
Prior art keywords
domain
news
view
steps
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310515854.2A
Other languages
Chinese (zh)
Inventor
李慧
蒋园园
王晨曦
顾勇
张舒
仲兆满
李鑫
左宇航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Ocean University
Original Assignee
Jiangsu Ocean University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Ocean University
Priority to CN202310515854.2A
Publication of CN116701569A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/27 Regression, e.g. linear or logistic regression
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/30 Semantic analysis
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a multi-field false news detection method based on multi-view collaboration. It aims to solve the domain-transfer problem of existing false news detection methods and thereby improve model performance in multi-domain scenarios. The method is based on multi-view collaboration and expert-network feature extraction, and it makes effective use of domain features by introducing a domain gate network that learns the relationship between domains and views. The method can be applied to social media, news media, online question answering and similar fields, provides users with reliable and accurate information, and has broad application prospects.

Description

Multi-field false news detection method based on multi-view collaboration
Technical Field
The invention relates to the field of false news detection, in particular to a multi-field false news detection method based on multi-view collaboration.
Background
With the spread of the internet and the popularity of social media, false news has become a serious problem. False news not only misleads people but also has a serious impact on society, so false news detection has become an active research direction. Its purpose is to classify news content as true or false. Existing methods fall largely into content-based methods and social-context-based methods.
In content-based approaches, researchers detect false news mainly by analyzing the text content of the news and extracting features from it, including lexical, semantic, and statistical features. Some researchers also use external evidence, such as knowledge graphs or information from fact-checking websites. The advantage of content-based approaches is that news text can be analyzed independently; the disadvantage is that social context information may be ignored.
Social-context-based methods mine the structural signals of news propagation by modeling the propagation process. These methods can capture social context information by analyzing interactions between social media entities. Some researchers also use crowd wisdom, such as emotion and stance, to detect false news.
Multi-domain false news detection is an important branch of false news detection. News in different domains has different characteristics, so different models are required to detect false news in each. Multi-domain methods aim to learn general features from data in different domains so as to improve the accuracy and generalization performance of false news detection.
Emotion-based detection is another approach. Research has shown that emotional features are very important for detecting false news. Some researchers have applied multi-task learning over several tasks, such as emotional features, novelty, and emotion, to improve detection performance.
In summary, false news detection is an important research direction and can be approached through content, social context, multiple domains, emotion and other means. These methods may be used separately or in combination to improve accuracy and generalization performance. Research into false news detection will continue to evolve, and new techniques will need to be explored to address new forms of false news.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a multi-field false news detection method based on multi-view collaboration, so as to solve the problems described in the background.
To achieve the above purpose, the present invention provides the following technical solution: a multi-field false news detection method based on multi-view collaboration, comprising the following steps:
S1: receiving news content as input and processing it with a BERT model to obtain word embedding vectors;
S2: processing the word embedding vectors with a bidirectional LSTM to extract the sequential features of the news;
S3: processing the news content with a semantic network and a domain network, respectively, to obtain the semantic features and domain-specific features of the news;
S4: processing the news content with a mixture-of-experts system to obtain emotion features and style features;
S5: inputting the semantic features, domain-specific features, emotion features and style features into a cross-view fusion module to obtain an adaptive cross-view representation;
S6: weighting and summing the fused features according to the weights obtained from the domain network to obtain an overall feature representation;
S7: inputting the overall feature representation into a classifier module to judge the authenticity of the news content;
S8: outputting the news authenticity judgment result.
As a preferred technical scheme of the invention, modeling of the false news detection problem comprises the following steps:
K1: encoding the text content of the news P into a token sequence of length T using a BERT pre-trained model;
K2: extracting emotion features E and style features S from the news P, both of which are numerical features;
K3: taking the domain label g of the news P as input, combining the emotion features E and style features S, and training a multi-domain false news detection model using a multi-task learning method;
K4: for the news P, inputting the text token sequence, the emotion features E and the style features S and, in combination with the domain label g of the news P, outputting a true/false label y using the trained multi-domain false news detection model;
K5: repeating steps K3 and K4 for a plurality of domain labels to obtain a true/false label y under each domain, and finally combining the true/false labels y under the plurality of domains to obtain the final true/false label of the news P;
K6: for each domain label of the news P, evaluating false news detection performance under that domain using a set of indicators such as the confusion matrix, accuracy, recall, and F1 score.
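Step K6 can be illustrated with a minimal Python sketch of per-domain evaluation. The patent does not specify how the metrics are computed, so the function names, the convention that label 1 means "false news", and the toy domain names below are all assumptions made for illustration.

```python
from collections import defaultdict


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (false news) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def metrics_by_domain(domains, y_true, y_pred):
    """Group samples by domain label g and score each group separately."""
    grouped = defaultdict(lambda: ([], []))
    for g, t, p in zip(domains, y_true, y_pred):
        grouped[g][0].append(t)
        grouped[g][1].append(p)
    return {g: binary_metrics(ts, ps) for g, (ts, ps) in grouped.items()}


# Toy data: hypothetical domain names and labels, for illustration only.
domains = ["health", "health", "health", "politics", "politics"]
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
report = metrics_by_domain(domains, y_true, y_pred)
```

Scoring each domain separately, rather than pooling all samples, makes it visible when a model performs well overall but poorly on one domain.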
As a preferred technical scheme of the invention, the specific extraction flow of the multi-view collaboration comprises the following steps:
(a) setting a hyperparameter T to represent the number of experts in the expert network;
(b) constructing a mixture-of-experts network, including a semantic network, an emotion network, a style network and a domain network;
(c) converting the input news text into a word vector W;
(d) for each expert network i (1 ≤ i ≤ T), performing the following operations:
(d1) determining the model structure of the individual expert network and its learnable parameters θ_i;
(d2) computing the output representation r_i of the expert network from the word vector W and the learnable parameters θ_i;
(e) obtaining a multi-view feature representation of the input news text from the output representations r_i of the respective expert networks;
wherein each expert network (1 ≤ i ≤ T) has its own area of specialty and is good at extracting the features of a certain domain.
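Steps (a) to (e) can be sketched in plain Python as follows. The patent does not fix the internal structure of an expert, so the mean pooling, the single linear layer with tanh, and the names `make_expert` and `expert_forward` are assumptions chosen only to show how T experts each map the same word vectors W to their own representation r_i.

```python
import math
import random


def mean_pool(word_vectors):
    """Average the token vectors in W into one fixed-size vector."""
    dim = len(word_vectors[0])
    n = len(word_vectors)
    return [sum(vec[d] for vec in word_vectors) / n for d in range(dim)]


def make_expert(in_dim, out_dim, rng):
    """Create the learnable parameters theta_i of one expert (weights and bias)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return {"weights": weights, "bias": bias}


def expert_forward(word_vectors, theta):
    """Compute r_i from W and theta_i: mean-pool, apply a linear layer, then tanh."""
    pooled = mean_pool(word_vectors)
    return [
        math.tanh(sum(w * x for w, x in zip(row, pooled)) + b)
        for row, b in zip(theta["weights"], theta["bias"])
    ]


rng = random.Random(0)
T = 3                        # hyperparameter: number of experts
W = [[0.1, 0.2, 0.3, 0.4],   # toy word vectors: 2 tokens, 4 dimensions each
     [0.5, 0.6, 0.7, 0.8]]
experts = [make_expert(in_dim=4, out_dim=2, rng=rng) for _ in range(T)]
representations = [expert_forward(W, theta) for theta in experts]  # r_1 .. r_T
```

Because every expert has its own parameters, the T outputs differ even though the input is shared, which is what lets each expert specialize during training.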
As a preferred technical scheme of the invention, the specific steps of the cross-view fusion in S5 are as follows:
S51: receiving input data for a plurality of views, wherein each view represents a particular kind of data feature, including but not limited to semantics, emotion, and style;
S52: calculating a corresponding weight coefficient for each view, where w_sem, w_emo and w_stl denote the weight coefficients of the semantic, emotion and style views, respectively;
S53: calculating a cross-view interaction representation z by multiplying the weight coefficients of the different views by the corresponding view representations and summing the products (the formula itself is not reproduced in this text), where k_sem, k_emo and k_stl denote the number of experts in the semantic network, emotion network and style network, respectively; ln r_sem, ln r_emo and ln r_stl denote the view representations of the semantic, emotion and style views, respectively; and w_domain and ln r_domain denote the domain weight and the domain local view representation;
S54: setting up multi-head cross-view fusion, in which each head adaptively learns a cross-view representation, generating a set of cross-view representations, where H denotes the number of cross-view representations;
S55: classifying or regressing the input data based on the generated set of cross-view representations and outputting the result.
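The formula for z does not survive in this text, so the following Python sketch shows only one plausible reading of steps S53 and S54. It assumes that z is the elementwise exponential of the weighted sum of the logarithms of the view representations, that all representation entries are positive, and that each of the H heads has its own weights. The function names and all numbers are hypothetical.

```python
import math


def cross_view_interaction(view_reps, view_weights):
    """One head: z = exp(sum over views and experts of w * ln(r)), elementwise.

    view_reps:    {view name: list of expert output vectors (all entries > 0)}
    view_weights: {view name: list of scalar weights, one per expert}
    """
    dim = len(next(iter(view_reps.values()))[0])
    log_sum = [0.0] * dim
    for view, reps in view_reps.items():
        for w, r in zip(view_weights[view], reps):
            for d in range(dim):
                log_sum[d] += w * math.log(r[d])
    return [math.exp(s) for s in log_sum]


def multi_head_fusion(view_reps, heads):
    """Apply H heads, each with its own weights, to get H cross-view representations."""
    return [cross_view_interaction(view_reps, weights) for weights in heads]


# Toy positive-valued view representations (hypothetical numbers).
view_reps = {
    "sem": [[1.0, 2.0], [4.0, 1.0]],   # k_sem = 2 experts
    "emo": [[2.0, 2.0]],               # k_emo = 1 expert
    "stl": [[1.0, 3.0]],               # k_stl = 1 expert
    "domain": [[2.0, 1.0]],            # domain view
}
heads = [
    {"sem": [1.0, 0.5], "emo": [1.0], "stl": [0.0], "domain": [1.0]},
    {"sem": [0.0, 0.0], "emo": [0.0], "stl": [1.0], "domain": [0.0]},
]
Z = multi_head_fusion(view_reps, heads)  # H = 2 cross-view representations
```

Under this assumption each head computes a product of powers of the view features, so a weight of zero drops a view from the interaction and a larger weight emphasizes it.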
As a preferred technical scheme of the invention, the classifier module in S7 operates as follows:
S71: acquiring cross-view representations of news articles using different expert networks;
S72: inputting the domain label into a domain gate to model domain differences and obtain weight scores, where the weight function is expressed as softmax(MLP(g));
S73: aggregating the cross-view representations according to the calculated weight scores (the aggregation formula is not reproduced in this text);
S74: inputting the aggregated cross-view representation into a multi-layer perceptron classifier with a softmax output layer to classify the news as true or false;
S75: training the network with a binary cross-entropy loss function (the expression is not reproduced in this text), in which y_i denotes the true label and ŷ_i denotes the predicted label.
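The domain gate softmax(MLP(g)) and the weighted aggregation can be sketched as below. The text gives only the form softmax(MLP(g)), so the one-hot encoding of g, the two-layer MLP with ReLU, the weighted-sum aggregation, and every parameter value here are assumptions for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights, bias):
    """Dense layer: one output per row of the weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def domain_gate(g_onehot, params):
    """softmax(MLP(g)): a two-layer MLP over the domain label, then softmax."""
    hidden = [max(0.0, h) for h in linear(g_onehot, params["w1"], params["b1"])]  # ReLU
    return softmax(linear(hidden, params["w2"], params["b2"]))


def aggregate(cross_view_reps, gate_weights):
    """Weighted sum of the H cross-view representations."""
    dim = len(cross_view_reps[0])
    return [sum(a * z[d] for a, z in zip(gate_weights, cross_view_reps)) for d in range(dim)]


# Hypothetical toy parameters: 2 domains, hidden size 2, H = 2 representations.
params = {
    "w1": [[1.0, 0.0], [0.0, 1.0]], "b1": [0.0, 0.0],
    "w2": [[1.0, 0.0], [0.0, 1.0]], "b2": [0.0, 0.0],
}
g = [1.0, 0.0]                  # one-hot label of the first domain
Z = [[8.0, 4.0], [1.0, 3.0]]    # H = 2 cross-view representations
a = domain_gate(g, params)      # gate weights, which sum to 1
v = aggregate(Z, a)             # aggregated representation fed to the classifier
```

Because the gate depends only on the domain label, news items from the same domain share the same mixing weights, while different domains can emphasize different cross-view representations.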
As a preferred technical scheme of the invention, the specific steps of constructing the mixture-of-experts network in step (b) are as follows:
(b1) performing BERT tokenization and LSTM sequence information extraction on the news text to obtain the semantic feature representation r_sem of the news text;
(b2) inputting emotion category, emotion dictionary, emotion intensity, emotion score and auxiliary features as emotion features into the emotion network, and extracting emotion features with the mixture-of-experts network to obtain the emotion feature representation r_emo of the news text;
(b3) taking eight aspects, namely readability, logic, credibility, normativity, interactivity, interestingness, emotion and completeness, as style features, inputting them into the style network, and extracting style features with the mixture-of-experts network to obtain the style feature representation r_stl of the news text;
(b4) using a dedicated domain feature extraction network for each domain, and extracting domain features with TextCNN to obtain the domain-specific feature representation r_domain of the news text;
(b5) combining the three feature representations of different origins, r_sem, r_emo and r_stl, with the domain-specific feature representation r_domain to obtain a multidimensional feature representation that comprehensively describes the content and features of the news text.
Compared with the prior art, the multi-field false news detection method based on multi-view collaboration has the following technical effects:
The beneficial effects of the invention are as follows: the method makes better use of domain information and achieves false news detection in multi-domain scenarios. Expert networks are used for feature extraction, a domain gate network is introduced to learn the relationship between domains and views, and multi-view fusion together with a BiLSTM module effectively captures the multi-view feature representation of news. Comparison with a variety of false news detection methods demonstrates the effectiveness and superiority of the method. The invention improves the accuracy and reliability of false news detection, can be applied across multiple domains on platforms such as news media and social networks, reduces the influence of false information on the public and society, and protects information security and social stability.
Drawings
Fig. 1 is a diagram of a model architecture of a multi-view collaborative multi-domain false news detection method according to the present invention.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the attached drawings, so that the advantages and features of the present invention can be more easily understood by those skilled in the art and the scope of protection of the present invention is defined more clearly.
Example: referring to Fig. 1, the present invention provides a technical solution: a multi-field false news detection method based on multi-view collaboration, which judges whether a news item is false through the following steps S1 to S8:
S1: receiving news content as input and processing it with a BERT model to obtain word embedding vectors;
S2: processing the word embedding vectors with a bidirectional LSTM to extract the sequential features of the news;
S3: processing the news content with a semantic network and a domain network, respectively, to obtain the semantic features and domain-specific features of the news;
S4: processing the news content with a mixture-of-experts system to obtain emotion features and style features;
S5: inputting the semantic features, domain-specific features, emotion features and style features into a cross-view fusion module for feature fusion;
S6: weighting and summing the fused features according to the weights obtained from the domain network to obtain an overall feature representation;
S7: inputting the overall feature representation into a classifier module to judge the authenticity of the news content;
S8: outputting the news authenticity judgment result.
The invention was evaluated experimentally on public datasets, including a Chinese dataset and an English dataset.
Modeling of the false news detection problem for this method includes the following steps: encoding the text content of the news P into a token sequence of length T using a BERT pre-trained model; extracting emotion features E and style features S from the news P, both of which are numerical features; taking the domain label g of the news P as input, combining the emotion features E and style features S, and training a multi-domain false news detection model using the multi-view collaborative method; for the news P, inputting the text token sequence, the emotion features E and the style features S and, in combination with the domain label g of the news P, outputting a true/false label y using the trained multi-domain false news detection model; repeating these steps for a plurality of domain labels to obtain a true/false label y under each domain, and finally combining the true/false labels y under the plurality of domains to obtain the final true/false label of the news P; and, for each domain label of the news P, evaluating false news detection performance under that domain using a set of indicators such as the confusion matrix, accuracy, recall, and F1 score.
The specific extraction flow of the multi-view collaborative method comprises the following steps: setting a hyperparameter T to represent the number of experts in the expert network; constructing a mixture-of-experts network, including a semantic network, an emotion network, a style network and a domain network; converting the input news text into a word vector W; for each expert network i (1 ≤ i ≤ T), determining the learnable parameters θ_i of the individual expert network and computing the output representation r_i of the expert network from the word vector W and the learnable parameters θ_i; and obtaining a multi-view feature representation of the input news text from the output representations r_i of the respective expert networks. Each expert network (1 ≤ i ≤ T) has its own area of specialty and is good at extracting the features of a certain domain.
Constructing the mixture-of-experts network specifically comprises the following steps: performing BERT tokenization and LSTM sequence information extraction on the news text to obtain the semantic feature representation r_sem of the news text; inputting emotion category, emotion dictionary, emotion intensity, emotion score and auxiliary features as emotion features into the emotion network, and extracting emotion features with the mixture-of-experts network to obtain the emotion feature representation r_emo of the news text; taking eight aspects, namely readability, logic, credibility, normativity, interactivity, interestingness, emotion and completeness, as style features, inputting them into the style network, and extracting style features with the mixture-of-experts network to obtain the style feature representation r_stl of the news text; using a dedicated domain feature extraction network for each domain, and extracting domain features with TextCNN to obtain the domain-specific feature representation r_domain of the news text; and combining the three feature representations of different origins, r_sem, r_emo and r_stl, with the domain-specific feature representation r_domain to obtain a multidimensional feature representation that comprehensively describes the content and features of the news text.
The specific steps of the cross-view fusion of the method are as follows: receiving input data for a plurality of views, wherein each view represents a particular kind of data feature, including but not limited to semantics, emotion, and style; calculating a corresponding weight coefficient for each view, where w_sem, w_emo and w_stl denote the weight coefficients of the semantic, emotion and style views, respectively; calculating a cross-view interaction representation z by multiplying the weight coefficients of the different views by the corresponding view representations and summing the products (the formula itself is not reproduced in this text); setting up multi-head cross-view fusion, in which each head adaptively learns a cross-view representation, generating a set of cross-view representations, where H denotes the number of cross-view representations; and classifying or regressing the input data based on this set of representations and outputting the result.
The classifier module of the method operates as follows: acquiring cross-view representations of news articles using different expert networks; inputting the domain label into a domain gate to model domain differences and obtain weight scores, where the weight function is expressed as softmax(MLP(g)); aggregating the cross-view representations according to the calculated weight scores (the aggregation formula is not reproduced in this text); inputting the aggregated cross-view representation into a multi-layer perceptron classifier with a softmax output layer to classify the news as true or false; and training the network with a binary cross-entropy loss function (the expression is not reproduced in this text), in which y_i denotes the true label and ŷ_i denotes the predicted label.
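For reference, the standard form of the binary cross-entropy loss is given below. The patent's own expression is missing from this text, so the averaging over N training samples and the symbol N itself are assumptions. Only y_i (true label) and ŷ_i (predicted probability of the positive class) come from the description.

```latex
\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N}
  \Bigl[\, y_i \log \hat{y}_i + (1 - y_i) \log\bigl(1 - \hat{y}_i\bigr) \Bigr]
```

For example, a single sample with y_i = 1 and ŷ_i = 0.9 contributes -log 0.9 ≈ 0.105, while the same sample predicted at ŷ_i = 0.1 contributes -log 0.1 ≈ 2.303, so confident wrong predictions are penalized far more heavily.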
Through the above flow, the multi-field false news detection method based on multi-view collaboration can effectively detect false news in a plurality of domains with improved accuracy and robustness. The method makes full use of the features of every view of the news text, such as semantics, emotion and style, which helps improve the accuracy of judging the authenticity of news. Multi-view feature fusion further improves detection performance.
Example: the performance of the false news detection model is evaluated with the common metrics AUC, F1 and accuracy. AUC is the area under the receiver operating characteristic (ROC) curve, which is plotted in two dimensions with the false positive rate on the x-axis and the true positive rate on the y-axis. AUC is widely used for comparing models because it is not affected by class imbalance and is independent of the prediction threshold. F1 combines precision and recall into a single score, and accuracy is the proportion of all samples that are classified correctly.
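The threshold independence of AUC can be seen from its rank-based definition: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The following Python sketch computes it that way. The function name and the toy scores are illustrative, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and scores (hypothetical): 3 positives, 2 negatives.
y_true = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.5, 0.1, 0.8]
auc = roc_auc(y_true, scores)

# Any order-preserving rescaling of the scores leaves the AUC unchanged.
auc_rescaled = roc_auc(y_true, [s ** 2 for s in scores])
```

Here five of the six positive/negative pairs are ordered correctly, so the AUC is 5/6, and squaring the scores does not change it because only the ranking matters.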
The baselines fall into three types. The first type comprises single-domain detection methods, in which a separate model is built and trained for each domain. TextCNN_s: a commonly used text classification model; we implement TextCNN with 5 kernels, each with 64 channels and with step sizes of 1, 2, 3, 5 and 10, respectively. BiGRU_s: models the news text with a BiGRU to obtain the sequence information of the text and uses an MLP to obtain the prediction. The hidden layer size of the GRU is set to 300. BERT_s: encodes the tokens of the news text with BERT and feeds the average of the extracted embeddings into an MLP to obtain the final prediction.
The second type comprises mixed-domain baselines, which pool the data of all domains for training. The implementations of BiGRU_a, TextCNN_a and BERT_a in this group are the same as in the first group. Two other baselines of this type are StyleLSTM and DualEmo. StyleLSTM: first extracts news representations from the content with a BiLSTM, then feeds the combined news representation and style features into an MLP to obtain the final prediction. DualEmo: extracts the feature representation of the news with a BiGRU, extracts emotion features from the news text and comments, and fuses them to detect false news.
The third type comprises multi-domain methods. MMoE: a multi-task model that treats each domain as a separate task; a mixture of experts (MoE) is shared among the domains, and each domain has its own specific head. MoSE: adds an LSTM before the experts in MMoE to obtain the sequence information of the text. MDFEND: a multi-domain false news detection model that uses the domain to select useful experts from the MoE.
Experimental analysis: to verify the effectiveness of the proposed model, comparison experiments against the baseline methods were carried out on the Chinese and English datasets. The comparison metrics are F1, accuracy (Acc) and AUC.
Table 1: Performance comparison of models on the Chinese dataset
Table 2: Performance comparison of models on the English dataset
The metrics of the model and its variants on the Chinese dataset are shown in Table 3. To verify the importance of multi-view fusion and the BiLSTM module to the proposed model, the two modules are analyzed separately.
1) Effect of multiple views. First, we experimentally verified the contributions of the different views by comparing the proposed model with the variants MMFND-sem, MMFND-emo and MMFND-stl, which remove the semantic, emotion and style views from MMFND, respectively. We find that all views are beneficial for false news detection, especially the semantic view, which is the core of most existing approaches. Since emotion and style features are manually extracted from the text content, they typically serve as auxiliary information for semantic view modeling. Furthermore, we observe that the emotion view is more effective than the style view. The reason may be that emotion features reflect both publisher and social characteristics, while style features reflect only publisher preferences.
2) Effect of BiLSTM. To model the order information of the news text, a BiLSTM module is added before the semantic network. To evaluate the effectiveness of this sequential modeling module, a variant of the model, MMFND-BiLSTM, was designed that contains no BiLSTM module and feeds text features directly into the semantic network, ignoring the sequential structure of the text. Experiments show that MMFND outperforms the variant MMFND-BiLSTM, which means that the BiLSTM effectively models the sequence information of the news text and helps improve detection performance.
Table 3: ablation experiments
Experimental conclusion: the proposed model MMFND outperforms the comparison models on multiple metrics and performs well on both the Chinese and English datasets. On most tasks, the mixed-domain models improve on the single-domain models, which means that joint training on multi-domain data helps improve not only the overall performance across domains but also the performance within a single domain. The detection results of StyleLSTM and DualEmo are also better than those of BiGRU. Both models combine text features extracted by a BiGRU or BiLSTM with style or emotion features, which shows that introducing more views benefits multi-domain false news detection. We found that MMFND was significantly better than the comparison models on most tasks, suggesting that MMFND improves not only overall detection performance but also performance in specific domains. The main reason is that MMFND enriches domain information and explicitly models domain differences by aggregating the useful cross-view interactions of different domains.
The foregoing examples illustrate only a few embodiments of the invention; although they are described in detail, they are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the invention, all of which fall within the scope of protection of the invention.

Claims (6)

1. A multi-field false news detection method based on multi-view collaboration, characterized in that the method uses a multi-view collaborative approach combined with domain information to solve the multi-domain false news detection problem, and specifically comprises the following steps:
S1: receiving news content P, and processing the input news content with a BERT model to obtain word embedding vectors;
S2: processing the word embedding vectors through a bidirectional LSTM to extract the sequential features of the news;
S3: processing the news content with a semantic network and a domain network, respectively, to obtain the semantic features and domain-specific features of the news;
S4: processing the news content P through a mixture-of-experts system to obtain emotion features E and style features S;
S5: inputting the semantic features, domain-specific features, emotion features, and style features into a cross-view fusion module to obtain an adaptive cross-view representation;
S6: weighting and summing the fused features according to the weights obtained by the domain network to obtain a total feature representation;
S7: inputting the total feature representation into a classifier module to judge the authenticity of the news content;
S8: outputting the news authenticity judgment result.
2. The multi-domain false news detection method based on multi-view collaboration according to claim 1, wherein modeling the false news detection problem comprises the following steps:
K1: encoding the text content of the news content P into a text token sequence of length T using a BERT pre-trained model;
K2: extracting emotion features E and style features S from the news content P, wherein the emotion features E and the style features S are numerical features;
K3: taking the domain label g of news P as input, combined with the emotion features E and the style features S, and training a multi-domain false news detection model using a multi-task learning method, wherein the domain label $g \in \{\mathrm{Domain}_1, \dots, \mathrm{Domain}_N\}$ and each Domain represents a specific value of a domain;
K4: for news P, inputting the text token sequence, the emotion features E, and the style features S, and outputting a true/false label y using the trained multi-domain false news detection model in combination with the domain label g of news P;
K5: repeating steps K3 and K4 for a plurality of domain labels to obtain the true/false label y under each domain, and finally combining the true/false labels y under the plurality of domains to obtain the final true/false label of news P;
K6: for each domain label g of news P, using a set of evaluation metrics, including the confusion matrix, accuracy, recall, and F1 score, to evaluate false news detection performance in that domain.
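The per-domain evaluation in step K6 can be illustrated with a minimal sketch in plain Python. The function names, the record layout `(domain, y_true, y_pred)`, and the convention that label 1 means "false news" are assumptions made for illustration; they are not specified in the claims. Precision is computed only because the F1 score requires it.

```python
from collections import defaultdict


def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 as the positive (false news) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def domain_metrics(y_true, y_pred):
    """Confusion matrix, accuracy, precision, recall and F1 for one domain."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "confusion": (tp, fp, fn, tn),
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def evaluate_per_domain(records):
    """Group (domain, y_true, y_pred) records by domain and score each group."""
    grouped = defaultdict(lambda: ([], []))
    for domain, t, p in records:
        grouped[domain][0].append(t)
        grouped[domain][1].append(p)
    return {d: domain_metrics(ts, ps) for d, (ts, ps) in grouped.items()}


# Hypothetical predictions for two domains.
records = [
    ("health", 1, 1), ("health", 0, 1), ("health", 1, 0), ("health", 0, 0),
    ("politics", 1, 1), ("politics", 0, 0),
]
report = evaluate_per_domain(records)
```

Scoring each domain separately makes it visible when a model that looks good overall is weak on one particular domain, which is the comparison the experiments above rely on.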
3. The multi-domain false news detection method based on multi-view collaboration according to claim 1, wherein the specific extraction process of the multi-view collaborative method comprises the following steps:
(a) setting a hyperparameter T to represent the number of experts in the expert network;
(b) constructing a mixture-of-experts network, including a semantic network, an emotion network, a style network, and a domain network;
(c) converting the text content of the input news content P into word vectors W;
(d) for each expert network, indexed by i, performing the following operations:
(d1) determining the model structure of the expert network and its learnable parameters $\theta_i$;
(d2) using the word vectors W and the learnable parameters $\theta_i$ to compute the output representation $r_i$ of the expert network;
(e) obtaining a multi-view feature representation of the input news text from the output representations $r_i$ of the respective expert networks;
wherein each expert network is used to extract features of different domains.
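Steps (a) through (e) can be sketched as follows. This is a toy illustration under stated assumptions: each expert is a single linear layer with a tanh activation applied to mean-pooled word vectors, and all names (`make_expert`, `expert_forward`, `multi_view_representation`) and dimensions are hypothetical. The claims leave the actual expert architecture open.

```python
import math
import random


def make_expert(in_dim, out_dim, rng):
    """Create the learnable parameters theta_i of one expert (assumed: one linear layer)."""
    return {
        "weight": [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                   for _ in range(out_dim)],
        "bias": [0.0] * out_dim,
    }


def mean_pool(word_vectors):
    """Average the word vectors W over the token axis."""
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]


def expert_forward(theta, word_vectors):
    """Compute the output representation r_i of one expert from W and theta_i."""
    x = mean_pool(word_vectors)
    return [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(theta["weight"], theta["bias"])
    ]


def multi_view_representation(word_vectors, experts):
    """Collect r_1 ... r_T, one representation per expert."""
    return [expert_forward(theta, word_vectors) for theta in experts]


rng = random.Random(0)
T = 4                                   # (a) number of experts
W = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(5)]   # (c) 5 tokens, dim 8
experts = [make_expert(8, 3, rng) for _ in range(T)]              # (d1) parameters theta_i
views = multi_view_representation(W, experts)                     # (d2) and (e)
```

Because every expert has its own parameters, the T outputs can specialize on different aspects of the same input once they are trained.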
4. The multi-domain false news detection method based on multi-view collaboration according to claim 1, wherein the specific operation flow of the cross-view fusion module in S5 comprises:
S51: receiving input data for a plurality of views, wherein each view represents a particular kind of data feature, including but not limited to semantics, emotion, and style;
S52: calculating a corresponding weight coefficient for each view, where $w_{sem}$, $w_{emo}$, and $w_{stl}$ represent the weight coefficients of the semantic, emotion, and style views, respectively;
S53: calculating a cross-view interaction representation z, which is obtained by multiplying the weight coefficients of the different views by the corresponding view representations and summing, wherein the calculation formula is as follows:
wherein $k_{sem}$, $k_{emo}$, and $k_{stl}$ represent the number of experts in the semantic network, emotion network, and style network, respectively; $lnr_{sem}$, $lnr_{emo}$, and $lnr_{stl}$ represent the view representations of the semantic, emotion, and style views, respectively; and $w_{domain}$ and $lnr_{domain}$ represent the domain weight and the domain local view representation;
S54: setting up multi-head cross-view fusion, in which each head adaptively learns a cross-view representation, and generating a set of cross-view representations, wherein H represents the number of cross-view representations;
S55: classifying or regressing the input data according to the generated set of cross-view representations, and outputting a result.
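The weighted-sum idea in S52 to S54 can be illustrated with a small sketch. It assumes the simplest reading of the claim: each head holds one score per view, the scores are normalized with a softmax to give the weight coefficients, and z is the weighted sum of the view representations. The softmax normalization, the score dictionaries, and the function names are assumptions; the domain term ($w_{domain}$, $lnr_{domain}$) and the per-expert counts are omitted for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(view_reprs, view_scores):
    """One head: weighted sum z of the view representations (S52 and S53)."""
    names = sorted(view_reprs)
    weights = softmax([view_scores[name] for name in names])
    dim = len(view_reprs[names[0]])
    z = [0.0] * dim
    for name, weight in zip(names, weights):
        for j, value in enumerate(view_reprs[name]):
            z[j] += weight * value
    return z


def multi_head_fusion(view_reprs, head_scores):
    """S54: H heads, each with its own view scores, give H cross-view representations."""
    return [fuse_views(view_reprs, scores) for scores in head_scores]


# Hypothetical 2-dimensional view representations and two heads (H = 2).
view_reprs = {"sem": [1.0, 0.0], "emo": [0.0, 1.0], "stl": [1.0, 1.0]}
head_scores = [
    {"sem": 0.0, "emo": 0.0, "stl": 0.0},   # uniform weights
    {"sem": 2.0, "emo": 0.0, "stl": 0.0},   # emphasizes the semantic view
]
cross_views = multi_head_fusion(view_reprs, head_scores)
```

With equal scores the first head averages the three views, while the second head leans toward the semantic view, so different heads can capture different cross-view interactions.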
5. The multi-domain false news detection method based on multi-view collaboration according to claim 1, wherein the specific workflow of the classifier module in S7 is as follows:
S71: acquiring cross-view representations of news articles using different expert networks;
S72: inputting the domain label into a domain gate to model domain differences and obtain weight scores, wherein the weight function is expressed as softmax(MLP(g));
S73: aggregating the cross-view representations according to the calculated weight scores, wherein the formula is as follows:
S74: inputting the aggregated cross-view representation into a multi-layer perceptron classifier with a softmax output layer to perform binary classification of false news;
S75: training the network using a binary cross-entropy loss function, in which $y_i$ represents the true label and $\hat{y}_i$ represents the predicted label.
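The domain gate of S72, the aggregation of S73, and the loss of S75 can be sketched together. The sketch makes several assumptions: the MLP is reduced to a single bias-free linear layer, so applying it to a one-hot domain label selects one row of a weight table; the aggregation is a plain weighted sum; and the loss is the standard mean binary cross-entropy, $-\frac{1}{n}\sum_i [y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)]$. The function names and example values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def domain_gate(domain_index, gate_weights):
    """softmax(MLP(g)) with the MLP assumed to be one bias-free linear layer.

    For a one-hot domain label g, that layer returns row `domain_index`
    of `gate_weights`, one score per cross-view representation.
    """
    return softmax(gate_weights[domain_index])


def aggregate(cross_view_reprs, weights):
    """Weighted sum of the cross-view representations (assumed form of S73)."""
    dim = len(cross_view_reprs[0])
    return [
        sum(w * z[j] for w, z in zip(weights, cross_view_reprs))
        for j in range(dim)
    ]


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between true labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical gate table for two domains and two cross-view representations.
gate_weights = [[0.0, 0.0], [0.0, math.log(3.0)]]
weights = domain_gate(1, gate_weights)            # approximately [0.25, 0.75]
pooled = aggregate([[1.0, 0.0], [0.0, 1.0]], weights)
loss = binary_cross_entropy([1, 0], [0.5, 0.5])   # approximately ln 2
```

Because the gate depends only on the domain label, news items from different domains can weight the same cross-view representations differently.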
6. The multi-domain false news detection method based on multi-view collaboration according to claim 3, wherein the specific steps of constructing the mixture-of-experts network in step (b) are as follows:
(b1) performing BERT tokenization and LSTM sequence information extraction on the news text to obtain the semantic feature representation $r_{sem}$ of the news text;
(b2) inputting emotion features into the emotion network, wherein the emotion features comprise emotion categories, emotion lexicons, emotion intensities, emotion scores, and auxiliary features, and extracting the emotion features using the mixture-of-experts network to obtain the emotion feature representation $r_{emo}$ of the news text;
(b3) inputting style features into the style network, wherein the style features cover eight aspects: readability, logic, credibility, normalization, interactivity, interestingness, emotion, and completeness, and extracting the style features using the mixture-of-experts network to obtain the style feature representation $r_{stl}$ of the news text;
(b4) using a domain-specific feature extraction network for each domain, and extracting domain features using TextCNN to obtain the domain-specific feature representation $r_{domain}$ of the news text;
(b5) combining the three sets of feature representations from different sources, $r_{sem}$, $r_{emo}$, and $r_{stl}$, with the domain-specific feature representation $r_{domain}$ to obtain a multidimensional feature representation, which is used to comprehensively describe the content and characteristics of the news text.
CN202310515854.2A 2023-05-09 2023-05-09 Multi-field false news detection method based on multi-view collaboration Pending CN116701569A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310515854.2A CN116701569A (en) 2023-05-09 2023-05-09 Multi-field false news detection method based on multi-view collaboration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310515854.2A CN116701569A (en) 2023-05-09 2023-05-09 Multi-field false news detection method based on multi-view collaboration

Publications (1)

Publication Number Publication Date
CN116701569A true CN116701569A (en) 2023-09-05

Family

ID=87838304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310515854.2A Pending CN116701569A (en) 2023-05-09 2023-05-09 Multi-field false news detection method based on multi-view collaboration

Country Status (1)

Country Link
CN (1) CN116701569A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117972497A (en) * 2024-04-01 2024-05-03 中国传媒大学 False information detection method and system based on multi-view feature decomposition
CN117972497B (en) * 2024-04-01 2024-06-18 中国传媒大学 False information detection method and system based on multi-view feature decomposition

Similar Documents

Publication Publication Date Title
Hu et al. Creating something from nothing: Unsupervised knowledge distillation for cross-modal hashing
Huang et al. Speech emotion recognition from variable-length inputs with triplet loss function.
CN110990564B (en) Negative news identification method based on emotion calculation and multi-head attention mechanism
Tang et al. CTFN: Hierarchical learning for multimodal sentiment analysis using coupled-translation fusion network
Guo et al. LD-MAN: Layout-driven multimodal attention network for online news sentiment recognition
CN105279495A (en) Video description method based on deep learning and text summarization
Wang et al. Deep cascaded cross-modal correlation learning for fine-grained sketch-based image retrieval
Cai et al. Intelligent question answering in restricted domains using deep learning and question pair matching
CN110532379A (en) A kind of electronics information recommended method of the user comment sentiment analysis based on LSTM
CN107145514B (en) Chinese sentence pattern classification method based on decision tree and SVM mixed model
CN110287314B (en) Long text reliability assessment method and system based on unsupervised clustering
CN117371456B (en) Multi-mode irony detection method and system based on feature fusion
CN111563373A (en) Attribute-level emotion classification method for focused attribute-related text
Sun et al. Transformer based multi-grained attention network for aspect-based sentiment analysis
CN110297986A (en) A kind of Sentiment orientation analysis method of hot microblog topic
CN113239159A (en) Cross-modal retrieval method of videos and texts based on relational inference network
CN116701569A (en) Multi-field false news detection method based on multi-view collaboration
Nadeem et al. SSM: Stylometric and semantic similarity oriented multimodal fake news detection
CN115309860A (en) False news detection method based on pseudo twin network
Luo et al. Bi-vldoc: Bidirectional vision-language modeling for visually-rich document understanding
Guo A mutual attention based multimodal fusion for fake news detection on social network
Liu et al. A multimodal approach for multiple-relation extraction in videos
He et al. VGSG: Vision-Guided Semantic-Group Network for Text-Based Person Search
Chen et al. Identifying Cantonese rumors with discriminative feature integration in online social networks
Wu et al. Audio-visual kinship verification: a new dataset and a unified adaptive adversarial multimodal learning approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination