CN112650851A - False news identification system and method based on multilevel interactive evidence generation - Google Patents


Info

Publication number
CN112650851A
CN112650851A (application CN202011587811.8A)
Authority
CN
China
Prior art keywords
sequence
news
attention
false
generation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011587811.8A
Other languages
Chinese (zh)
Other versions
CN112650851B (en)
Inventor
饶元
吴连伟
孙菱
郝哲
贺王卜
兰玉乾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202011587811.8A priority Critical patent/CN112650851B/en
Publication of CN112650851A publication Critical patent/CN112650851A/en
Application granted granted Critical
Publication of CN112650851B publication Critical patent/CN112650851B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/355Class or cluster creation or modification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a false news identification system and method based on multilevel interactive evidence generation, in which two progressive encoding-decoding levels are designed to generate the truth behind false news as an explanation of the verification result. The inference generation of the invention uses local inference to promote deep understanding of the false parts of the news and of the conflicts between the news and related articles, focusing on how to reveal the real falsehood behind the false news. The invention is detachable: its three generation modules can be decoupled and trained separately, giving the model both generalization capability and stage-wise task training capability. Experiments on two published, widely used fake news data sets show that the invention achieves better performance than previous state-of-the-art methods.

Description

False news identification system and method based on multilevel interactive evidence generation
Technical Field
The invention relates to an interpretable false news identification system and method based on multi-level interactive refined evidence generation.
Background
Currently, social media has become an indispensable part of people's lives: people can freely express themselves, acquire knowledge and interact on it. Thanks to the convenience of speech and the low cost of publishing information, social networks not only bring "collective intelligence" but also cause the diffusion and flooding of a large amount of false or unverified information. Especially during major emergencies, false information spreads easily, disturbing the order of daily life and causing social panic. The abuse of fake news seriously affects people's lives, social stability and national security. How to quickly identify the credibility of information in a social network, and make the identification result interpretable for users, has become one of the major problems facing academia and industry.
The application of data mining and machine learning has driven the development of fake news identification research. The classical approach extracts text features from the content of fake news (such as N-gram features and bag-of-words features) and identifies the authenticity of information using supervised learning algorithms (such as random forests and support vector machines). NLP researchers have also focused on deeper linguistic features, such as mining factual/assertive verbs, subjective words, and writing styles. Although these methods achieve some false news detection performance, they have difficulty providing users with a reasonable interpretation of the detection results. To overcome these drawbacks, recent research tends to explore interpretable false news detection methods, which mainly explain the false parts of fake news by developing interaction models that capture evidence segments from reliable sources, often focusing on word-level salient evidence semantics and sentence-level consistency semantics. However, although these interaction models reflect some degree of interpretability, the word-level and sentence-level evidence they capture may simply be conflicts between the news and related articles, which are difficult to interpret as the truth behind the fake news. In other words, current interaction models capture a variety of coarse-grained conflicts in related articles, and the truth behind the false news may need to be continuously refined from these conflicts.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a false news identification system and method based on multi-level interactive evidence generation. The invention not only improves the identification performance of the fake news, but also provides reasonable and transparent interpretable evidence for the identification result.
In order to achieve the purpose, the invention adopts the following technical scheme to realize the purpose:
the false news identification method based on multilevel interactive evidence generation comprises the following steps:
step 1, taking a news sequence C and a related article sequence R as input characteristics;
step 2: for any news sequence C and related article sequence R, learning the dependency relationship between any two words and the structural characteristics in the sequences by adopting a self-attention network as an encoder of a conflict generator and a false part generator;
Step 3: linearly projecting the query, key and value of the news sequence C or the related article sequence R h times by means of different linear projections, and then executing scaled dot-product attention in parallel; the attention results are concatenated and projected to obtain a new representation, as follows:

head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V) (1)

H = MultiHead(Q, K, V) = Concat(head_1, head_2, …, head_h)·W^O (2)

wherein W_i^Q, W_i^K, W_i^V ∈ R^{d×(d/h)} and W^O ∈ R^{d×d} are trainable parameters; H^C and H^R are the two outputs of the false part generator module; H^{r_1}, H^{r_i} and H^{r_|R|} are the outputs of the conflict generator for the first, i-th and last related articles;
Step 4: the cross-attention network, composed of self-attention networks, lets the outputs of the encoders of the conflict generator and the false part generator interact with each other as the input of the decoder, as follows:

H_claim = attention(Q, K, V) = attention(H^R, H^C, H^C) (4)

H_allRA = attention(Q, K, V) = attention(H^C, H^R, H^R) (5)

wherein H_claim and H_allRA represent the outputs of the cross-attention layer for the news and for the related articles, respectively;
Step 6: using linear interpolation as the fusion function to obtain:

O = λ·H_claim + (1−λ)·H_allRA (6)

wherein λ is a hyper-parameter controlling how much information from the other task should be absorbed, 0 < λ < 1;
Step 7: applying a feed-forward network to the fused result; the feed-forward network adds nonlinear and scale-invariant features and comprises a hidden layer with ReLU:

O^F = ReLU(O·W_1 + b_1)·W_2 + b_2 (7)

wherein W_1, W_2, b_1 and b_2 are trainable parameters, and O^F is the long-context attention representation of the decoder;
Step 8: acquiring the word probabilities of the generation process with a softmax layer; the log-likelihood estimate of the correspondingly generated false part sequence Y^F = {y_1, y_2, …, y_T} is expressed as:

L(Y^F) = Σ_{t=1}^{T} log P(y_t | C, y_1, y_2, …, y_{t−1}; θ) (8)

Step 9: the false part generator module predicts the word y_t based on the contextual attention representation O^F of the feed-forward network:

P(y_t | C, y_1, y_2, …, y_{t−1}; θ) = P(y_t | O^F; θ) = softmax(W_s·O^F) (9)

wherein W_s is a trainable parameter;
Step 10: in the cross-attention layer, H'^{r_i} = attention(H^{r_i}, H^R, H^R) represents the interaction between all related articles and the i-th article;

in the fusion layer, the interactions of all related articles are fused, namely:

O = λ_1·H'^{r_1} + λ_2·H'^{r_2} + … + λ_n·H'^{r_n}

wherein λ_1 + λ_2 + … + λ_n = 1;

in the feed-forward network layer, the output of the conflict generation module is the conflict sequence O^C, and the sequence generated by the conflict generation module is Y^C;
Step 11, capturing the generated sequence Y by using a local reasoning unitFAnd YCAnd incorporate it into a Y-based basisCY of (A) isFIn the new representation of (a);
firstly, a common attention moment array is calculated
Figure BDA0002866353190000044
To capture the correlation between two sequences, each element E in a common attention matrixi,jRepresents YFSequence ith word and YCThe correlation between the jth word of the sequence; the common attention matrix is:
Figure BDA0002866353190000045
wherein W and P represent trainable parameters, an element dot-product operation;
for YFY of (A) isCDirected attention vector:
Figure BDA0002866353190000046
Figure BDA0002866353190000051
fusing original vectors using absolute differences and element dot multiplication
Figure BDA0002866353190000052
And
Figure BDA0002866353190000053
Figure BDA0002866353190000054
Figure BDA0002866353190000055
to obtain a catalyst containing YFBy YCNew representation of reasoning information for guidance:
Figure BDA0002866353190000056
Figure BDA0002866353190000057
where LayerNorm (-) is layer regularization, the result is
Figure BDA0002866353190000058
Is a 2-dimensional and YFSimilar shaped tensions;
Step 11: obtaining the generated inference sequence Y^E through the above generation process; the inference sequence can explain why the fake news is wrong, because the generated inference sequence reasons out the false part of the news and the corresponding evidence;
Step 12: integrating the three sequences in different proportions to absorb their context representations, obtaining the integrated feature F:

F = e(Y^E) + γ_1·e(Y^F) + γ_2·e(Y^C) (17)

wherein e(·) is the representation of a word sequence, and γ_1 and γ_2 are hyper-parameters;
Step 13: based on the integrated feature F, a multi-layer perceptron (MLP) classifier is used to predict the distributed label; task learning adopts the probability distribution of a softmax function, and the real training sample label y is used to minimize the error of the model under the global loss function:

v = ReLU(W_f·F + b_f) (18)

p = softmax(W_p·v + b_p) (19)

loss = −Σ y·log p (20)

wherein W_p, W_f, b_f and b_p are trainable parameters.
A false news identification system based on multi-level interactive evidence generation, comprising:
an encoding module for capturing context representations from the input sequences of the generative models, and for learning and encoding the dependencies within the input sequences and their internal structural features;
an interactive learning decoding module for exploring the parts of the fake news where errors may occur and the conflict semantics existing between related articles;
an interpretable evidence generation module for generating an inference sequence as an explanation of why the fake news is wrong;
and a task learning module for integrating the three generated sequences to enhance false news identification performance.
A false news identification terminal device based on multi-level interactive evidence generation comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor executes the computer program to realize the steps of the method.
A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method as described above.
Compared with the prior art, the invention has the following beneficial effects:
the invention discloses a virtual and fake news identification system and method with interpretability and based on multi-level interactive refined evidence generation.
The invention designs two progressive encoding-decoding levels to generate the truth behind false news as an explanation of the verification result. The inference generation of the invention uses local inference to promote deep understanding of the false parts of the news and of the conflicts between the news and related articles, focusing on how to reveal the real falsehood behind the false news. The method is detachable: its three generation modules can be decoupled and trained separately, giving the model generalization capability and stage-wise task training capability. Experiments on two published, widely used fake news data sets show that the invention achieves better performance than previous state-of-the-art methods.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings may be obtained according to these drawings without inventive effort.
FIG. 1 is an architectural diagram of the present invention;
FIG. 2 is a graph of the experimental performance of the invention on the Snopes and PolitiFact data sets;
FIG. 3 is a graph of the ablation performance of the invention's module components on the Snopes and PolitiFact data sets.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the embodiments of the present invention, it should be noted that if the terms "upper", "lower", "horizontal", "inner", etc. are used for indicating the orientation or positional relationship based on the orientation or positional relationship shown in the drawings or the orientation or positional relationship which is usually arranged when the product of the present invention is used, the description is merely for convenience and simplicity, and the indication or suggestion that the referred device or element must have a specific orientation, be constructed and operated in a specific orientation, and thus, cannot be construed as limiting the present invention. Furthermore, the terms "first," "second," and the like are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.
Furthermore, the term "horizontal", if present, does not mean that the component is required to be absolutely horizontal, but may be slightly inclined. For example, "horizontal" merely means that the direction is more horizontal than "vertical" and does not mean that the structure must be perfectly horizontal, but may be slightly inclined.
In the description of the embodiments of the present invention, it should be further noted that unless otherwise explicitly stated or limited, the terms "disposed," "mounted," "connected," and "connected" should be interpreted broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be connected internally or indirectly. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art according to specific situations.
The invention is described in further detail below with reference to the accompanying drawings:
referring to fig. 1, an embodiment of the present invention discloses a false news recognition system based on multi-level interactive evidence generation, including:
and the coding module takes fake news and a series of related articles as input of the generative model, and learns and codes the dependency and internal structural features between input sequences by adopting a self-attention model in order to capture context representation from the input sequences of the generative model. In particular, the first two generative models of the present invention have the same self-attention network as the encoder structure.
The interactive learning decoding module develops an interactive learning model that lets the news interact with the related articles and lets the related articles interact with one another, so as to explore, respectively, the parts of the fake news where errors may occur and the conflict semantics existing among the related articles.
The interpretable evidence generation module adds a local inference network on top of a conventional decoder, so that the erroneous parts and conflict semantics of the fake news obtained by the interactive learning decoding module undergo a global reasoning process, generating a refined inference sequence that serves as an explanation of why the fake news is wrong.
The task learning module integrates the three generated sequences by linear combination to enhance false news identification performance.
The embodiment of the invention discloses a false news identification method based on multi-level interactive evidence generation, which comprises the following steps:
Stage 0: data initialization
Step 0: given a news sequence C = {c_1, c_2, …, c_|C|}, where c_i denotes the embedding of the i-th word, and a series of related article sequences R = <r_1; r_2; …; r_|R|>, where r_i denotes the i-th related article, ';' denotes the concatenation operation, and r_i^k denotes the embedded representation of the k-th word in the i-th related article; in addition, |C|, |R| and |r_i| represent the word length of the news sequence, the number of related articles, and the word length of the i-th related article, respectively; y represents the true-or-false binary label.
Stage 1: construction of the encoder
Step 1: taking the news sequence and the related article sequence as input characteristics of the model;
Step 2: for the context representation of the model input features, the invention adopts a self-attention network as the encoder of the two generators to implicitly learn the dependency between any two words and the structural features inside the sequence. Taking the false part generator as an example, the encoder can be expressed as:

Attention(Q, K, V) = softmax(Q·K^T / √d)·V (1)

where Q, K and V are the query matrix, the key matrix and the value matrix, respectively, and d is the dimension of the key matrix. In the configuration of this embodiment, Q = K = V = C for the news-sequence module and Q = K = V = R for all related-article sequence modules; in the encoder of the conflict generator, Q = K = V = r_i for encoding the i-th related article.
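The scaled dot-product self-attention used by the encoders can be sketched in numpy as follows; the toy sequence length and dimension are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V.
    d = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ V

# Toy 4-word news sequence C with d = 8; in the false part generator's
# encoder, Q = K = V = C.
rng = np.random.default_rng(0)
C = rng.standard_normal((4, 8))
H = attention(C, C, C)  # one self-attended representation per word
```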
Step 3: to enhance the parallelism of self-attention and improve the efficiency of the model, multi-head attention first linearly projects the query, key and value h times by means of different linear projections, and then performs scaled dot-product attention in parallel. Finally, the attention results are concatenated and projected to obtain a new representation. The process can be formulated as:

head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V) (2)

H = MultiHead(Q, K, V) = Concat(head_1, head_2, …, head_h)·W^O (3)

where W_i^Q, W_i^K, W_i^V ∈ R^{d×(d/h)} and W^O ∈ R^{d×d} are trainable parameters. In particular, H^C and H^R are the two outputs of the false part generator module; H^{r_1}, H^{r_i} and H^{r_|R|} are the outputs of the conflict generator for the first, i-th and last related articles.
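The multi-head variant of step 3 can be sketched as follows; the head count and dimensions are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head(Q, K, V, Wq, Wk, Wv, Wo):
    # Project Q, K, V once per head, run scaled dot-product attention
    # in parallel, then concatenate the heads and project with Wo.
    heads = []
    for Wqi, Wki, Wvi in zip(Wq, Wk, Wv):
        q, k, v = Q @ Wqi, K @ Wki, V @ Wvi
        heads.append(softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1) @ v)
    return np.concatenate(heads, axis=-1) @ Wo

rng = np.random.default_rng(1)
n, d, h = 5, 8, 2
X = rng.standard_normal((n, d))           # toy encoder input (Q = K = V)
Wq = rng.standard_normal((h, d, d // h))  # per-head query projections
Wk = rng.standard_normal((h, d, d // h))  # per-head key projections
Wv = rng.standard_normal((h, d, d // h))  # per-head value projections
Wo = rng.standard_normal((d, d))          # output projection
HC = multi_head(X, X, X, Wq, Wk, Wv, Wo)
```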
Stage 2: construction of the interactive learning decoder
Step 4: in order to explore the possibly wrong parts of the news to be verified, an interactive learning decoder is designed to let the news interact with the related articles. The interaction module involves three layers: a cross-attention layer, a fusion layer, and a feed-forward network layer.
Step 5: in order to make the interaction between the news to be verified and the related articles more sufficient, a cross-attention network composed of self-attention networks lets the outputs of the two encoders interact with each other as the input of the decoder. The interaction process can be described as follows:

H_claim = attention(Q, K, V) = attention(H^R, H^C, H^C) (4)

H_allRA = attention(Q, K, V) = attention(H^C, H^R, H^R) (5)

where H_claim and H_allRA denote the outputs of the cross-attention layer for the news and for the related articles, respectively.
Step 6: to fuse the news into the related articles and, more importantly, to absorb the high-level representations of the news semantics, linear interpolation is used as the fusion function, which can be calculated as:

O = λ·H_claim + (1−λ)·H_allRA (6)

where λ (0 < λ < 1) is a hyper-parameter controlling how much information from the other task should be absorbed.
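Steps 5 and 6 can be sketched together in numpy. This is a minimal illustration under stated assumptions: the two toy encoder outputs are given the same length so the linear interpolation is shape-aligned, and the λ value is arbitrary:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention, reused as the cross-attention layer.
    return softmax(Q @ K.T / np.sqrt(K.shape[-1]), axis=-1) @ V

rng = np.random.default_rng(2)
n, d = 4, 8
HC = rng.standard_normal((n, d))  # encoder output for the news sequence
HR = rng.standard_normal((n, d))  # encoder output for the related articles

# The two encoder outputs attend to each other.
H_claim = attention(HR, HC, HC)   # related articles query the news
H_allRA = attention(HC, HR, HR)   # news queries the related articles

# Linear interpolation as the fusion function.
lam = 0.6                         # hyper-parameter, 0 < lam < 1 (illustrative)
O = lam * H_claim + (1.0 - lam) * H_allRA
```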
Step 7: next, a feed-forward network is applied to the fused result; it adds nonlinear and scale-invariant features and includes a hidden layer with ReLU:

O^F = ReLU(O·W_1 + b_1)·W_2 + b_2 (7)

where W_1, W_2, b_1 and b_2 are trainable parameters, and O^F is the long-context attention representation of the decoder.
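The feed-forward network of step 7, sketched with illustrative dimensions:

```python
import numpy as np

def feed_forward(O, W1, b1, W2, b2):
    # ReLU hidden layer followed by a linear projection back to d.
    return np.maximum(O @ W1 + b1, 0.0) @ W2 + b2

rng = np.random.default_rng(3)
n, d, d_ff = 4, 8, 16                   # toy sizes (assumptions)
O = rng.standard_normal((n, d))         # fused decoder input
W1 = rng.standard_normal((d, d_ff)); b1 = np.zeros(d_ff)
W2 = rng.standard_normal((d_ff, d)); b2 = np.zeros(d)
OF = feed_forward(O, W1, b1, W2, b2)    # long-context attention representation
```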
Step 8: finally, the word probabilities of the generation process are obtained with the softmax layer. Formally, the log-likelihood estimate of the correspondingly generated false part sequence Y^F = {y_1, y_2, …, y_T} can be expressed as:

L(Y^F) = Σ_{t=1}^{T} log P(y_t | C, y_1, y_2, …, y_{t−1}; θ) (8)
and step 9: the error portion generation module generates a context representation O based on a feed-forward networkFPredicting word ytCan be expressed as:
P(yt|C,y1,y2,…,yt-1;θ)=P(yt-1|OF;θ)=softmax(WsOF) (9)
wherein, WsAre trainable parameters.
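Steps 8 and 9 can be sketched as a single decoding step; the vocabulary size and the greedy word choice are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(4)
d, vocab = 8, 20                      # toy sizes (assumptions)
OF = rng.standard_normal(d)           # decoder representation at step t
Ws = rng.standard_normal((vocab, d))  # trainable projection to the vocabulary

# P(y_t | .) = softmax(Ws OF): a distribution over the vocabulary.
p = softmax(Ws @ OF)

# The generation log-likelihood accumulates log P(y_t | .) over the
# emitted words; here, a single greedy step.
y_t = int(p.argmax())
log_likelihood_step = float(np.log(p[y_t]))
```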
Step 10: in particular, the decoders of the conflict generation module and of the false part generation module are similar: both are interactive learning decoders that allow all related articles to interact with each related article, thereby capturing suspicious or conflicting semantics from the related articles. In the cross-attention layer, H'^{r_i} = attention(H^{r_i}, H^R, H^R) represents the interaction between all related articles and the i-th article. In the fusion layer, the interactions of all related articles are fused, i.e.

O = λ_1·H'^{r_1} + λ_2·H'^{r_2} + … + λ_n·H'^{r_n}

where λ_1 + λ_2 + … + λ_n = 1. In the feed-forward network layer, the output of the conflict generation module is the conflict sequence O^C, and the sequence generated by the module is Y^C.
Stage 3: generation of interpretable evidence
Step 11: in order to find the truth behind the fake news, this embodiment performs inference generation by means of a local inference unit, thereby realizing a general reasoning process. The local inference unit captures the local inference information between the generated sequences Y^F and Y^C and incorporates it into a new representation of Y^F guided by Y^C. Specifically, a co-attention matrix E ∈ R^{|Y^F|×|Y^C|} is first calculated to capture the correlation between the two sequences; each element E_{i,j} of the co-attention matrix represents the correlation between the i-th word of sequence Y^F and the j-th word of sequence Y^C. Formally, the co-attention matrix can be calculated as:

E = (Y^F·W)·(Y^C·P)^T (10)

where W and P are trainable parameters and ⊙ denotes the element-wise product.
Step 12: the Y^C-directed attention vectors for Y^F are obtained as:

A = softmax(E), applied row-wise (11)

Ỹ^C = A·Y^C (12)
step 13: to more fully integrate YFAnd YCFusing the original vector by absolute difference and element dot multiplication
Figure BDA0002866353190000128
And
Figure BDA0002866353190000129
Figure BDA00028663531900001210
Figure BDA00028663531900001211
step 14: obtain a catalyst containing YFBy YCNew representation of reasoning information for guidance:
Figure BDA0002866353190000131
Figure BDA0002866353190000132
where LayerNorm (-) is layer regularization, the result is
Figure BDA0002866353190000133
Is a 2-dimensional and YFSimilar shaped tensions.
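The local inference unit of steps 11 to 14 can be sketched as follows; since the exact parameterization is not fully specified here, the bilinear co-attention scoring and the output projection (W, P, Wr, br) are assumptions in the spirit of ESIM-style local inference:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-6):
    # Per-row layer normalization (no learned gain/bias for brevity).
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def local_inference(YF, YC, W, P, Wr, br):
    # Co-attention matrix: E[i, j] scores the relevance of the i-th
    # word of YF to the j-th word of YC.
    E = (YF @ W) @ (YC @ P).T
    YC_tilde = softmax(E, axis=-1) @ YC   # YC-directed attention for YF
    diff = np.abs(YF - YC_tilde)          # absolute difference
    prod = YF * YC_tilde                  # element-wise product
    m = np.concatenate([YF, YC_tilde, diff, prod], axis=-1)
    return layer_norm(m @ Wr + br)        # new YF representation guided by YC

rng = np.random.default_rng(5)
d = 8
YF = rng.standard_normal((5, d))          # generated false part sequence
YC = rng.standard_normal((7, d))          # generated conflict sequence
W = rng.standard_normal((d, d))
P = rng.standard_normal((d, d))
Wr = rng.standard_normal((4 * d, d)); br = np.zeros(d)
YF_hat = local_inference(YF, YC, W, P, Wr, br)
```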
Step 15: the generated inference sequence Y^E is obtained through the generation process (step 8 and step 9). The inference sequence can explain why the fake news is wrong, because it reasons out the false part of the news and the corresponding evidence.
Stage 4: task learning
Step 16: in order to make full use of the three generated sequences to improve false news identification performance, the three sequences are integrated in different proportions to absorb their context representations:

F = e(Y^E) + γ_1·e(Y^F) + γ_2·e(Y^C) (17)

where e(·) is the representation of a word sequence, and γ_1 and γ_2 are hyper-parameters.
Step 17: based on the integrated feature F, a multi-layer perceptron (MLP) classifier is used to predict the distributed label; task learning adopts the probability distribution predicted by the softmax function, and the real training sample label y is used to minimize the error of the model under the global loss function:

v = ReLU(W_f·F + b_f) (18)

p = softmax(W_p·v + b_p) (19)

loss = −Σ y·log p (20)

where W_p, W_f, b_f and b_p are trainable parameters.
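Steps 16 and 17 can be sketched end to end; the hyper-parameter values, dimensions and one-hot label are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(6)
d = 8
# Integrate the three generated-sequence representations:
# F = e(YE) + g1 * e(YF) + g2 * e(YC).
eYE, eYF, eYC = rng.standard_normal((3, d))
g1, g2 = 0.5, 0.5                        # hyper-parameters (illustrative)
F = eYE + g1 * eYF + g2 * eYC

# MLP classifier with softmax output and cross-entropy loss.
Wf = rng.standard_normal((16, d)); bf = np.zeros(16)
Wp = rng.standard_normal((2, 16)); bp = np.zeros(2)
v = np.maximum(Wf @ F + bf, 0.0)         # ReLU hidden layer
p = softmax(Wp @ v + bp)                 # predicted label distribution
y = np.array([1.0, 0.0])                 # one-hot true-or-false label
loss = -np.sum(y * np.log(np.clip(p, 1e-12, 1.0)))  # cross-entropy
```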
The device provided by the embodiment of the invention comprises: a processor, a memory, and a computer program stored in the memory and executable on the processor. The processor realizes the steps of the above method embodiments when executing the computer program; alternatively, the processor realizes the functions of each module/unit of the above device embodiments when executing the computer program.
The computer program may be partitioned into one or more modules/units that are stored in the memory and executed by the processor to implement the invention.
The terminal equipment can be computing equipment such as a desktop computer, a notebook computer, a palm computer and a cloud server. The terminal device may include, but is not limited to, a processor, a memory.
The processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
The memory can be used for storing the computer program and/or the module, and the processor can realize various functions of the terminal equipment by running or executing the computer program and/or the module stored in the memory and calling data stored in the memory.
If the integrated modules/units of the terminal device are implemented in the form of software functional units and sold or used as separate products, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments are implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, etc. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased in accordance with the requirements of legislative and patent practice in a jurisdiction; for example, in some jurisdictions, in accordance with legislative and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and it is obvious to those skilled in the art that various modifications and variations can be made in the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (4)

1. The false news identification method based on multilevel interactive evidence generation is characterized by comprising the following steps:
step 1: take a news sequence C and a related article sequence R as input features;
step 2: for any news sequence C and related article sequence R, adopt a self-attention network as the encoder of the conflict generator and the false part generator to learn the dependency between any two words and the structural features within the sequences;
step 3: linearly project the query, key and value of the news sequence C or the related article sequence R h times by means of different linear projections, and then perform scaled dot-product attention in parallel; the attention results are concatenated and projected to obtain a new representation as follows:
head_i = Attention(QW_i^Q, KW_i^K, VW_i^V) (1)
H = MultiHead(Q, K, V) = Concat(head_1, head_2, …, head_h)W^O (2)
wherein W_i^Q, W_i^K, W_i^V and W^O are trainable parameters; H_C and H_R are the two outputs of the false part generator module; H_R^1, …, H_R^i, …, H_R^n (3) are the outputs of the conflict generator for the first, i-th, and last related articles;
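The multi-head self-attention encoder of step 3 (equations (1)–(2)) can be sketched in NumPy as follows. The toy dimensions (5 words, model size 8, 2 heads) and the list-of-projections interface are illustrative assumptions, not from the patent.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    return weights @ V

def multi_head(Q, K, V, proj_q, proj_k, proj_v, W_o):
    # Eq. (1)-(2): project h times, attend in parallel, concatenate, project
    heads = [scaled_dot_product_attention(Q @ Wq, K @ Wk, V @ Wv)
             for Wq, Wk, Wv in zip(proj_q, proj_k, proj_v)]
    return np.concatenate(heads, axis=-1) @ W_o
```

Calling `multi_head(X, X, X, …)` with Q = K = V = X is exactly the self-attention case used by the encoders of both generators.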
step 4: the cross-attention network, built from the self-attention network, lets the encoder outputs of the conflict generator and the false part generator interact with each other as the decoder's input, as follows:
H_claim = attention(Q, K, V) = attention(H_R, H_C, H_C) (4)
H_allRA = attention(Q, K, V) = attention(H_C, H_R, H_R) (5)
wherein H_claim and H_allRA represent the outputs of the cross-attention layer for the news and for the related articles, respectively;
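The Q/K/V swap in equations (4)–(5) is the essence of the cross-attention interaction, and can be shown with a single-head NumPy sketch; the toy encodings and dimensions are illustrative assumptions, not from the patent.

```python
import numpy as np

def attention(Q, K, V):
    # single-head scaled dot-product attention
    d_k = Q.shape[-1]
    s = Q @ K.T / np.sqrt(d_k)
    s = s - s.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(s) / np.exp(s).sum(axis=-1, keepdims=True)
    return w @ V

rng = np.random.default_rng(0)
H_C = rng.normal(size=(6, 8))   # toy news encoding (6 words)
H_R = rng.normal(size=(9, 8))   # toy related-article encoding (9 words)

H_claim = attention(H_R, H_C, H_C)   # Eq. (4): news content addressed by article queries
H_allRA = attention(H_C, H_R, H_R)   # Eq. (5): article content addressed by news queries
```

Each output keeps the query side's length but summarizes the other sequence's content, which is how the two generators' encoders "interact with each other" before decoding.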
step 6: using linear interpolation as a fusion function, obtain:
O = λ·H_claim + (1 − λ)·H_allRA (6)
wherein λ is a hyper-parameter controlling how much information from the other task should be absorbed, 0 < λ < 1;
step 7: apply a feed-forward network to the fused result to add non-linear and scale-invariant characteristics; it contains a hidden layer with ReLU:
O_F = W_2·ReLU(W_1·O + b_1) + b_2 (7)
wherein W_1, W_2, b_1 and b_2 are trainable parameters, and O_F is the long-context attention representation of the decoder;
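Steps 6–7 (interpolation fusion followed by a ReLU feed-forward layer) can be sketched together in NumPy; which two tensors are interpolated, and the hidden size, are assumptions here, since the patent's formula images are not machine-readable.

```python
import numpy as np

def fuse_ffn(H_claim, H_allRA, W1, b1, W2, b2, lam=0.5):
    # Eq. (6): linear interpolation fusion with hyper-parameter lam (0 < lam < 1);
    # the operands of the interpolation are an assumption
    O = lam * H_claim + (1.0 - lam) * H_allRA
    # Eq. (7): feed-forward network with one ReLU hidden layer
    O_F = np.maximum(0.0, O @ W1 + b1) @ W2 + b2
    return O_F
```

With same-shaped inputs of shape (n, d) and a hidden width h, the output O_F keeps shape (n, d), so it can feed the softmax layer of step 8 directly.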
step 8: obtain the word probability of the generation process using a softmax layer; the log-likelihood estimate of the correspondingly generated false part sequence Y_F is expressed as:
log P(Y_F | C; θ) = Σ_t log P(y_t | C, y_1, y_2, …, y_{t−1}; θ) (8)
step 9: the predicted word y_t of the false part generator module, based on the contextual attention representation O_F of the feed-forward network, is expressed as:
P(y_t | C, y_1, y_2, …, y_{t−1}; θ) = P(y_t | O_F; θ) = softmax(W_s·O_F) (9)
wherein W_s is a trainable parameter;
step 10: in the cross-attention layer, H_i^RA represents the interaction of the news with the i-th related article; in the fusion layer, the interactions of all related articles are fused, namely:
O = λ_1·H_1^RA + λ_2·H_2^RA + … + λ_n·H_n^RA (10)
wherein λ_1 + λ_2 + … + λ_n = 1;
at the feed-forward network layer, the output of the conflict generation module is the conflict representation O_C, and the sequence generated by the conflict generator module is Y_C;
step 11: use a local inference unit to capture the semantic interaction between the generated sequences Y_F and Y_C, and incorporate it into a new representation of Y_F guided by Y_C;
first, compute a co-attention matrix E to capture the correlation between the two sequences, wherein each element E_{i,j} of the co-attention matrix represents the correlation between the i-th word of sequence Y_F and the j-th word of sequence Y_C; the co-attention matrix is:
E_{i,j} = (Y_i^F·W)·(Y_j^C·P)^T (11)
wherein W and P represent trainable parameters, and ⊙ is the element-wise dot-product operation;
the Y_C-directed attention vector for Y_F:
Ŷ_i^C = Σ_j softmax(E_{i,j})·Y_j^C (12)
the original vector Y_i^F and Ŷ_i^C are fused by absolute difference and element-wise dot product:
m_i^d = |Y_i^F − Ŷ_i^C| (13)
m_i^p = Y_i^F ⊙ Ŷ_i^C (14)
to obtain a new representation containing Y_F guided by the inference information of Y_C:
G_i = [Y_i^F; Ŷ_i^C; m_i^d; m_i^p] (15)
Ȳ^F = LayerNorm(W_g·G + b_g) (16)
wherein LayerNorm(·) is layer regularization, and the result Ȳ^F is a 2-dimensional tensor with a shape similar to Y_F;
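A hedged NumPy sketch of the local inference unit of step 11 (equations (11)–(16)) follows. The exact bilinear form of the co-attention matrix, the projection W_g, and all dimensions are assumptions, since the patent's formula images are not machine-readable; the sketch only illustrates the co-attend / absolute-difference / element-product / layer-normalize pipeline.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # normalize each row to zero mean and unit variance
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def local_inference(Y_F, Y_C, W, P, W_g):
    # Eq. (11): co-attention matrix; the bilinear form is an assumption
    E = (Y_F @ W) @ (Y_C @ P).T                  # shape (len(Y_F), len(Y_C))
    # Eq. (12): Y_C-directed attention summary for each word of Y_F
    A = np.exp(E - E.max(axis=1, keepdims=True))
    A = A / A.sum(axis=1, keepdims=True)         # row-wise softmax
    Y_hat = A @ Y_C
    # Eq. (13)-(14): fuse by absolute difference and element-wise product
    m_diff = np.abs(Y_F - Y_hat)
    m_prod = Y_F * Y_hat
    # Eq. (15)-(16): concatenate the fused features, project, layer-normalize
    G = np.concatenate([Y_F, Y_hat, m_diff, m_prod], axis=-1)
    return layer_norm(G @ W_g)
```

The output has the same shape as Y_F, i.e. a new representation of Y_F enriched with Y_C's inference information.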
step 11: obtain the generated inference sequence Y_E through the generation process; the inference sequence can explain why the news is false, because the generated inference sequence reasons out the false part of the news and the corresponding evidence;
step 12: integrate the three sequences in different proportions to absorb their context representations, obtaining the integrated feature F:
F = e(Y_E) + γ_1·e(Y_F) + γ_2·e(Y_C) (17)
wherein e(·) is the representation of a word sequence, and γ_1 and γ_2 are hyper-parameters;
step 13: based on the integrated feature F, a multi-layer perceptron (MLP) classifier is used to predict the label distribution; task learning adopts the probability distribution predicted by a softmax function, and the ground-truth training label y is used to minimize the global loss of the model:
v = ReLU(W_f·F + b_f) (18)
p = softmax(W_p·F + b_p) (19)
loss = −Σ y·log p (20)
wherein W_p, W_f, b_f and b_p are trainable parameters.
2. A false news recognition system based on multi-level interactive evidence generation, comprising:
an encoding module for capturing context representations from the input sequences of the generative model, learning and encoding the dependencies between the input sequences and their internal structural features;
an interactive learning decoding module for exploring the possibly false parts of the fake news and the conflicting semantics existing among related articles;
an interpretable evidence generation module for generating an inference sequence as an explanation of why the news is false;
and a task learning module for integrating the three generated sequences to enhance fake news identification performance.
3. False news recognition terminal device based on multi-level interactive evidence generation, comprising a memory, a processor and a computer program stored in said memory and executable on said processor, characterized in that said processor, when executing said computer program, implements the steps of the method according to claim 1.
4. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method as set forth in claim 1.
CN202011587811.8A 2020-12-28 2020-12-28 False news identification system and method based on multilevel interactive evidence generation Active CN112650851B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011587811.8A CN112650851B (en) 2020-12-28 2020-12-28 False news identification system and method based on multilevel interactive evidence generation


Publications (2)

Publication Number Publication Date
CN112650851A true CN112650851A (en) 2021-04-13
CN112650851B CN112650851B (en) 2023-04-07

Family

ID=75363650

Country Status (1)

Country Link
CN (1) CN112650851B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113849599A (en) * 2021-09-03 2021-12-28 北京中科睿鉴科技有限公司 Joint false news detection method based on mode information and fact information

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018237098A1 (en) * 2017-06-20 2018-12-27 Graphika, Inc. Methods and systems for identifying markers of coordinated activity in social media movements
WO2020061578A1 (en) * 2018-09-21 2020-03-26 Arizona Board Of Regents On Behalf Of Arizona State University Method and apparatus for collecting, detecting and visualizing fake news
CN111177554A (en) * 2019-12-27 2020-05-19 西安交通大学 False news identification system and method capable of explaining exploration based on generation of confrontation learning
CN111581980A (en) * 2020-05-06 2020-08-25 西安交通大学 False news detection system and method based on decision tree and common attention cooperation
CN111581979A (en) * 2020-05-06 2020-08-25 西安交通大学 False news detection system and method based on evidence perception layered interactive attention network
CN112035759A (en) * 2020-09-02 2020-12-04 胡煜昊 False news detection method for English news media reports


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WU LIANWEI: "A Multi-semantics Classification Method Based on Deep Learning for Incredible Messages on Social Media", 《CHINESE JOURNAL OF ELECTRONICS》 *
HE HANSEN et al.: "Fake news content detection model based on feature aggregation", Journal of Computer Applications (《计算机应用》) *


Also Published As

Publication number Publication date
CN112650851B (en) 2023-04-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant