CN115525781A - Multi-mode false information detection method, device and equipment - Google Patents

Info

Publication number
CN115525781A
Authority
CN
China
Prior art keywords
information
entity
event
entities
event information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211388128.0A
Other languages
Chinese (zh)
Inventor
郎公福
郭子瑜
肖保臣
马玉辉
胡怀迪
张翠翠
李学伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu Aerospace Information Research Institute
Aerospace Information Research Institute of CAS
Original Assignee
Qilu Aerospace Information Research Institute
Aerospace Information Research Institute of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu Aerospace Information Research Institute and Aerospace Information Research Institute of CAS
Priority to CN202211388128.0A
Publication of CN115525781A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/45 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Library & Information Science (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a multi-modal false information detection method, apparatus and device, which can be applied in the technical field of information detection. The method comprises: acquiring a plurality of event information about the same subject entry; generating a heterogeneous graph corresponding to the event information according to external knowledge information and the event information; extracting features from the entities of the heterogeneous graph to obtain entity features of the entities of the heterogeneous graph; processing the heterogeneous graph and the entity features to obtain multi-modal features of the event information and an event false judgment result; generating a time feature sequence based on the plurality of multi-modal features and the plurality of event false judgment results; and obtaining a false judgment result about the subject entry according to the time feature sequence.

Description

Multi-mode false information detection method, device and equipment
Technical Field
The present disclosure relates to the field of information detection technologies, and in particular, to a method, an apparatus, and a device for detecting multimodal false information.
Background
In recent years, with the rapid development of the internet, it has become more convenient for people to obtain news event information. At the same time, false news event information spreads rapidly through social platforms and has a strongly negative effect on public opinion. With the popularity of mobile devices, news event information tends to spread rapidly in multi-modal form. While this makes it easier for readers to obtain news event information, techniques that analyze false news event information by relying on text alone are no longer adequate.
Moreover, traditional false news event information detection methods can only detect a single piece of news event information. Multiple pieces of news event information under the same subject entry are correlated with and influence one another, and when they are used to detect whether the subject entry is false, they contribute to the false judgment of the subject entry to different degrees, which makes joint modeling difficult.
Disclosure of Invention
In view of the above, the present disclosure provides a method, an apparatus, and a device for multi-modal false information detection.
One aspect of the present disclosure provides a multimodal false information detection method, including:
acquiring a plurality of event information about the same subject entry;
generating a heterogeneous graph corresponding to the event information according to external knowledge information and the event information;
extracting features from the entity of the heterogeneous graph to obtain the entity features of the entity of the heterogeneous graph;
processing the heterogeneous graph and the entity features to obtain multi-modal features of the event information and an event false judgment result;
generating a time feature sequence based on the plurality of multi-modal features and the plurality of event false judgment results, wherein the time feature sequence includes a plurality of local perception vectors, the plurality of local perception vectors are ordered according to the respective release times of the plurality of event information, and each local perception vector includes the spliced multi-modal features and event false judgment result corresponding to the event information;
and obtaining a false judgment result about the subject entry according to the time feature sequence.
According to an embodiment of the present disclosure, the generating a heterogeneous graph corresponding to the event information according to the external knowledge information and the event information includes:
obtaining multi-modal information of the event information, wherein the multi-modal information comprises at least two of the following items: text information in the event information, visual information in the event information, and social context information related to the event information;
extracting entities and relations of text information from the text information of the multi-modal information, entities and relations of visual information from the visual information, and entities and relations of social context information from the social context information by using a preset extraction rule;
obtaining entities and relations of external knowledge information related to the subject entry according to the external knowledge information;
and generating the heterogeneous graph corresponding to the event information according to the entity and the relationship of the external knowledge information, the entity and the relationship of the text information, the entity and the relationship of the visual information and the entity and the relationship of the social context information.
According to an embodiment of the present disclosure, the extracting, by using a preset extraction rule, entities and relationships of text information from the text information of the multimodal information, entities and relationships of visual information from the visual information, and entities and relationships of social context information from the social context information includes:
extracting entities of the text information from the text information and entities of the social context information from the social context information by utilizing a pre-trained language model;
extracting entities of the visual information from the visual information by using a residual neural network model;
extracting the relations of the text information between the entities of the text information from the text information by using a segmented convolutional neural network model according to the entities of the text information and the text information;
extracting the relation of the social context information among the entities of the social context information from the social context information by utilizing a segmented convolutional neural network model according to the entities of the social context information and the social context information;
and extracting the relationship of the visual information among the entities of the visual information from the visual information by utilizing a segmented convolutional neural network model according to the entities of the visual information and the visual information.
According to an embodiment of the present disclosure, the obtaining of the entity and the relationship of the external knowledge information related to the subject term according to the external knowledge information includes:
constructing a plurality of knowledge graphs of different types according to the external knowledge information, wherein the external knowledge information is acquired from open source information, the external knowledge information comprises information related to the subject term, and the knowledge graphs comprise at least two of the following: domain knowledge graphs, general knowledge graphs, physical knowledge graphs and spatio-temporal knowledge graphs;
and acquiring the entities and the relations of the external knowledge information from the knowledge graphs.
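As an illustration of the step above, the following minimal sketch collects the entities and relations that touch a subject term from several typed knowledge graphs. The triple representation, the function name `related_triples`, and the sample data are all assumptions made for this example; the disclosure does not specify how the knowledge graphs are stored or queried.

```python
def related_triples(knowledge_graphs, seed_entities):
    """Collect triples whose head or tail is one of the seed entities.

    knowledge_graphs maps a graph type (hypothetical labels) to a list of
    (head, relation, tail) triples.
    """
    seeds = set(seed_entities)
    found = []
    for graph_type, triples in knowledge_graphs.items():
        for head, relation, tail in triples:
            if head in seeds or tail in seeds:
                found.append((graph_type, head, relation, tail))
    return found


# Hypothetical sample data, for illustration only.
knowledge_graphs = {
    "general": [
        ("Star A", "occupation", "actor"),
        ("City X", "located_in", "Country Y"),
    ],
    "spatiotemporal": [
        ("Star A", "attended", "Event Z"),
    ],
}

triples = related_triples(knowledge_graphs, ["Star A"])
# Entities of the external knowledge information that relate to the subject term.
entities = sorted({name for _, head, _, tail in triples for name in (head, tail)})
```

A production system would more likely query a graph database and follow multi-hop relations; the single-hop filter here only shows the shape of the data that the later graph construction step consumes.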
According to an embodiment of the present disclosure, the generating the heterogeneous graph corresponding to the event information according to the entity and the relationship of the external knowledge information, the entity and the relationship of the text information, the entity and the relationship of the visual information, and the entity and the relationship of the social context information includes:
performing entity alignment on the entities of the external knowledge information, the entities of the text information, the entities of the visual information and the entities of the social context information by using an entity linking technique, so as to obtain entities of the external knowledge information, entities of the text information, entities of the visual information and entities of the social context information with unified entity names;
and generating the heterogeneous graph corresponding to the event information by using the entity and the relation of the external knowledge information with uniform entity names, the entity and the relation of the text information with uniform entity names, the entity and the relation of the visual information with uniform entity names and the entity and the relation of the social context information with uniform entity names.
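The alignment and merging steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the alias table `ALIASES` is a hypothetical stand-in for a real entity linking component, and the triples, relation labels and source names are invented for the example.

```python
from collections import defaultdict

# Hypothetical alias table standing in for an entity linking step.
ALIASES = {
    "star a": "Star A",
    "mr. a": "Star A",
    "a (actor)": "Star A",
}


def align(name):
    """Map a surface form to a unified entity name (unchanged if unknown)."""
    cleaned = name.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


def build_heterogeneous_graph(triples_by_source):
    """Merge (head, relation, tail) triples from several sources into one graph.

    Nodes record which sources mention them; edges keep relation and source.
    """
    nodes = defaultdict(set)  # unified entity name -> set of sources
    edges = set()             # (head, relation, tail, source)
    for source, triples in triples_by_source.items():
        for head, relation, tail in triples:
            h, t = align(head), align(tail)
            nodes[h].add(source)
            nodes[t].add(source)
            edges.add((h, relation, t, source))
    return dict(nodes), sorted(edges)


# Hypothetical extracted triples for one piece of event information.
triples_by_source = {
    "text": [("star a", "married", "Person B")],
    "visual": [("Mr. A", "appears_with", "Person B")],
    "social_context": [("User C", "reposted", "star a")],
    "external_knowledge": [("A (actor)", "spouse", "Person D")],
}

nodes, edges = build_heterogeneous_graph(triples_by_source)
```

After alignment, the three surface forms of Star A collapse into a single node that is connected to entities from all four sources, which is what makes the graph heterogeneous.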
According to an embodiment of the present disclosure, the extracting features from the entities of the heterogeneous graph to obtain the entity features of the entities of the heterogeneous graph includes:
acquiring the entities of the heterogeneous graph;
and processing the entities by using a pre-trained language model to obtain the entity features of the entities of the heterogeneous graph.
According to an embodiment of the present disclosure, the processing the heterogeneous graph and the entity features to obtain the multi-modal features of the event information and the event false judgment result includes:
processing the heterogeneous graph to obtain a heterogeneous graph adjacency matrix;
inputting the heterogeneous graph adjacency matrix and the entity features into a feature extraction module of a hierarchical attention model to obtain the multi-modal features;
and inputting the multi-modal features sequentially into a fully connected layer and a classification layer of the hierarchical attention model to obtain the event false judgment result.
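The data flow of these three steps can be sketched in plain Python. This is a heavily simplified stand-in, not the hierarchical attention model of the disclosure: it uses a single dot-product attention layer over graph neighbours, mean pooling, one fully connected layer and a softmax. The function names, weights `w` and `b`, toy features, and the output order `[p_false, p_true]` are all assumptions.

```python
import math


def adjacency_matrix(entities, edges):
    """Symmetric 0/1 adjacency matrix with self-loops, indexed by entity order."""
    index = {name: i for i, name in enumerate(entities)}
    n = len(entities)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for head, tail in edges:
        i, j = index[head], index[tail]
        adj[i][j] = adj[j][i] = 1.0
    return adj


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_layer(adj, feats):
    """Each node becomes an attention-weighted average of its neighbours' features."""
    out = []
    for i, row in enumerate(adj):
        nbrs = [j for j, a in enumerate(row) if a > 0]
        weights = softmax([dot(feats[i], feats[j]) for j in nbrs])
        dim = len(feats[i])
        out.append([sum(w * feats[j][d] for w, j in zip(weights, nbrs))
                    for d in range(dim)])
    return out


def classify(adj, feats, w, b):
    """Return (pooled multi-modal feature, [p_false, p_true]) for one event graph."""
    hidden = attention_layer(adj, feats)
    n = len(hidden)
    pooled = [sum(h[d] for h in hidden) / n for d in range(len(hidden[0]))]
    logits = [dot(row, pooled) + bias for row, bias in zip(w, b)]  # fully connected
    return pooled, softmax(logits)                                 # classification


# Toy inputs, for illustration only.
entities = ["Star A", "Person B", "User C"]
edges = [("Star A", "Person B"), ("User C", "Star A")]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [[0.5, -0.5], [-0.5, 0.5]]
b = [0.0, 0.0]

adj = adjacency_matrix(entities, edges)
feature, probs = classify(adj, feats, w, b)
```

In practice the weights would be learned and the attention would be applied at more than one level; the sketch only shows how the adjacency matrix restricts which entity features are mixed before classification.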
According to an embodiment of the present disclosure, the generating a time feature sequence based on the plurality of multi-modal features and the plurality of event false judgment results includes:
performing feature splicing on each multi-modal feature and the event false judgment result corresponding to that multi-modal feature to generate a local perception vector of the event information;
and sorting and combining the plurality of local perception vectors according to the respective release times of the plurality of event information to obtain the time feature sequence.
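The splicing and sorting described above amount to a concatenation followed by a sort on release time. The sketch below assumes each event is a dictionary with hypothetical keys `release_time`, `feature` and `judgment`; the vector sizes and values are invented for illustration.

```python
from datetime import datetime


def local_perception_vector(event):
    """Splice (concatenate) one event's multi-modal feature and judgment result."""
    return list(event["feature"]) + list(event["judgment"])


def time_feature_sequence(events):
    """Order events by release time, then emit one local perception vector each."""
    ordered = sorted(events, key=lambda e: e["release_time"])
    return [local_perception_vector(e) for e in ordered]


# Hypothetical per-event outputs; judgment is assumed to be [p_false, p_true].
events = [
    {"release_time": datetime(2022, 3, 2), "feature": [0.4, 0.1], "judgment": [0.2, 0.8]},
    {"release_time": datetime(2022, 3, 1), "feature": [0.9, 0.3], "judgment": [0.7, 0.3]},
    {"release_time": datetime(2022, 3, 3), "feature": [0.5, 0.5], "judgment": [0.6, 0.4]},
]

sequence = time_feature_sequence(events)
```

The resulting list of equal-length vectors is in a form that a sequence model could consume to produce the judgment for the subject entry as a whole.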
Another aspect of the present disclosure provides a multimodal false information detection apparatus, including:
an event information acquisition module, configured to acquire a plurality of event information about the same subject entry;
a heterogeneous graph generating module, configured to generate a heterogeneous graph corresponding to the event information according to external knowledge information and the event information;
an entity feature obtaining module, configured to extract features from the entities of the heterogeneous graph to obtain entity features of the entities of the heterogeneous graph;
an event false judgment result obtaining module, configured to process the heterogeneous graph and the entity features to obtain multi-modal features of the event information and an event false judgment result;
a time feature sequence generating module, configured to generate a time feature sequence based on the plurality of multi-modal features and the plurality of event false judgment results, where the time feature sequence includes a plurality of local perception vectors, the plurality of local perception vectors are ordered according to the respective release times of the plurality of event information, and each local perception vector includes the spliced multi-modal features and event false judgment result corresponding to the event information;
and a subject entry false judgment result obtaining module, configured to obtain a false judgment result about the subject entry according to the time feature sequence.
Another aspect of the present disclosure provides an electronic device including:
one or more processors;
a memory to store one or more instructions that,
wherein the one or more instructions, when executed by the one or more processors, cause the one or more processors to implement the method as described above.
According to the embodiments of the present disclosure, a plurality of event information about the same subject entry is acquired; a heterogeneous graph corresponding to the event information is generated according to external knowledge information and the event information; features are extracted from the entities of the heterogeneous graph to obtain entity features of the entities of the heterogeneous graph; and the heterogeneous graph and the entity features are processed to obtain multi-modal features of the event information and an event false judgment result. This realizes local perception of the multi-modal features and event false judgment results of the individual pieces of event information under the subject entry. A time feature sequence is then generated based on the plurality of multi-modal features and the plurality of event false judgment results, wherein the time feature sequence includes a plurality of local perception vectors, the plurality of local perception vectors are ordered according to the respective release times of the plurality of event information, and each local perception vector includes the spliced multi-modal features and event false judgment result corresponding to the event information. A false judgment result about the subject entry is obtained according to the time feature sequence, so that falseness is detected for the subject entry as a whole on the basis of the time-ordered sequence of its event information.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of the embodiments of the present disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically shows an exemplary system architecture to which a multimodal false information detection method may be applied, according to an embodiment of the present disclosure;
FIG. 2 schematically shows a flow chart of a multimodal false information detection method according to an embodiment of the present disclosure;
FIG. 3 schematically shows a block diagram of a multimodal false information detection apparatus according to an embodiment of the present disclosure; and
FIG. 4 schematically shows a block diagram of an electronic device adapted for the above described method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, and C" would include, but not be limited to, systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). Where a convention analogous to "at least one of A, B, or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, or C" would include, but not be limited to, systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.).
Traditional false news event information detection methods can only detect a single piece of news event information. Multiple pieces of news event information under the same subject entry are correlated with and influence one another, and when they are used to detect whether the subject entry is false, they contribute to the false judgment of the subject entry to different degrees, which makes joint modeling difficult. To solve these problems in detecting the falseness of a subject entry by using a plurality of news event information related to the subject entry, embodiments of the present disclosure provide a multi-modal false information detection method, apparatus and device.
An embodiment of the present disclosure provides a multi-modal false information detection method, which comprises: acquiring a plurality of event information about the same subject entry; generating a heterogeneous graph corresponding to the event information according to external knowledge information and the event information; extracting features from the entities of the heterogeneous graph to obtain entity features of the entities of the heterogeneous graph; processing the heterogeneous graph and the entity features to obtain multi-modal features of the event information and an event false judgment result; generating a time feature sequence based on the plurality of multi-modal features and the plurality of event false judgment results, wherein the time feature sequence comprises a plurality of local perception vectors, the plurality of local perception vectors are ordered according to the respective release times of the plurality of event information, and each local perception vector comprises the spliced multi-modal features and event false judgment result corresponding to the event information; and obtaining a false judgment result about the subject entry according to the time feature sequence.
According to the embodiments of the present disclosure, a plurality of event information about the same subject entry is acquired; a heterogeneous graph corresponding to the event information is generated according to external knowledge information and the event information; features are extracted from the entities of the heterogeneous graph to obtain entity features of the entities of the heterogeneous graph; and the heterogeneous graph and the entity features are processed to obtain multi-modal features of the event information and an event false judgment result. This realizes local perception of the multi-modal features and event false judgment results of the individual pieces of event information under the subject entry. A time feature sequence is then generated based on the plurality of multi-modal features and the plurality of event false judgment results, wherein the time feature sequence includes a plurality of local perception vectors, the plurality of local perception vectors are ordered according to the respective release times of the plurality of event information, and each local perception vector includes the spliced multi-modal features and event false judgment result corresponding to the event information. A false judgment result about the subject entry is obtained according to the time feature sequence, so that falseness is detected for the subject entry as a whole on the basis of the time-ordered sequence of its event information.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, application and other handling of the personal information of the users concerned all comply with the relevant laws and regulations, necessary confidentiality measures are taken, and public order and good morals are not violated.
In the technical scheme of the disclosure, before the personal information of the user is acquired or collected, the authorization or the consent of the user is acquired.
Fig. 1 schematically shows an exemplary system architecture 100 to which a multimodal false information detection method may be applied, according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the system architecture 100 according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 is used to provide a medium of communication links between the first terminal device 101, the second terminal device 102, the third terminal device 103 and the server 105. Network 104 may include various connection types, such as wired and/or wireless communication links, and so forth.
The user may interact with the server 105 via the network 104 using the first terminal device 101, the second terminal device 102, the third terminal device 103, to receive or send messages, etc. The terminal devices, i.e., the first terminal device 101, the second terminal device 102, and the third terminal device 103, may have various communication client applications installed thereon, such as a shopping application, a web browser application, a search application, an instant messaging tool, a mailbox client, and/or social platform software (for example only).
The first terminal device 101, the second terminal device 102, and the third terminal device 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by the user using the first terminal device 101, the second terminal device 102, and the third terminal device 103. The backend management server may analyze and process the received data such as the user request, and feed back a processing result (for example, a web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the multi-modal false information detection method provided by the embodiment of the present disclosure can be generally executed by the server 105. Accordingly, the multi-modal false information detection apparatus provided by the embodiment of the present disclosure can be generally disposed in the server 105. The multi-modal false information detection method provided by the embodiment of the present disclosure may also be executed by a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105. Accordingly, the multi-modal false information detection apparatus provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103 and/or the server 105.
For example, the subject term and a plurality of event information of the same subject term may be stored in any one of the first terminal device 101, the second terminal device 102, or the third terminal device 103 (for example, the first terminal device 101, but not limited thereto), or may be stored on an external storage device and imported into the first terminal device 101. The first terminal device 101 may then send the subject term and the plurality of event information of the same subject term to the server 105, and the server 105, having received them, performs the multi-modal false information detection method provided by the embodiment of the present disclosure.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 schematically shows a flow chart of a multimodal false information detection method according to an embodiment of the present disclosure.
As shown in fig. 2, the method includes operations S201 to S207.
In operation S201, a plurality of event information about the same subject term is acquired.
According to embodiments of the present disclosure, the plurality of event information may characterize a plurality of multimodal information related to the same subject term. The multimodal information may include: text information, image information, video information, audio information, and the like.
According to an embodiment of the present disclosure, because the plurality of event information relate to the same subject term and partly overlap in content, the pieces of event information under the same subject term are correlated with and influence one another, and can be used together to judge whether the subject term is false. The plurality of event information may be, for example, a plurality of news items under the same subject term.
According to an embodiment of the present disclosure, the subject term may be, for example, "Star A married". Searching with this subject term may return 3 news items related to "Star A married", and each news item may include a photograph of Star A and a textual description of Star A's marriage.
In operation S202, a heterogeneous graph corresponding to the event information is generated according to the external knowledge information and the event information.
According to the embodiment of the disclosure, the external knowledge information is external data related to the subject term, which can be acquired from different media platforms.
According to the embodiment of the disclosure, the entity and the relationship may be extracted from the external knowledge information and the event information, respectively, and then the heterogeneous graph corresponding to the event information may be generated according to the entity and the relationship extracted from the external knowledge information and the event information.
In operation S203, features are extracted from the entities of the heterogeneous graph, resulting in entity features of the entities of the heterogeneous graph.
According to the embodiment of the disclosure, the same feature extraction method can be used for respectively extracting the features of the plurality of entities in the heterogeneous graph to obtain the entity features of the entities in the heterogeneous graph.
According to an embodiment of the present disclosure, the feature extraction method for extracting features from the entities of the heterogeneous graph may be, for example, a pre-trained language model (BERT), a residual neural network (RESNET50) model, a histogram-based decision tree (LGB) model, or a Visual Geometry Group network (VGG-16) model.
In operation S204, the heterogeneous graph and the entity features are processed to obtain multi-modal features of the event information and an event false recognition result.
According to an embodiment of the present disclosure, the multi-modal features characterize the fusion of the multi-modal information of the event information with the external knowledge information.
According to an embodiment of the present disclosure, the event false recognition result characterizes whether the event information is true or false. The event false recognition result may be represented by 1 in the case where the event information is judged to be true, and by 0 in the case where the event information is judged to be false.
According to the embodiment of the disclosure, the heterogeneous graph can be converted into adjacency matrix form, the adjacency matrix and the entity features can then be input into a suitable text classification algorithm, and the text classification algorithm outputs the multi-modal features of the event information and the event false recognition result.
According to an embodiment of the present disclosure, the text classification algorithm may be, for example, a neural-network-based algorithm, such as a Deep Neural Network (DNN) model, a Deep Belief Network (DBN) model, a Hierarchical Attention Network (HAN) model, or a combination thereof. The embodiment of the disclosure does not limit the specific text classification algorithm, which can be selected according to the actual situation.
In operation S205, a temporal feature sequence is generated based on the plurality of multi-modal features and the plurality of event false recognition results, wherein the temporal feature sequence includes a plurality of local perception vectors, the local perception vectors are ordered according to the respective release times of the event information, and each local perception vector includes the spliced multi-modal features and event false recognition result corresponding to one piece of event information.
In operation S206, a false recognition result for the subject term is obtained based on the temporal feature sequence.
According to the embodiment of the disclosure, the temporal feature sequence can be input into a bidirectional long short-term memory network model with attention (Bi-directional Long Short-Term Memory-Attention, BI-LSTM-ATTENTION), and the BI-LSTM-ATTENTION model outputs the false recognition result of the subject term.
According to the embodiment of the disclosure, the temporal feature sequence, which is generated from the multi-modal features and the event false recognition results of the plurality of event information under the subject term, provides a global representation of the subject term. Because the false recognition result of the subject term is obtained from this sequence rather than from any single piece of event information, the accuracy of false detection of the subject term is improved.
According to an embodiment of the present disclosure, for operation S202 shown in fig. 2, generating a heterogeneous graph corresponding to event information according to external knowledge information and event information may include the following operations:
obtaining multi-modal information of the event information, the multi-modal information including at least two of: text information in the event information, visual information in the event information, and social context information related to the event information;
extracting entities and relations of text information from the text information of the multimodal information, entities and relations of visual information from the visual information, and entities and relations of social context information from the social context information by using a preset extraction rule;
obtaining entities and relations of external knowledge information related to the subject term according to the external knowledge information;
and generating a heterogeneous graph corresponding to the event information according to the entity and the relation of the external knowledge information, the entity and the relation of the text information, the entity and the relation of the visual information and the entity and the relation of the social context information.
According to an embodiment of the present disclosure, the text information represents the textual description included in the event information. The visual information characterizes the information in the images and videos that the event information contains. The social context information related to the event information includes basic attribute information and behavior attribute information. The basic attribute information includes gender information and region information of the publisher of the event information. The behavior attribute information includes the publisher's follow behavior information and the publisher's forwarding behavior information for the event information.
According to the embodiment of the disclosure, a heterogeneous graph can be constructed for each event information in a plurality of event information of the subject term, and each heterogeneous graph comprises multi-modal information of the event information and external knowledge information.
According to an embodiment of the present disclosure, a heterogeneous graph may be represented by G = (V, E), where V is a set of nodes in the heterogeneous graph and E is a set of edges in the heterogeneous graph. Nodes correspond to entities and edges correspond to relationships. The entities comprise entities of text information, entities of visual information, entities of social context information and entities of external knowledge information. The relationships include those of textual information, those of visual information, those of social context information, and those of external knowledge information.
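As an illustration of the G = (V, E) representation, the following minimal Python sketch stores typed entities as nodes and typed relations as edges. The class name `HeteroGraph`, the entity type labels, and the example entities and relations are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class HeteroGraph:
    """G = (V, E): nodes are entities, edges are relations (illustrative only)."""
    nodes: dict = field(default_factory=dict)  # entity name -> entity type
    edges: set = field(default_factory=set)    # (head, relation, tail) triples

    def add_entity(self, name, entity_type):
        self.nodes[name] = entity_type

    def add_relation(self, head, relation, tail):
        # Both endpoints must already be entities of the graph.
        if head not in self.nodes or tail not in self.nodes:
            raise KeyError("both endpoints must be added as entities first")
        self.edges.add((head, relation, tail))


# Hypothetical entities from the four sources named in the text.
g = HeteroGraph()
g.add_entity("news 1", "text")
g.add_entity("star A", "text")
g.add_entity("photo 1", "visual")
g.add_entity("user 1", "social_context")
g.add_entity("star A profile", "external_knowledge")

g.add_relation("user 1", "publishes", "news 1")
g.add_relation("news 1", "mentions", "star A")
g.add_relation("photo 1", "depicts", "star A")
g.add_relation("star A", "described_by", "star A profile")
```

Keeping the entity type on each node matters later, because the type determines which transformation is applied when node features are mapped into a common space.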
According to the embodiment of the disclosure, in the process of extracting the entities and the relations from the multimodal information by using the preset extraction rules, different extraction rules can be selected according to different information types in the multimodal information.
According to the embodiment of the disclosure, the social context information may include three types of relationships: the user-publishes-news relationship, the user-propagates-news relationship, and the user-follows-user relationship.
At present, existing techniques for detecting false event information usually judge whether the detected event information is false on the basis of a single modality related to that event information, and therefore suffer from the loss of information carried by the other modalities.
According to the embodiment of the disclosure, multi-modal information of the event information is acquired, the multi-modal information including at least two of: text information in the event information, visual information in the event information, and social context information related to the event information. Different types of information about the event information are thereby obtained and can be used for the subsequent false detection of the subject term.
According to the embodiment of the disclosure, when the heterogeneous graph corresponding to the event information is generated, the entities and relations of the external knowledge information supply additional entities and relations for the event information and thus enrich the entities and relations of the event information.
According to the embodiment of the disclosure, because the heterogeneous graph is generated from the entities and relations of the external knowledge information, of the text information, of the visual information, and of the social context information, a single graph can represent the various types of entities and relations related to the event information. This simplifies the representation of those entities and relations and prepares the event information for subsequent processing.
According to an embodiment of the present disclosure, extracting entities and relationships of text information from text information of multimodal information, extracting entities and relationships of visual information from visual information, and extracting entities and relationships of social context information from social context information using a preset extraction rule includes:
extracting entities of text information from the text information and entities of social context information from the social context information by using a BERT model;
extracting entities of the visual information from the visual information by using the RESNET50 model;
extracting the relation of the text information between the entities of the text information from the text information by utilizing a Piecewise Convolutional Neural Network (PCNN) model according to the entities of the text information and the text information;
extracting the relation of social context information between the entities of the social context information from the social context information by utilizing a PCNN model according to the entities of the social context information and the social context information;
and extracting the relationship of the visual information between the visual information entities from the visual information by utilizing the PCNN model according to the visual information entities and the visual information.
According to the embodiment of the disclosure, for example, the entity of the text information and the sentence in the text information corresponding to the entity of the text information may be input into the PCNN model, and the relationship of the text information between the entities of the text information may be extracted from the sentence in the text information corresponding to the entity of the text information by using the PCNN model. Similarly, the relationship of the social context information and the relationship of the visual information can be obtained.
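The disclosure does not detail the PCNN internals. As PCNN is commonly described, the convolution outputs of a sentence are split into three segments by the positions of the two entities and each segment is max-pooled separately. The sketch below shows only that pooling step for a single filter; the function name, the segment boundary convention, and the sample values are assumptions made for illustration.

```python
def piecewise_max_pool(feature_map, head_pos, tail_pos):
    """Max-pool one convolution filter's outputs over three segments.

    feature_map: list of floats, one value per token position.
    head_pos, tail_pos: token indices of the two entities (head_pos < tail_pos).
    The boundary convention (each entity closes its segment) is an assumption.
    """
    if not 0 <= head_pos < tail_pos < len(feature_map):
        raise ValueError("expected 0 <= head_pos < tail_pos < len(feature_map)")
    segments = [
        feature_map[: head_pos + 1],
        feature_map[head_pos + 1 : tail_pos + 1],
        feature_map[tail_pos + 1 :],
    ]
    # An empty trailing segment (second entity is the last token) pools to 0.0.
    return [max(segment) if segment else 0.0 for segment in segments]


# Hypothetical filter outputs for a six-token sentence, entities at tokens 1 and 3.
pooled = piecewise_max_pool([0.1, 0.7, 0.3, 0.9, 0.2, 0.4], head_pos=1, tail_pos=3)
```

In a full model, the pooled values of all filters would be concatenated and passed to a classifier over relation types.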
According to an embodiment of the present disclosure, obtaining the entity and the relationship of the external knowledge information related to the subject term according to the external knowledge information includes:
constructing a plurality of knowledge graphs of different types according to the external knowledge information, wherein the external knowledge information is acquired from open source information, the external knowledge information comprises information related to the subject term, and the knowledge graphs comprise at least two of the following: domain knowledge graphs, general knowledge graphs, physical knowledge graphs and spatio-temporal knowledge graphs;
and acquiring the entities and relations of the external knowledge information from the plurality of knowledge graphs.
According to an embodiment of the present disclosure, the external knowledge information is external data related to the subject term, which may be acquired from different media platforms. The external data may include, among other things, domain data, generic data, event evolution data, and spatio-temporal data. Different types of knowledge-maps may be obtained based on different types of data in the external data.
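One way to picture the acquisition of entities and relations from several knowledge graphs is the sketch below, which assumes each graph is already available as a list of (head, relation, tail) triples. The function name, the selection rule (keep triples that touch an entity associated with the subject term), and the sample triples are hypothetical.

```python
def collect_external_knowledge(knowledge_graphs, seed_entities):
    """Return entities and relations from all graphs that touch a seed entity.

    knowledge_graphs: dict mapping a graph type to a list of (head, relation, tail).
    seed_entities: entity names associated with the subject term.
    """
    seeds = set(seed_entities)
    entities, relations = set(), []
    for graph_type, triples in knowledge_graphs.items():
        for head, relation, tail in triples:
            if head in seeds or tail in seeds:
                entities.update((head, tail))
                # Keep the source graph type so later steps can tell graphs apart.
                relations.append((head, relation, tail, graph_type))
    return entities, relations


# Hypothetical triples from two graph types.
graphs = {
    "general": [("star A", "occupation", "actor"),
                ("city X", "located_in", "country Y")],
    "spatio_temporal": [("star A", "appeared_at", "city X")],
}
entities, relations = collect_external_knowledge(graphs, ["star A"])
```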
At present, in the related art, methods that detect false event information with the help of external knowledge information rely on only a single knowledge graph related to that external knowledge information. A single knowledge graph provides a limited amount of knowledge, weak associations between pieces of knowledge, and insufficient knowledge context.
According to the embodiment of the disclosure, a plurality of knowledge graphs of different types are constructed according to the external knowledge information, so that the multiple types of knowledge they contain can be used for the subsequent false detection of the subject term. Because the knowledge contained in the plurality of knowledge graphs is richer, the accuracy of false detection of the subject term can be improved.
According to an embodiment of the present disclosure, generating a heterogeneous graph corresponding to event information according to an entity and a relationship of external knowledge information, an entity and a relationship of text information, an entity and a relationship of visual information, and an entity and a relationship of social context information includes:
performing entity alignment on the entities of the external knowledge information, the entities of the text information, the entities of the visual information and the entities of the social context information by utilizing an entity linking technique, so as to obtain entities of the external knowledge information, entities of the text information, entities of the visual information and entities of the social context information with unified entity names;
and generating a heterogeneous graph corresponding to the event information by utilizing the entity and the relationship of the external knowledge information with unified entity names, the entity and the relationship of the text information with unified entity names, the entity and the relationship of the visual information with unified entity names and the entity and the relationship of the social context information with unified entity names.
According to the embodiment of the disclosure, for example, an entity linking method of knowledge base retrieval may be adopted to perform entity alignment on an entity of external knowledge information, an entity of text information, an entity of visual information and an entity of social context information, so as to obtain the entity of external knowledge information, the entity of text information, the entity of visual information and the entity of social context information with unified entity names.
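The effect of entity alignment can be sketched as a lookup from each extracted mention to one unified name. In the sketch below a small alias table stands in for the knowledge-base retrieval step; the function name, the table, and the mentions are hypothetical, and a real entity linker would also rank candidates by context.

```python
def align_entities(mentions, alias_table):
    """Map each extracted mention to a unified entity name.

    alias_table: dict from a lower-cased alias to its canonical name
    (a hypothetical stand-in for retrieval from a knowledge base).
    Mentions without a match keep their original surface form.
    """
    return {m: alias_table.get(m.strip().lower(), m) for m in mentions}


alias_table = {"star a": "Star A", "a (actor)": "Star A"}
aligned = align_entities(["star A", "A (actor)", "City X"], alias_table)
```

After alignment, nodes that refer to the same real-world entity share one name, so the text, visual, social context, and external knowledge parts of the heterogeneous graph connect through common nodes.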
According to an embodiment of the present disclosure, with respect to operation S203 shown in fig. 2, extracting features from the entities of the heterogeneous graph to obtain entity features of the entities of the heterogeneous graph may include the following operations:
acquiring an entity of the heterogeneous graph;
and processing the entity by using a BERT model to obtain the entity characteristics of the entity of the heterogeneous graph.
According to an embodiment of the present disclosure, for operation S204 shown in fig. 2, processing the heterogeneous graph and the entity feature to obtain the multi-modal feature of the event information and the event false recognition result may include the following operations:
processing the heterogeneous graph to obtain a heterogeneous graph adjacency matrix;
inputting the heterogeneous graph adjacency matrix and the entity features into a feature extraction module of a Hierarchical Attention Network (HAN) model to obtain the multi-modal features; and inputting the multi-modal features sequentially into a fully connected layer and a classification layer of the HAN model to obtain the event false recognition result.
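The first of these operations, turning the heterogeneous graph into an adjacency matrix, can be sketched as follows. Treating edges as undirected and adding self-loops are assumptions of this sketch, as are the function name and the sample nodes; a separate matrix per meta-path could be built in the same way.

```python
def adjacency_matrix(nodes, edges):
    """Build a symmetric 0/1 adjacency matrix with self-loops.

    nodes: ordered list of entity names.
    edges: iterable of (head, tail) pairs between those entities.
    """
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1  # self-loop so a node also attends to itself
    for head, tail in edges:
        i, j = index[head], index[tail]
        matrix[i][j] = matrix[j][i] = 1
    return matrix


adj = adjacency_matrix(
    ["user 1", "news 1", "star A"],
    [("user 1", "news 1"), ("news 1", "star A")],
)
```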
According to embodiments of the present disclosure, the HAN model is a hierarchical attention network that contains node-level attention and semantic-level attention. The HAN model learns the weight of neighbors based on meta-paths through node-level attention, obtains node features of specific semantics in an aggregation mode, and obtains the optimal weighted combination of nodes by weighting all meta-paths through semantic-level attention. The specific learning process of the HAN model is as follows.
For each type of node (type $\phi_i$), a type-specific transformation matrix $M_{\phi_i}$ is designed to map the features of different types of nodes into the same feature space; the mapping can be expressed by formula (2). The type of a node is the type of the entity corresponding to the node, and the type of an entity is determined by the extraction algorithm used when the entities are extracted from the event information and the external knowledge information.
$$h_i' = M_{\phi_i} \cdot h_i \tag{2}$$
where $h_i$ denotes the original feature of node $i$ and $h_i'$ denotes the feature of node $i$ after mapping. The original feature of node $i$ may be a knowledge representation vector of node $i$.
The weights of the different types of nodes are learned using a self-attention mechanism. For a node pair $(i, j)$ connected via a meta-path $\Phi$, the node-level attention $e_{ij}^{\Phi}$ can be expressed by formula (3).
$$e_{ij}^{\Phi} = att_{node}(h_i', h_j'; \Phi) \tag{3}$$
where $att_{node}$ denotes the deep neural network that performs node-level attention and is shared by all node pairs under a given meta-path $\Phi$, $e_{ij}^{\Phi}$ represents the importance of node $j$ to node $i$ under meta-path $\Phi$, and $h_j'$ denotes the feature of node $j$ after mapping.
The node-level attention network computes the attention of all neighbors $j$ of node $i$ based on meta-path $\Phi$, where $j \in N_i^{\Phi}$. The softmax function is used to normalize $e_{ij}^{\Phi}$, and the normalized weight coefficient $\alpha_{ij}^{\Phi}$ can be expressed by formula (4).
$$\alpha_{ij}^{\Phi} = \mathrm{softmax}_j\left(e_{ij}^{\Phi}\right) = \frac{\exp\left(\sigma\left(a_{\Phi}^{T} \cdot [h_i' \,\|\, h_j']\right)\right)}{\sum_{k \in N_i^{\Phi}} \exp\left(\sigma\left(a_{\Phi}^{T} \cdot [h_i' \,\|\, h_k']\right)\right)} \tag{4}$$
where $N_i^{\Phi}$ denotes all neighbor nodes of node $i$ on meta-path $\Phi$, $\sigma$ denotes an activation function, $\|$ denotes the concatenation operation, and $a_{\Phi}$ denotes the node-level attention vector of meta-path $\Phi$.
The feature $z_i^{\Phi}$ of node $i$ based on meta-path $\Phi$ is obtained by weighting the mapped features of its neighbors with the corresponding normalized weight coefficients $\alpha_{ij}^{\Phi}$ and summing them; $z_i^{\Phi}$ can be expressed by formula (5).
$$z_i^{\Phi} = \sigma\left(\sum_{j \in N_i^{\Phi}} \alpha_{ij}^{\Phi} \cdot h_j'\right) \tag{5}$$
With a multi-head attention mechanism, the node-level attention is repeated $K$ times and the results are concatenated, which can be expressed as:
$$z_i^{\Phi} = \Big\Vert_{k=1}^{K} \, \sigma\left(\sum_{j \in N_i^{\Phi}} \alpha_{ij}^{\Phi} \cdot h_j'\right) \tag{6}$$
Given a meta-path set $\{\Phi_1, \Phi_2, \ldots, \Phi_P\}$, $P$ groups of node features $\{Z_{\Phi_1}, Z_{\Phi_2}, \ldots, Z_{\Phi_P}\}$ are obtained after node-level attention.
Semantic-level attention is used to learn the importance of the different meta-paths, and the learned weights can be expressed as:
$$\left(\beta_{\Phi_1}, \beta_{\Phi_2}, \ldots, \beta_{\Phi_P}\right) = att_{sem}\left(Z_{\Phi_1}, Z_{\Phi_2}, \ldots, Z_{\Phi_P}\right) \tag{7}$$
where $att_{sem}$ denotes the deep neural network that performs semantic-level attention, $P$ denotes the number of groups of node features (one group per meta-path), and $\beta_{\Phi_i}$ denotes the weight of each meta-path.
To learn the importance of each meta-path, the features of each node under the meta-path are passed through a nonlinear transformation layer. The importance $w_{\Phi_i}$ of meta-path $\Phi_i$ can be expressed by formula (8).
$$w_{\Phi_i} = \frac{1}{|V|} \sum_{v \in V} q^{T} \cdot \tanh\left(W \cdot z_v^{\Phi_i} + b\right) \tag{8}$$
where $W$ denotes a weight matrix, $b$ denotes a bias vector, $q$ denotes the semantic-level attention vector, and $|V|$ denotes the number of nodes under meta-path $\Phi_i$.
After the importance values are obtained, they are normalized by a softmax function. The normalized weight $\beta_{\Phi_i}$ of meta-path $\Phi_i$ can be expressed by formula (9).
$$\beta_{\Phi_i} = \frac{\exp\left(w_{\Phi_i}\right)}{\sum_{p=1}^{P} \exp\left(w_{\Phi_p}\right)} \tag{9}$$
The normalized meta-path weights are used to form a weighted sum of the node-level features, giving the final feature $Z$, which can be expressed by formula (10).
$$Z = \sum_{i=1}^{P} \beta_{\Phi_i} \cdot Z_{\Phi_i} \tag{10}$$
According to embodiments of the present disclosure, feature Z may be treated as a multi-modal feature.
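The two attention levels described above can be illustrated with a small pure-Python sketch: node-level attention scores each neighbor, normalizes the scores with softmax, and aggregates the neighbor features, and semantic-level attention weights the per-meta-path results and sums them into the feature Z. All weights and inputs below are hypothetical, the leaky ReLU and tanh activations are assumptions (the text only says "an activation function"), and a single attention head is used.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def node_level(features, neighbors, attn_vec):
    """Aggregate each node's neighbors under one meta-path.

    features: mapped node features (lists of equal length).
    neighbors: neighbors[i] lists the meta-path neighbors of node i.
    attn_vec: node-level attention vector, twice the feature length.
    """
    out = []
    for i, h_i in enumerate(features):
        scores = [
            leaky_relu(sum(a * x for a, x in zip(attn_vec, h_i + features[j])))
            for j in neighbors[i]
        ]
        alphas = softmax(scores)
        agg = [
            sum(alpha * features[j][d] for alpha, j in zip(alphas, neighbors[i]))
            for d in range(len(h_i))
        ]
        out.append([math.tanh(v) for v in agg])  # activation is an assumption
    return out


def semantic_level(per_path, weight, bias, query):
    """Fuse per-meta-path node features into the final feature Z."""
    importances = []
    for z_path in per_path:
        total = 0.0
        for z in z_path:
            hidden = [
                math.tanh(sum(w * x for w, x in zip(row, z)) + b)
                for row, b in zip(weight, bias)
            ]
            total += sum(q * h for q, h in zip(query, hidden))
        importances.append(total / len(z_path))  # average over nodes
    betas = softmax(importances)
    n_nodes, dim = len(per_path[0]), len(per_path[0][0])
    fused = [
        [sum(beta * z_path[i][d] for beta, z_path in zip(betas, per_path))
         for d in range(dim)]
        for i in range(n_nodes)
    ]
    return betas, fused


# Hypothetical three-node graph with two meta-paths.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attn = [0.5, -0.5, 0.25, 0.25]
z_a = node_level(features, [[0, 1], [0, 1, 2], [1, 2]], attn)
z_b = node_level(features, [[0, 2], [1], [0, 2]], attn)
betas, Z = semantic_level([z_a, z_b], weight=[[0.3, -0.2], [0.1, 0.4]],
                          bias=[0.0, 0.1], query=[1.0, -1.0])
```

In a trained model the attention vectors, weight matrix, bias, and query vector are learned; here they are fixed only so that the data flow can be followed.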
False detection is performed on the event information based on the final feature Z: the final feature Z is input into the softmax layer of the HAN model, and the HAN model outputs the category R of the event information, that is, the false recognition result of the event information. When the event information is false, the category R of the event information is represented by 0, and when the event information is true, the category R of the event information is represented by 1.
According to an embodiment of the present disclosure, for operation S205 shown in fig. 2, generating the temporal feature sequence based on the plurality of multi-modal features and the plurality of event false recognition results may include the following operations:
performing feature splicing on the multi-modal features and event false identification results corresponding to the multi-modal features to generate local perception vectors of event information;
and sequencing and combining the local sensing vectors according to the respective issuing moments of the event information to obtain a time characteristic sequence.
According to embodiments of the present disclosure, for example, the subject term includes $n$ pieces of event information, and $\gamma_i$ denotes the local perception vector of the $i$-th piece of event information. The local perception vectors are sorted and combined according to the respective release times of the event information to obtain the temporal feature sequence $\{\gamma_1, \gamma_2, \ldots, \gamma_n\}$.
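The splicing and ordering steps can be sketched in a few lines of Python. The dictionary keys, the integer timestamps, and the feature values are hypothetical; each local perception vector here is simply the multi-modal feature with the 0/1 recognition result appended.

```python
def build_time_feature_sequence(events):
    """Return local perception vectors ordered by release time.

    events: list of dicts with keys 'published_at', 'features' and 'label',
    where 'label' is the 0/1 event false recognition result.
    """
    ordered = sorted(events, key=lambda e: e["published_at"])
    return [list(e["features"]) + [float(e["label"])] for e in ordered]


# Hypothetical events, deliberately given out of chronological order.
events = [
    {"published_at": 3, "features": [0.2, 0.8], "label": 1},
    {"published_at": 1, "features": [0.4, 0.6], "label": 0},
    {"published_at": 2, "features": [0.9, 0.1], "label": 1},
]
sequence = build_time_feature_sequence(events)
```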
According to the embodiment of the present disclosure, the temporal feature sequence $\{\gamma_1, \gamma_2, \ldots, \gamma_n\}$ may be input into the BI-LSTM-ATTENTION model. This forms a joint model that represents the subject term globally and enhances the local perception information in both directions, and the false detection result of the subject term is output by this model.
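The data flow of a bidirectional recurrent encoder followed by attention pooling can be sketched as below. This is not the BI-LSTM-ATTENTION model itself: a plain tanh recurrent cell stands in for the LSTM cell to keep the example short, the weights are random and untrained, and all names are hypothetical.

```python
import math
import random


def run_cell(sequence, w_in, w_rec):
    """Run a tanh recurrent cell over the sequence; return every hidden state."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in sequence:
        hidden = [
            math.tanh(sum(wi * xi for wi, xi in zip(w_in[k], x))
                      + sum(wr * h for wr, h in zip(w_rec[k], hidden)))
            for k in range(len(w_rec))
        ]
        states.append(hidden)
    return states


def classify_sequence(sequence, params):
    forward = run_cell(sequence, params["w_in_f"], params["w_rec_f"])
    backward = run_cell(sequence[::-1], params["w_in_b"], params["w_rec_b"])[::-1]
    # Concatenate the two directions at every time step.
    states = [f + b for f, b in zip(forward, backward)]
    # Attention pooling: score each step, normalize, take the weighted sum.
    scores = [sum(q * s for q, s in zip(params["query"], st)) for st in states]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    weights = [e / sum(exps) for e in exps]
    context = [sum(w * st[d] for w, st in zip(weights, states))
               for d in range(len(states[0]))]
    logit = sum(o * c for o, c in zip(params["w_out"], context))
    return 1.0 / (1.0 + math.exp(-logit)), weights


def random_params(input_dim, hidden_dim, seed=0):
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_in_f": matrix(hidden_dim, input_dim),
        "w_rec_f": matrix(hidden_dim, hidden_dim),
        "w_in_b": matrix(hidden_dim, input_dim),
        "w_rec_b": matrix(hidden_dim, hidden_dim),
        "query": matrix(1, 2 * hidden_dim)[0],
        "w_out": matrix(1, 2 * hidden_dim)[0],
    }


# Three hypothetical local perception vectors in release order.
sequence = [[0.4, 0.6, 0.0], [0.9, 0.1, 1.0], [0.2, 0.8, 1.0]]
probability, weights = classify_sequence(sequence, random_params(3, 4))
```

With trained weights, the returned probability would be thresholded to give the 0/1 result for the subject term, and the attention weights indicate how much each piece of event information contributed.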
It should be noted that, unless an execution order between different operations is explicitly stated or is required by the technical implementation, the operations in the flowcharts of the embodiments of the present disclosure need not be executed in the order shown, and multiple operations may be executed at the same time.
Fig. 3 schematically shows a block diagram of a multimodal false information detection apparatus according to an embodiment of the present disclosure.
As shown in fig. 3, the multi-modal false information detection apparatus 300 includes a multiple event information obtaining module 310, a heterogeneous graph generating module 320, an entity feature obtaining module 330, an event false recognition result obtaining module 340, a temporal feature sequence generating module 350, and a subject term false recognition result obtaining module 360.
The multiple event information obtaining module 310 is configured to acquire a plurality of event information about the same subject term.
The heterogeneous graph generating module 320 is configured to generate a heterogeneous graph corresponding to the event information according to the external knowledge information and the event information.
The entity feature obtaining module 330 is configured to extract features from the entities of the heterogeneous graph to obtain entity features of the entities of the heterogeneous graph.
The event false recognition result obtaining module 340 is configured to process the heterogeneous graph and the entity features to obtain multi-modal features of the event information and an event false recognition result.
The temporal feature sequence generating module 350 is configured to generate a temporal feature sequence based on a plurality of multi-modal features and a plurality of event false recognition results, where the temporal feature sequence includes a plurality of local perception vectors, the plurality of local perception vectors are ordered according to the respective release times of the plurality of event information, and each local perception vector includes the spliced multi-modal features and event false recognition result corresponding to one piece of event information.
The subject term false recognition result obtaining module 360 is configured to obtain a false recognition result about the subject term according to the temporal feature sequence.
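The way the six modules hand data to one another can be sketched as a chain of callables. The function name, the parameter names, and the stub implementations in the usage example are hypothetical placeholders; they only show the order of the calls, not how any module works internally.

```python
def detect_subject_term(subject_term, acquire, build_graph, extract_features,
                        recognize_event, build_sequence, recognize_subject):
    """Chain the six stages; every callable is supplied by the caller."""
    per_event = []
    for event in acquire(subject_term):
        graph = build_graph(event)
        entity_features = extract_features(graph)
        multimodal, result = recognize_event(graph, entity_features)
        per_event.append((event, multimodal, result))
    return recognize_subject(build_sequence(per_event))


# Usage with trivial stubs standing in for the real modules.
verdict = detect_subject_term(
    "star A married",
    acquire=lambda term: [{"time": 2, "text": term}, {"time": 1, "text": term}],
    build_graph=lambda event: {"nodes": [event["text"]]},
    extract_features=lambda graph: [[1.0] for _ in graph["nodes"]],
    recognize_event=lambda graph, feats: (feats[0], 1),
    build_sequence=lambda items: [
        m + [float(r)] for e, m, r in sorted(items, key=lambda t: t[0]["time"])
    ],
    recognize_subject=lambda seq: int(sum(v[-1] for v in seq) / len(seq) >= 0.5),
)
```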
According to the embodiment of the disclosure, the heterogeneous graph generation module comprises a multi-mode information acquisition sub-module, an entity and relationship extraction sub-module related to multi-mode information, an entity and relationship acquisition sub-module of external knowledge information and a heterogeneous graph generation sub-module.
The multi-modal information acquisition sub-module is used for acquiring multi-modal information of the event information, and the multi-modal information comprises at least two of the following items: text information in the event information, visual information in the event information, and social context information related to the event information.
The entity and relationship extraction sub-module related to multi-modal information is used for extracting, by utilizing a preset extraction rule, the entities and relations of the text information from the text information of the multi-modal information, the entities and relations of the visual information from the visual information, and the entities and relations of the social context information from the social context information.
The entity and relationship obtaining sub-module of the external knowledge information is used for obtaining the entities and relations of the external knowledge information related to the subject term according to the external knowledge information.
The heterogeneous graph generation sub-module is used for generating a heterogeneous graph corresponding to the event information according to the entities and relations of the external knowledge information, the entities and relations of the text information, the entities and relations of the visual information and the entities and relations of the social context information.
According to an embodiment of the disclosure, the entity and relationship extraction submodule related to multimodal information includes a text and social context information entity extraction unit, a visual information entity extraction unit, a text information relationship extraction unit, a social context information relationship extraction unit, and a visual information relationship extraction unit.
The text and social context information entity extraction unit is used for extracting the entities of the text information from the text information and the entities of the social context information from the social context information by utilizing the pre-trained language model.
The visual information entity extraction unit is used for extracting the entities of the visual information from the visual information by using the residual neural network model.
The text information relation extraction unit is used for extracting the relation of the text information between the entities of the text information from the text information by utilizing a piecewise convolutional neural network model according to the entities of the text information and the text information.
The social context information relation extraction unit is used for extracting the relation of the social context information between the entities of the social context information from the social context information by utilizing a piecewise convolutional neural network model according to the entities of the social context information and the social context information.
The visual information relation extraction unit is used for extracting the relation of the visual information between the entities of the visual information from the visual information by utilizing a piecewise convolutional neural network model according to the entities of the visual information and the visual information.
According to the embodiment of the disclosure, the entity and relationship obtaining submodule of the external knowledge information includes a knowledge graph construction unit and an entity and relationship obtaining unit of the external knowledge information.
The knowledge graph construction unit is used for constructing a plurality of knowledge graphs of different types according to the external knowledge information, wherein the external knowledge information is acquired from open source information, the external knowledge information comprises information related to the subject term, and the knowledge graphs comprise at least two of the following items: domain knowledge graphs, general knowledge graphs, physical knowledge graphs, and spatio-temporal knowledge graphs.
The entity and relationship obtaining unit of the external knowledge information is used for obtaining the entities and relations of the external knowledge information from the plurality of knowledge graphs.
According to the embodiment of the disclosure, the heterogeneous graph generation submodule comprises an entity obtaining unit with unified entity names and a heterogeneous graph generation unit.
The entity obtaining unit with unified entity names is used for performing entity alignment on the entities of the external knowledge information, the entities of the text information, the entities of the visual information and the entities of the social context information by utilizing an entity linking technique, so as to obtain entities of the external knowledge information, entities of the text information, entities of the visual information and entities of the social context information with unified entity names.
The heterogeneous graph generating unit is used for generating a heterogeneous graph corresponding to the event information by utilizing the entities and relations of the external knowledge information with unified entity names, the entities and relations of the text information with unified entity names, the entities and relations of the visual information with unified entity names and the entities and relations of the social context information with unified entity names.
According to an embodiment of the disclosure, the entity feature obtaining module includes an entity obtaining submodule and an entity feature obtaining submodule.
The entity obtaining submodule is configured to obtain the entities of the heterogeneous graph.
The entity feature obtaining submodule is configured to process the entities by using a pre-trained language model to obtain the entity features of the entities of the heterogeneous graph.
According to an embodiment of the disclosure, the event false recognition result obtaining module includes a heterogeneous graph adjacency matrix obtaining submodule, a multi-modal feature obtaining submodule and an event false identification result obtaining submodule.
The heterogeneous graph adjacency matrix obtaining submodule is configured to process the heterogeneous graph to obtain a heterogeneous graph adjacency matrix.
The multi-modal feature obtaining submodule is configured to input the heterogeneous graph adjacency matrix and the entity features into the feature extraction module of the hierarchical attention model to obtain the multi-modal features.
The event false identification result obtaining submodule is configured to input the multi-modal features sequentially into the fully connected layer and the classification layer of the hierarchical attention model to obtain the event false identification result.
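The data flow through these three submodules can be sketched as follows. This is a simplified stand-in, not the hierarchical attention model of the disclosure: it assumes an undirected graph with self-loops, uses dot-product attention over neighbours followed by mean pooling in place of the feature extraction module, and uses a single linear layer plus softmax in place of the fully connected and classification layers. All function names, features and weights are made-up placeholders rather than trained parameters.

```python
import math


def adjacency_matrix(nodes, edges):
    """Build a symmetric adjacency matrix with self-loops from (head, tail) pairs."""
    index = {n: i for i, n in enumerate(nodes)}
    a = [[0.0] * len(nodes) for _ in nodes]
    for head, tail in edges:
        i, j = index[head], index[tail]
        a[i][j] = a[j][i] = 1.0
    for i in range(len(nodes)):
        a[i][i] = 1.0  # self-loop so every node attends at least to itself
    return a


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def attention_pool(adj, feats):
    """Node-level attention over neighbours, then mean pooling into one vector."""
    dim = len(feats[0])
    updated = []
    for i, row in enumerate(adj):
        nbrs = [j for j, w in enumerate(row) if w > 0]
        weights = softmax([dot(feats[i], feats[j]) for j in nbrs])
        updated.append(
            [sum(w * feats[j][d] for w, j in zip(weights, nbrs)) for d in range(dim)]
        )
    return [sum(v[d] for v in updated) / len(updated) for d in range(dim)]


def classify(feature, weight, bias):
    """Linear layer followed by softmax over the two classes (true / false)."""
    logits = [dot(row, feature) + b for row, b in zip(weight, bias)]
    return softmax(logits)


nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
entity_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # placeholder entity features

adj = adjacency_matrix(nodes, edges)
multimodal_feature = attention_pool(adj, entity_features)
event_result = classify(multimodal_feature, [[0.5, -0.5], [-0.5, 0.5]], [0.0, 0.0])
```

Here `multimodal_feature` plays the role of the multi-modal features and `event_result` the role of the event false identification result, expressed as class probabilities.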
According to an embodiment of the disclosure, the temporal feature sequence generation module includes a local perception vector generation submodule and a temporal feature sequence obtaining submodule.
The local perception vector generation submodule is configured to splice the multi-modal features with the event false identification result corresponding to the multi-modal features to generate a local perception vector of the event information.
The temporal feature sequence obtaining submodule is configured to sort and combine the local perception vectors according to the respective release times of the event information to obtain the temporal feature sequence.
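The splicing and ordering steps can be illustrated with a short sketch. It assumes that feature splicing is plain vector concatenation and that each piece of event information carries a release timestamp; the dictionary keys, function names and numeric values are hypothetical.

```python
from datetime import datetime


def local_perception_vector(multimodal_feature, event_result):
    """Splice (concatenate) the multi-modal feature with its event-level result."""
    return list(multimodal_feature) + list(event_result)


def temporal_feature_sequence(events):
    """Order the local perception vectors by the release time of each event."""
    ordered = sorted(events, key=lambda e: e["published"])
    return [local_perception_vector(e["feature"], e["result"]) for e in ordered]


# Hypothetical event information about the same subject entry.
events = [
    {"published": datetime(2022, 11, 7, 12), "feature": [0.2, 0.8], "result": [0.9, 0.1]},
    {"published": datetime(2022, 11, 6, 9), "feature": [0.5, 0.5], "result": [0.3, 0.7]},
]
sequence = temporal_feature_sequence(events)
```

The earlier event ends up first in `sequence`, so a downstream sequence model would see the event information in release order.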
Fig. 4 schematically shows a block diagram of an electronic device suitable for implementing the above-described method according to an embodiment of the present disclosure. The electronic device shown in Fig. 4 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in fig. 4, an electronic device 400 according to an embodiment of the present disclosure includes a processor 401 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. Processor 401 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 401 may also include onboard memory for caching purposes. Processor 401 may include a single processing unit or multiple processing units for performing the different actions of the method flows in accordance with embodiments of the present disclosure.
The RAM 403 stores various programs and data necessary for the operation of the electronic device 400. The processor 401, the ROM 402 and the RAM 403 are connected to each other by a bus 404. The processor 401 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 402 and/or the RAM 403. Note that the programs may also be stored in one or more memories other than the ROM 402 and the RAM 403. The processor 401 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 400 may also include an input/output (I/O) interface 405, which is also connected to the bus 404. The electronic device 400 may also include one or more of the following components connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a display device such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN card, a modem, or the like. The communication section 409 performs communication processing via a network such as the internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read from it can be installed into the storage section 408 as needed.
According to embodiments of the present disclosure, method flows according to embodiments of the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411. The computer program, when executed by the processor 401, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to an embodiment of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium. Examples may include, but are not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 402 and/or the RAM 403 and/or one or more memories other than the ROM 402 and the RAM 403 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program that contains program code for performing the method provided by the embodiments of the present disclosure. When the computer program product runs on an electronic device, the program code causes the electronic device to implement the multi-modal false information detection method provided by the embodiments of the present disclosure.
The computer program, when executed by the processor 401, performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be carried on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted and distributed in the form of a signal over a network medium, downloaded and installed through the communication section 409, and/or installed from the removable medium 411. The program code contained in the computer program may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In accordance with embodiments of the present disclosure, program code for executing the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages. In particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, or the like. The program code may execute entirely on the user computing device, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or incorporated in various ways, even if such combinations or incorporations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in various ways without departing from the spirit or teaching of the present disclosure. All such combinations and/or incorporations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (10)

1. A multi-modal false information detection method comprises the following steps:
acquiring a plurality of event information about the same subject entry;
generating a heterogeneous graph corresponding to the event information according to external knowledge information and the event information;
extracting features from the entity of the heterogeneous graph to obtain the entity features of the entity of the heterogeneous graph;
processing the heterogeneous graph and the entity features to obtain multi-modal features of the event information and event false identification results;
generating a temporal feature sequence based on the multi-modal features and the event false identification results, wherein the temporal feature sequence comprises a plurality of local perception vectors, the local perception vectors are ordered according to respective release times of the event information, and each local perception vector comprises the spliced multi-modal features and event false identification result corresponding to the event information;
and obtaining a false judgment result about the subject entry according to the temporal feature sequence.
2. The method of claim 1, wherein the generating a heterogeneous graph corresponding to the event information from external knowledge information and the event information comprises:
obtaining multi-modal information of the event information, the multi-modal information including at least two of: text information in the event information, visual information in the event information, and social context information related to the event information;
extracting entities and relations of text information from the text information of the multi-modal information by using a preset extraction rule, extracting entities and relations of visual information from the visual information, and extracting entities and relations of social context information from the social context information;
obtaining entities and relations of external knowledge information related to the subject entry according to the external knowledge information;
and generating the heterogeneous graph corresponding to the event information according to the entity and the relation of the external knowledge information, the entity and the relation of the text information, the entity and the relation of the visual information and the entity and the relation of the social context information.
3. The method of claim 2, wherein the extracting textual information entities and relationships from the textual information of the multimodal information, visual information entities and relationships from the visual information, and social context information entities and relationships from the social context information using preset extraction rules comprises:
extracting entities of the text information from the text information and entities of the social context information from the social context information by utilizing a pre-trained language model;
extracting an entity of the visual information from the visual information by using a residual neural network model;
extracting the relation of the text information between the entities of the text information from the text information by utilizing a segmented convolutional neural network model according to the entities of the text information and the text information;
extracting the relation of the social context information between the entities of the social context information from the social context information by utilizing a segmented convolutional neural network model according to the entities of the social context information and the social context information;
and extracting the relation of the visual information between the entities of the visual information from the visual information by utilizing a segmented convolutional neural network model according to the entities of the visual information and the visual information.
4. The method of claim 2, wherein the obtaining entities and relations of external knowledge information related to the subject entry according to the external knowledge information comprises:
constructing a plurality of knowledge graphs of different types according to the external knowledge information, wherein the external knowledge information is acquired from open source information, the external knowledge information comprises information related to the subject entry, and the knowledge graphs comprise at least two of the following: domain knowledge graphs, general knowledge graphs, physical knowledge graphs and spatio-temporal knowledge graphs;
and acquiring the entities and the relations of the external knowledge information from the knowledge graphs.
5. The method of claim 2, wherein the generating the heterogeneous graph corresponding to the event information according to the entities and relationships of the external knowledge information, the entities and relationships of the textual information, the entities and relationships of the visual information, and the entities and relationships of the social context information comprises:
performing entity alignment on the entities of the external knowledge information, the entities of the text information, the entities of the visual information and the entities of the social context information by using an entity linking technology, to obtain entities of the external knowledge information, entities of the text information, entities of the visual information and entities of the social context information that have unified entity names;
and generating the heterogeneous graph corresponding to the event information by utilizing the entity and the relationship of the external knowledge information with unified entity names, the entity and the relationship of the text information with unified entity names, the entity and the relationship of the visual information with unified entity names and the entity and the relationship of the social context information with unified entity names.
6. The method of claim 1, wherein the extracting features from the entities of the heterogeneous graph to obtain entity features of the entities of the heterogeneous graph comprises:
acquiring the entities of the heterogeneous graph;
and processing the entities by using a pre-trained language model to obtain the entity features of the entities of the heterogeneous graph.
7. The method of claim 1, wherein the processing the heterogeneous graph and the entity features to obtain multi-modal features of the event information and event false identification results comprises:
processing the heterogeneous graph to obtain a heterogeneous graph adjacency matrix;
inputting the heterogeneous graph adjacency matrix and the entity features into a feature extraction module of a hierarchical attention model to obtain the multi-modal features;
and sequentially inputting the multi-modal features into a fully connected layer and a classification layer of the hierarchical attention model to obtain the event false identification result.
8. The method of claim 1, wherein the generating a temporal feature sequence based on the multi-modal features and the event false identification results comprises:
performing feature splicing on the multi-modal features and the event false identification results corresponding to the multi-modal features to generate local perception vectors of the event information;
and sorting and combining the local perception vectors according to the respective release times of the event information to obtain the temporal feature sequence.
9. A multimodal false information detection apparatus comprising:
an event information acquisition module, configured to acquire a plurality of event information about the same subject entry;
a heterogeneous graph generation module, configured to generate a heterogeneous graph corresponding to the event information according to external knowledge information and the event information;
an entity feature obtaining module, configured to extract features from the entities of the heterogeneous graph to obtain entity features of the entities of the heterogeneous graph;
an event false recognition result obtaining module, configured to process the heterogeneous graph and the entity features to obtain multi-modal features of the event information and event false recognition results;
a temporal feature sequence generation module, configured to generate a temporal feature sequence based on the multi-modal features and the event false recognition results, wherein the temporal feature sequence includes a plurality of local perception vectors, the local perception vectors are ordered according to respective release times of the event information, and each local perception vector includes the spliced multi-modal features and event false recognition result corresponding to the event information;
and a false recognition result obtaining module, configured to obtain a false recognition result about the subject entry according to the temporal feature sequence.
10. An electronic device, comprising:
one or more processors;
a memory configured to store one or more instructions,
wherein the one or more instructions, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.
CN202211388128.0A 2022-11-07 2022-11-07 Multi-mode false information detection method, device and equipment Pending CN115525781A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211388128.0A CN115525781A (en) 2022-11-07 2022-11-07 Multi-mode false information detection method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211388128.0A CN115525781A (en) 2022-11-07 2022-11-07 Multi-mode false information detection method, device and equipment

Publications (1)

Publication Number Publication Date
CN115525781A true CN115525781A (en) 2022-12-27

Family

ID=84704789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211388128.0A Pending CN115525781A (en) 2022-11-07 2022-11-07 Multi-mode false information detection method, device and equipment

Country Status (1)

Country Link
CN (1) CN115525781A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118052600A (en) * 2024-04-16 2024-05-17 成都信通信息技术有限公司 Method for screening advertisement delivery platform by utilizing digital analysis and related equipment

Similar Documents

Publication Publication Date Title
US11599714B2 (en) Methods and systems for modeling complex taxonomies with natural language understanding
CN111107048B (en) Phishing website detection method and device and storage medium
WO2017121076A1 (en) Information-pushing method and device
CN113688310B (en) Content recommendation method, device, equipment and storage medium
CN112766284B (en) Image recognition method and device, storage medium and electronic equipment
CN107291774B (en) Error sample identification method and device
CN113779240A (en) Information identification method, device, computer system and readable storage medium
CN113297525B (en) Webpage classification method, device, electronic equipment and storage medium
CN117633228A (en) Model training method and device
CN113033707B (en) Video classification method and device, readable medium and electronic equipment
CN115525781A (en) Multi-mode false information detection method, device and equipment
CN114579878A (en) Training method of false news discrimination model, false news discrimination method and device
CN112348615B (en) Method and device for auditing information
CN116756281A (en) Knowledge question-answering method, device, equipment and medium
CN116560661A (en) Code optimization method, device, equipment and storage medium
CN116451700A (en) Target sentence generation method, device, equipment and storage medium
CN115759292A (en) Model training method and device, semantic recognition method and device, and electronic device
WO2022100401A1 (en) Image recognition-based price information processing method and apparatus, device, and medium
CN114579876A (en) False information detection method, device, equipment and medium
CN113935334A (en) Text information processing method, device, equipment and medium
CN113609018A (en) Test method, training method, device, apparatus, medium, and program product
CN112906726A (en) Model training method, image processing method, device, computing device and medium
CN116070695B (en) Training method of image detection model, image detection method and electronic equipment
CN114385903B (en) Application account identification method and device, electronic equipment and readable storage medium
CN113360734B (en) Webpage classification method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination