CN115169293A - Text steganalysis method, system, device and storage medium - Google Patents
Text steganalysis method, system, device and storage medium
- Publication number
- CN115169293A (application number CN202211068809.9A)
- Authority
- CN
- China
- Prior art keywords
- text
- graph
- analyzed
- neural network
- information
- Prior art date
- Legal status: Pending
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/151—Handling natural language data; Text processing; Use of codes for handling textual entities; Transformation
- G06F16/35—Information retrieval; Database structures therefor; File system structures therefor; of unstructured textual data; Clustering; Classification
- G06F21/6218—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity; Protecting data; Protecting access to data via a platform, e.g. using keys or access control rules, to a system of files or objects, e.g. local or distributed file system or database
- G06F40/211—Handling natural language data; Natural language analysis; Parsing; Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/216—Handling natural language data; Natural language analysis; Parsing; Parsing using statistical methods
Abstract
The invention discloses a text steganalysis method, system, device and storage medium, comprising the following steps: acquiring a text to be analyzed and inputting it into a pre-trained multi-graph neural network to obtain a network output; if the network output is smaller than a preset threshold value, the text to be analyzed is a normal text, otherwise it is a steganographic text. When the multi-graph neural network is trained, a logic graph, a semantic graph and a syntactic graph are generated from the texts in the training sample set, the statistical, semantic and syntactic relations among the texts are analyzed, and the three relations are integrated to perform message updating and feature extraction on the texts, obtaining more discriminative features; this makes up for the defect that sequence models do not consider global features in steganalysis and greatly improves the analysis efficiency of the multi-graph neural network. The three updated graphs are fused into a total graph, and the total graph is pooled to obtain the final representation of the text to be analyzed, so that the final representation contains richer information and the accuracy of text steganalysis is improved.
Description
Technical Field
The invention relates to a text steganalysis method, system, device and storage medium, and belongs to the technical field of encryption.
Background
With the continuous development of the internet, people frequently communicate over it, and the security problems in information transmission cannot be ignored. Lawbreakers hide secret information in text by some steganographic method for covert transmission, which poses a huge hidden danger to people's lives, property and social stability. Analyzing a text to judge whether it contains secret information, namely text steganalysis, has therefore been widely adopted. One class of methods is neural-network-based text steganalysis, which extracts text features with a neural network and judges whether the text is steganographic according to the different distributions of those features in a high-dimensional semantic space.
At present, methods for text steganalysis using neural networks include: extracting features of different scales from the text with convolution kernels of different sizes for judgment; fusing and analyzing the local and global features extracted by convolutional and recurrent neural networks; and extracting the salient features of the text with a multi-head attention mechanism for judgment.
The method that combines the local features and the long-distance features of the text extracted by convolutional and recurrent neural networks obtains more discriminative features, but these features contain some irrelevant, redundant components, which affects the efficiency of text steganalysis.
The method that extracts the salient features of the text with a multi-head attention mechanism pays more attention to suspicious information in the text, and the multi-head operation accelerates feature extraction and thus improves the efficiency of text steganalysis; however, it only focuses on the feature relations within the current text and does not consider the global correlation between texts.
Disclosure of Invention
The invention aims to provide a text steganalysis method, system, device and storage medium that solve the problems in the prior art that text steganalysis efficiency is low and the global correlation between texts is not considered.
To achieve this purpose, the invention adopts the following technical solution:
A text steganalysis method, comprising:
acquiring a text to be analyzed;
inputting a text to be analyzed into a pre-trained multi-graph neural network to obtain network output, wherein if the network output is smaller than a preset threshold value, the text to be analyzed is a normal text, and otherwise, the text to be analyzed is a steganographic text;
the multi-graph neural network is trained by:
acquiring a training sample set, and converting texts in the training sample set into word vectors;
inputting the word vectors into a composition module during each training, generating three graphs comprising a logic graph, a semantic graph and a syntactic graph, and updating the intra-graph information of the three graphs according to the information of each target node in the three graphs and of the nodes around that target node;
carrying out inter-graph fusion on the three updated graphs to obtain a total graph;
performing graph pooling on the general graph to obtain a final representation of a text;
inputting the final representation of the text into a classifier to obtain classifier output;
and updating the multi-graph neural network according to the classifier output, with the cross-entropy function as the loss function, and repeating the iterative training until all texts in the training sample set have been used, to obtain the trained multi-graph neural network (a compressed sketch of this training loop is given below).
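By way of illustration only, the following is a compressed PyTorch sketch of this training procedure. It is not the patented implementation: the layer sizes, the single-step max-based message passing, the mean fusion and pooling, and the learnable retention coefficient are assumptions introduced for the sketch; only the overall flow (embed the text, update within each of the three graphs, fuse, pool, classify, optimize the cross-entropy loss) follows the steps listed above.

```python
import torch
import torch.nn as nn

class MultiGraphNet(nn.Module):
    """Toy multi-graph network: embed -> per-graph update -> fuse -> pool -> classify."""
    def __init__(self, vocab_size, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)      # embedding layer: word ids -> word vectors
        self.retain = nn.Parameter(torch.tensor(0.0))   # retention coefficient b (assumed learnable here)
        self.classifier = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())

    def update(self, adj, x):
        # collect: per-dimension max over edge-weighted neighbour vectors, then aggregate with the node itself
        msg = (adj.unsqueeze(-1) * x.unsqueeze(0)).max(dim=1).values
        b = torch.sigmoid(self.retain)
        return b * x + (1 - b) * msg

    def forward(self, token_ids, adj_logic, adj_sem, adj_syn):
        x = self.embed(token_ids)                                    # (n_words, dim)
        updated = [self.update(a, x) for a in (adj_logic, adj_sem, adj_syn)]
        total = torch.stack(updated).mean(dim=0)                     # inter-graph fusion (assumed: mean)
        text_repr = total.mean(dim=0)                                # graph pooling (assumed: mean)
        return self.classifier(text_repr)                            # output in (0, 1)

model = MultiGraphNet(vocab_size=5000)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()                                               # cross-entropy loss for two classes

token_ids = torch.randint(0, 5000, (12,))                            # a toy text of 12 word ids
adjs = [torch.rand(12, 12) for _ in range(3)]                        # toy edge-weight matrices for the three graphs
label = torch.tensor([1.0])                                          # 1 = steganographic, 0 = normal

prediction = model(token_ids, *adjs)
loss = loss_fn(prediction, label)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```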
Preferably, the training sample set consists of a steganographic sample data set and a normal sample data set.
Preferably, in the process of generating the logic graph, the edge weight in the logic graph is calculated by the following formula:
e_ab = log( p(a, b) / ( p(a) · p(b) ) )
wherein e_ab is the edge weight between words a and b in the logic graph, p(a, b) is the co-occurrence probability of words a and b, p(a) is the probability that word a occurs in the corpus, and p(b) is the probability that word b occurs in the corpus.
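By way of illustration only, the Python sketch below computes PMI-style edge weights of this form from sliding windows over a toy corpus; the window length and the filtering of non-positive weights are assumptions made for the sketch, not taken from the patent.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_edge_weights(texts, window=5):
    word_count, pair_count, n_windows = Counter(), Counter(), 0
    for words in texts:
        for i in range(max(1, len(words) - window + 1)):
            win = set(words[i:i + window])
            n_windows += 1
            word_count.update(win)
            pair_count.update(frozenset(p) for p in combinations(sorted(win), 2))
    weights = {}
    for pair, c_ab in pair_count.items():
        a, b = sorted(pair)
        p_ab = c_ab / n_windows
        p_a, p_b = word_count[a] / n_windows, word_count[b] / n_windows
        pmi = math.log(p_ab / (p_a * p_b))
        if pmi > 0:          # keep only positively associated pairs (a common convention, assumed here)
            weights[(a, b)] = pmi
    return weights

print(pmi_edge_weights([["the", "secret", "message", "is", "hidden"],
                        ["the", "message", "is", "normal"]]))
```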
Preferably, in the process of generating the semantic graph, the edge weight in the semantic graph is calculated by the following formula:
e_ab = N_sem(a, b) / N_co(a, b)
wherein e_ab is the edge weight between words a and b in the semantic graph, N_sem(a, b) is the number of sliding windows in which words a and b have a semantic relation, and N_co(a, b) is the number of sliding windows in which words a and b occur simultaneously.
Preferably, in the process of generating the syntactic graph, the edge weight in the syntactic graph is calculated by the following formula:
e_ab = N_syn(a, b) / N_co(a, b)
wherein e_ab is the edge weight between words a and b in the syntactic graph, N_syn(a, b) is the number of sliding windows in which words a and b have a syntactic relation, and N_co(a, b) is the number of sliding windows in which words a and b occur simultaneously.
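The semantic and syntactic weights above share the same counting scheme and differ only in which relation is tested inside a window. The sketch below illustrates that scheme with a pluggable relation predicate; the predicate shown is purely hypothetical, since this section does not specify how the semantic or syntactic relation itself is detected.

```python
from collections import Counter
from itertools import combinations

def relation_edge_weights(texts, has_relation, window=5):
    co_windows, rel_windows = Counter(), Counter()
    for words in texts:
        for i in range(max(1, len(words) - window + 1)):
            win = words[i:i + window]
            for a, b in combinations(sorted(set(win)), 2):
                co_windows[(a, b)] += 1            # windows in which a and b occur simultaneously
                if has_relation(a, b, win):
                    rel_windows[(a, b)] += 1       # windows in which they are also related
    # e_ab = N_rel(a, b) / N_co(a, b)
    return {pair: rel_windows[pair] / n for pair, n in co_windows.items()}

# purely illustrative stand-in for a semantic (or syntactic) relation test
toy_relation = lambda a, b, win: a[:3] == b[:3]
print(relation_edge_weights([["steganography", "steganalysis", "hides", "text"]], toy_relation))
```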
Preferably, the intra-graph information updating of the logic graph, the semantic graph and the syntactic graph includes:
for any target node in any one of the graphs, collecting information from the surrounding nodes of the target node by:
mn = max_{c ∈ N(p)} ( e_c · x_c )
wherein mn is the collected information, max takes the maximum value of each dimension of the surrounding-node information, N(p) is the set of nodes connected to the target node p, e_c is the edge weight between the node of word c and the target node, and x_c is the word vector of word c;
aggregating the collected information with the target node itself by:
x'_a = b · x_a + (1 - b) · mn
wherein x'_a is the aggregated word vector of word a, x_a is the original word vector of word a, and b ∈ [0, 1] indicates the extent to which the original node information is retained.
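A small numpy sketch of this two-step update for a single target node follows; the convex-combination form of the aggregation and the value b = 0.5 are assumptions consistent with the definitions above.

```python
import numpy as np

def update_node(x_a, neighbor_vecs, neighbor_weights, b=0.5):
    # collect: per-dimension maximum of the edge-weighted neighbour word vectors
    weighted = neighbor_weights[:, None] * neighbor_vecs      # e_c * x_c for every neighbour c
    mn = weighted.max(axis=0)
    # aggregate: keep a fraction b of the node's own vector, (1 - b) of the collected message
    return b * x_a + (1 - b) * mn

x_a = np.array([0.2, 0.4, 0.1])                               # word vector of the target node a
neighbors = np.array([[0.5, 0.1, 0.3],
                      [0.0, 0.9, 0.2]])                       # word vectors of two surrounding nodes
weights = np.array([0.8, 0.4])                                # edge weights e_c to the target node
print(update_node(x_a, neighbors, weights))
```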
Preferably, the expression of the loss function is:
L = -(1/N) · Σ_{i=1..N} [ y_i · log(p_i) + (1 - y_i) · log(1 - p_i) ]
wherein y_i is the true label of sample i, p_i is the predicted label of sample i, and N is the number of samples.
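Written out in code, the loss reads as follows (numpy; the clipping constant is only there to avoid log(0)).

```python
import numpy as np

def cross_entropy(y, p, eps=1e-12):
    p = np.clip(p, eps, 1 - eps)     # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

print(cross_entropy(np.array([1, 0, 1]), np.array([0.9, 0.2, 0.6])))
```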
A text steganalysis system comprising:
a text acquisition module: configured to obtain a text to be analyzed;
the text steganalysis module: configured to input the text to be analyzed into a pre-trained multi-graph neural network to obtain a network output, wherein if the network output is smaller than a preset threshold value, the text to be analyzed is a normal text, and otherwise, the text to be analyzed is a steganographic text;
the text steganalysis module comprises a network training unit and is used for training the multi-graph neural network by the following method:
acquiring a training sample set, and converting texts in the training sample set into word vectors;
inputting the word vectors into a composition module during each training, generating three graphs comprising a logic graph, a semantic graph and a syntactic graph, and updating the intra-graph information of the three graphs according to the information of each target node in the three graphs and the nodes around the target node;
carrying out inter-graph fusion on the three updated graphs to obtain a total graph;
pooling the general graph to obtain a final representation of the text;
inputting the final representation of the text into a classifier to obtain classifier output;
and updating the multi-graph neural network by taking the cross entropy function as a loss function according to the output of the classifier, and repeatedly performing iterative training until the texts in the training sample set are used completely to obtain the trained multi-graph neural network.
A text steganalysis device comprises a processor and a storage medium;
the storage medium is to store instructions;
the processor is configured to operate according to the instructions to perform the steps of any of the above methods.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any of the above.
Compared with the prior art, the invention has the following beneficial effects:
according to the text steganalysis method, the text steganalysis system, the text steganalysis device and the storage medium, steganalysis is carried out on a text to be analyzed through a multi-graph neural network trained in advance, a composition module in the multi-graph neural network is utilized to generate a logic diagram, a semantic diagram and a syntactic diagram, statistical relations, semantic relations and syntactic relations among the texts are analyzed, the three relations are integrated to carry out message updating and feature extraction on the text, features with higher distinguishing degree are obtained, the defect that a sequence model does not consider global features in steganalysis is made up, and the analysis efficiency of the multi-graph neural network is greatly improved; and performing inter-graph fusion on the updated logic diagram, the semantic meaning and the syntactic diagram to obtain a total diagram, pooling the total diagram to obtain a final representation of the text to be analyzed, so that the final representation contains richer information, and the accuracy of steganalysis of the text is improved.
Drawings
Fig. 1 is a flowchart of a text steganalysis method according to an embodiment of the present invention;
FIG. 2 is a second flowchart of a text steganalysis method according to an embodiment of the present invention;
fig. 3 is a schematic diagram of updating the information in the graph according to an embodiment of the present invention.
Detailed Description
The present invention is further described with reference to the accompanying drawings, and the following examples are only for clearly illustrating the technical solutions of the present invention, and should not be taken as limiting the scope of the present invention.
Example 1
As shown in fig. 1, a text steganalysis method provided in an embodiment of the present invention includes:
s1, obtaining a text to be analyzed.
The text to be analyzed is received through a communication receiving terminal.
S2, inputting the text to be analyzed into the pre-trained multi-graph neural network to obtain the network output, wherein if the network output is smaller than a preset threshold value, the text to be analyzed is a normal text, and otherwise, the text to be analyzed is a steganographic text (i.e. a text containing secret information).
The multi-graph neural network is trained in advance: a training sample set is formed from a steganographic sample data set and a normal sample data set, and after each training the parameters of the multi-graph neural network are updated with the cross-entropy function as the loss function, until the texts in the training sample set have all been used, yielding the trained multi-graph neural network.
In this embodiment, the specific process of training is as follows:
6000 steganographic samples generated by the RNN-Stega steganography method are used as the steganographic sample data set, and 6000 normal samples captured from real scenes are used as the normal sample data set; together they form a training sample set containing 12000 texts.
The training sample set containing 12000 texts is input into the embedding layer of the multi-graph neural network, and each text is converted into word vectors, yielding the word-vector set X.
The word-vector set X is input into the composition module of the multi-graph neural network to generate three graphs, namely a logic graph, a semantic graph and a syntactic graph. Each graph is represented as G = (V, E), wherein V denotes the word nodes and E denotes the edge weights.
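For concreteness, one way such a graph G = (V, E) over word nodes could be held next to the embedded word vectors is sketched below; the toy vocabulary, the 64-dimensional embedding and the example edge weights are all illustrative assumptions.

```python
import torch
import torch.nn as nn

vocab = {"the": 0, "secret": 1, "message": 2, "is": 3, "hidden": 4}     # toy vocabulary
embed = nn.Embedding(num_embeddings=len(vocab), embedding_dim=64)       # the embedding layer

text = ["the", "secret", "message", "is", "hidden"]
token_ids = torch.tensor([vocab[w] for w in text])
X = embed(token_ids)                          # word-vector set X: one row per word node

# one of the three graphs: V = word nodes, E = {(word_a, word_b): edge weight}
logic_graph = {
    "V": text,
    "E": {("secret", "message"): 1.2, ("message", "hidden"): 0.7},      # toy weights
}
print(X.shape, logic_graph["E"])
```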
The edge weights in the logic graph are calculated by the following formula:
e_ab = log( p(a, b) / ( p(a) · p(b) ) )
wherein e_ab is the edge weight between words a and b in the logic graph, p(a, b) is the co-occurrence probability of words a and b, p(a) is the probability that word a occurs in the corpus, and p(b) is the probability that word b occurs in the corpus.
The edge weights in the semantic graph are calculated by the following formula:
e_ab = N_sem(a, b) / N_co(a, b)
wherein e_ab is the edge weight between words a and b in the semantic graph, N_sem(a, b) is the number of sliding windows in which words a and b have a semantic relation, and N_co(a, b) is the number of sliding windows in which words a and b occur simultaneously.
The edge weights in the syntactic graph are calculated by the following formula:
e_ab = N_syn(a, b) / N_co(a, b)
wherein e_ab is the edge weight between words a and b in the syntactic graph, N_syn(a, b) is the number of sliding windows in which words a and b have a syntactic relation, and N_co(a, b) is the number of sliding windows in which words a and b occur simultaneously.
The intra-graph information of the three graphs is updated separately. Taking the updating of a target node in a single graph as an example (as shown in fig. 3, node a is the target node, and all nodes connected to a by solid lines are its surrounding nodes), the target-node updating process comprises two steps: collection and aggregation.
First, for any target node in any graph, information is collected from the surrounding nodes of the target node by the following formula:
mn = max_{c ∈ N(p)} ( e_c · x_c )
wherein mn is the collected information, max takes the maximum value of each dimension of the surrounding-node information, N(p) is the set of nodes connected to the target node p, e_c is the edge weight between the node of word c and the target node, and x_c is the word vector of word c.
Then the collected information is aggregated with the target node itself by the following formula:
x'_a = b · x_a + (1 - b) · mn
wherein x'_a is the aggregated word vector of word a, x_a is the original word vector of word a, and b ∈ [0, 1] indicates the extent to which the original node information is retained.
In order to make the obtained text representation contain richer information, the updated results of the three graphs are subjected to inter-graph fusion to obtain a total graph containing the logic, semantic and syntactic relations among the texts.
A graph pooling operation is then performed on the total graph to obtain the final representation of the text.
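Since the exact fusion and pooling formulas are not reproduced in this text, the following sketch uses element-wise mean fusion and mean-over-nodes pooling as stand-ins to illustrate the two steps.

```python
import numpy as np

def fuse_and_pool(h_logic, h_sem, h_syn):
    # inter-graph fusion: combine the three updated node-feature matrices into one total graph (assumed: mean)
    h_total = (h_logic + h_sem + h_syn) / 3.0
    # graph pooling: collapse the node dimension into a single text representation (assumed: mean)
    return h_total.mean(axis=0)

h = [np.random.rand(12, 64) for _ in range(3)]   # 12 word nodes, 64-dim features for each of the three graphs
text_repr = fuse_and_pool(*h)
print(text_repr.shape)                           # (64,)
```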
The final representation of the text is input into a classifier, whose output p is a value between 0 and 1, and whether the text contains secret information is judged as follows: a threshold η is set; when p ≥ η, the text is considered to contain secret information (a steganographic text), and when p < η, the text is considered to be a normal text.
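The thresholding itself amounts to the following check; η = 0.5 is only an example value, since the patent leaves the threshold as a preset parameter.

```python
def classify(p, eta=0.5):
    # p is the classifier output in (0, 1); eta is the preset threshold
    return "steganographic text" if p >= eta else "normal text"

print(classify(0.83))   # steganographic text
print(classify(0.12))   # normal text
```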
The expression for the loss function is:
L = -(1/N) · Σ_{i=1..N} [ y_i · log(p_i) + (1 - y_i) · log(1 - p_i) ]
wherein y_i is the true label of sample i, p_i is the predicted label of sample i, and N is the number of samples.
After the multi-graph neural network is trained, the text to be analyzed is input into the multi-graph neural network to obtain the network output; if the network output is smaller than the preset threshold value, the text to be analyzed is a normal text, otherwise it is a steganographic text.
The text steganalysis method provided by the embodiment of the invention can be represented by the flowchart shown in fig. 2: the text to be analyzed is input into the pre-trained multi-graph neural network; after intra-graph information updating, inter-graph information fusion (namely, inter-graph fusion of the results obtained after updating the three graphs) and pooling, the pooled result is input into the classifier to obtain the output of the multi-graph neural network; if the network output is smaller than the preset threshold value, the text to be analyzed is a normal text, otherwise it is a steganographic text, and the steganalysis of the text to be analyzed is completed.
Example 2
The embodiment of the invention provides a text steganalysis system, which comprises:
a text acquisition module: configured to obtain a text to be analyzed;
the text steganalysis module: configured to input the text to be analyzed into a pre-trained multi-graph neural network to obtain a network output, wherein if the network output is smaller than a preset threshold value, the text to be analyzed is a normal text, and otherwise, the text to be analyzed is a steganographic text;
the text steganalysis module comprises a network training unit and is used for training the multi-graph neural network by the following method:
acquiring a training sample set, and converting texts in the training sample set into word vectors;
inputting the word vectors into a composition module during each training, generating three graphs comprising a logic graph, a semantic graph and a syntactic graph, and updating the intra-graph information of the three graphs according to the information of each target node in the three graphs and the nodes around the target node;
carrying out inter-graph fusion on the three updated graphs to obtain a total graph;
pooling the general graph to obtain a final representation of the text;
inputting the final representation of the text into a classifier to obtain classifier output;
and updating the multi-graph neural network by taking the cross entropy function as a loss function according to the output of the classifier, and repeatedly performing iterative training until the texts in the training sample set are used completely to obtain the trained multi-graph neural network.
Example 3
The embodiment of the invention provides a text steganalysis device, which comprises a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is configured to operate in accordance with the instructions to perform the steps of the method of:
acquiring a text to be analyzed;
inputting a text to be analyzed into a pre-trained multi-graph neural network to obtain network output, wherein if the network output is smaller than a preset threshold value, the text to be analyzed is a normal text, and otherwise, the text to be analyzed is a steganographic text;
the multi-graph neural network is trained by:
acquiring a training sample set, and converting texts in the training sample set into word vectors;
inputting the word vectors into a composition module during each training, generating three graphs comprising a logic graph, a semantic graph and a syntactic graph, and updating the intra-graph information of the three graphs according to the information of each target node in the three graphs and the nodes around the target node;
carrying out inter-graph fusion on the three updated graphs to obtain a total graph;
pooling the general graph to obtain a final representation of the text;
inputting the final representation of the text into a classifier to obtain classifier output;
and updating the multi-graph neural network by taking the cross entropy function as a loss function according to the output of the classifier, and repeatedly performing iterative training until the texts in the training sample set are used up to obtain the trained multi-graph neural network.
Example 4
An embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the following steps of a method:
acquiring a text to be analyzed;
inputting a text to be analyzed into a pre-trained multi-graph neural network to obtain network output, wherein if the network output is smaller than a preset threshold value, the text to be analyzed is a normal text, and otherwise, the text to be analyzed is a steganographic text;
the multi-graph neural network is trained by:
acquiring a training sample set, and converting texts in the training sample set into word vectors;
inputting the word vectors into a composition module during each training, generating three graphs comprising a logic graph, a semantic graph and a syntactic graph, and updating the intra-graph information of the three graphs according to the information of each target node in the three graphs and the nodes around the target node;
carrying out inter-graph fusion on the three updated graphs to obtain a total graph;
pooling the general graph to obtain a final representation of the text;
inputting the final representation of the text into a classifier to obtain classifier output;
and updating the multi-graph neural network by taking the cross entropy function as a loss function according to the output of the classifier, and repeatedly performing iterative training until the texts in the training sample set are used up to obtain the trained multi-graph neural network.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, it is possible to make various improvements and modifications without departing from the technical principle of the present invention, and those improvements and modifications should be considered as the protection scope of the present invention.
Claims (10)
1. A method for steganalysis of text comprising:
acquiring a text to be analyzed;
inputting a text to be analyzed into a pre-trained multi-graph neural network to obtain network output, wherein if the network output is smaller than a preset threshold value, the text to be analyzed is a normal text, and otherwise, the text to be analyzed is a steganographic text;
the multi-graph neural network is trained by:
acquiring a training sample set, and converting texts in the training sample set into word vectors;
inputting the word vectors into a composition module during each training, generating three graphs comprising a logic graph, a semantic graph and a syntactic graph, and updating the intra-graph information of the three graphs according to the information of each target node in the three graphs and the nodes around the target node;
carrying out inter-graph fusion on the three updated graphs to obtain a total graph;
pooling the general graph to obtain a final representation of the text;
inputting the final representation of the text into a classifier to obtain classifier output;
and updating the multi-graph neural network by taking the cross entropy function as a loss function according to the output of the classifier, and repeatedly performing iterative training until the texts in the training sample set are used completely to obtain the trained multi-graph neural network.
2. The method according to claim 1, wherein the training sample set comprises a steganographic sample set and a normal sample set.
3. The method of claim 1, wherein in the step of generating the logic graph, the edge weight in the logic graph is calculated by the following formula:
e_ab = log( p(a, b) / ( p(a) · p(b) ) )
wherein e_ab is the edge weight between words a and b in the logic graph, p(a, b) is the co-occurrence probability of words a and b, p(a) is the probability that word a occurs in the corpus, and p(b) is the probability that word b occurs in the corpus.
4. The method of claim 1, wherein in the step of generating the semantic graph, the edge weight in the semantic graph is calculated by the following formula:
e_ab = N_sem(a, b) / N_co(a, b)
wherein e_ab is the edge weight between words a and b in the semantic graph, N_sem(a, b) is the number of sliding windows in which words a and b have a semantic relation, and N_co(a, b) is the number of sliding windows in which words a and b occur simultaneously.
5. The method of claim 1, wherein in the step of generating the syntactic graph, the edge weight in the syntactic graph is calculated by the following formula:
e_ab = N_syn(a, b) / N_co(a, b)
wherein e_ab is the edge weight between words a and b in the syntactic graph, N_syn(a, b) is the number of sliding windows in which words a and b have a syntactic relation, and N_co(a, b) is the number of sliding windows in which words a and b occur simultaneously.
6. The method of claim 1, wherein the intra-graph information updating of the logic graph, the semantic graph and the syntactic graph comprises:
for any target node in any one of the graphs, collecting information from the surrounding nodes of the target node by:
mn = max_{c ∈ N(p)} ( e_c · x_c )
wherein mn is the collected information, max takes the maximum value of each dimension of the surrounding-node information, N(p) is the set of nodes connected to the target node p, e_c is the edge weight between the node of word c and the target node, and x_c is the word vector of word c;
aggregating the collected information with the target node itself by:
x'_a = b · x_a + (1 - b) · mn
wherein x'_a is the aggregated word vector of word a, x_a is the original word vector of word a, and b ∈ [0, 1] indicates the extent to which the original node information is retained.
8. A text steganalysis system comprising:
a text acquisition module: configured to obtain a text to be analyzed;
the text steganalysis module: configured to input the text to be analyzed into a pre-trained multi-graph neural network to obtain a network output, wherein if the network output is smaller than a preset threshold value, the text to be analyzed is a normal text, and otherwise, the text to be analyzed is a steganographic text;
the text steganalysis module comprises a network training unit and is used for training the multi-graph neural network by the following method:
acquiring a training sample set, and converting texts in the training sample set into word vectors;
inputting the word vectors into a composition module during each training, generating three graphs comprising a logic graph, a semantic graph and a syntactic graph, and updating the intra-graph information of the three graphs according to the information of each target node in the three graphs and the nodes around the target node;
carrying out inter-graph fusion on the three updated graphs to obtain a total graph;
performing graph pooling on the general graph to obtain a final representation of a text;
inputting the final representation of the text into a classifier to obtain classifier output;
and updating the multi-graph neural network by taking the cross entropy function as a loss function according to the output of the classifier, and repeatedly performing iterative training until the texts in the training sample set are used up to obtain the trained multi-graph neural network.
9. A text steganalysis device is characterized by comprising a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is configured to operate according to the instructions to perform the steps of the method according to any one of claims 1 to 7.
10. Computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of a method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211068809.9A CN115169293A (en) | 2022-09-02 | 2022-09-02 | Text steganalysis method, system, device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211068809.9A CN115169293A (en) | 2022-09-02 | 2022-09-02 | Text steganalysis method, system, device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115169293A (en) | 2022-10-11
Family
ID=83482220
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211068809.9A Pending CN115169293A (en) | 2022-09-02 | 2022-09-02 | Text steganalysis method, system, device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115169293A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115952528A (en) * | 2023-03-14 | 2023-04-11 | 南京信息工程大学 | Multi-scale combined text steganography method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110110318A (en) * | 2019-01-22 | 2019-08-09 | 清华大学 | Text Stego-detection method and system based on Recognition with Recurrent Neural Network |
CN114048314A (en) * | 2021-11-11 | 2022-02-15 | 长沙理工大学 | Natural language steganalysis method |
CN114528374A (en) * | 2022-01-19 | 2022-05-24 | 浙江工业大学 | Movie comment emotion classification method and device based on graph neural network |
-
2022
- 2022-09-02 CN CN202211068809.9A patent/CN115169293A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110110318A (en) * | 2019-01-22 | 2019-08-09 | 清华大学 | Text Stego-detection method and system based on Recognition with Recurrent Neural Network |
CN114048314A (en) * | 2021-11-11 | 2022-02-15 | 长沙理工大学 | Natural language steganalysis method |
CN114528374A (en) * | 2022-01-19 | 2022-05-24 | 浙江工业大学 | Movie comment emotion classification method and device based on graph neural network |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115952528A (en) * | 2023-03-14 | 2023-04-11 | 南京信息工程大学 | Multi-scale combined text steganography method and system |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20221011