CN113961704A - Text-based risk prevention and control processing method, device and equipment - Google Patents

Text-based risk prevention and control processing method, device and equipment

Info

Publication number
CN113961704A
Authority
CN
China
Prior art keywords
text data
discriminator
historical
generator
countermeasure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111266143.3A
Other languages
Chinese (zh)
Other versions
CN113961704B (en)
Inventor
郑行
武斯斯
邹泊滔
王鑫云
严淮
张天翼
孙清清
陈珺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202111266143.3A priority Critical patent/CN113961704B/en
Publication of CN113961704A publication Critical patent/CN113961704A/en
Application granted granted Critical
Publication of CN113961704B publication Critical patent/CN113961704B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of this specification disclose a text-based risk prevention and control processing method, apparatus, and device. The method comprises: obtaining historical text data for a target service; inputting the historical text data into a pre-constructed generator to generate countermeasure (adversarial) text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold; inputting data pairs, each constructed from historical text data and its corresponding countermeasure text data, into a pre-constructed discriminator; scoring the historical text data and the corresponding countermeasure text data separately with the discriminator; training the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data, in combination with a preset loss function, to obtain a trained generator and a trained discriminator; and finally performing text-data-based risk prevention and control processing on the target service with the trained discriminator.

Description

Text-based risk prevention and control processing method, device and equipment
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a text-based risk prevention and control processing method, apparatus, and device.
Background
In business fields such as anti-money-laundering sanctions screening, theft risk, and content security, there are many text countermeasure (adversarial) scenarios. Black-market operators try to bypass interception by risk prevention and control systems built around text classification algorithms, using countermeasure forms such as keyword rewriting, special character insertion, character repetition, syllable replacement, partial deletion, synonym or near-synonym replacement, and mixed-language writing, so that they can keep conducting transactions or publishing statements. This puts great pressure on the risk prevention and control of the business. It is therefore necessary to provide a better technical solution for text attack and defense, so as to strengthen risk control against text attacks.
Disclosure of Invention
The embodiments of this specification aim to provide a better technical solution for text attack and defense, so as to strengthen risk control against text attacks.
In order to implement the above technical solution, the embodiments of the present specification are implemented as follows:
An embodiment of this specification provides a text-based risk prevention and control processing method, comprising: acquiring historical text data for a target service; inputting the historical text data into a pre-constructed generator to generate countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold; inputting data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data separately with the discriminator, and training the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data, in combination with a preset loss function, to obtain a trained generator and a trained discriminator; and performing text-data-based risk prevention and control processing on the target service with the trained discriminator.
An embodiment of this specification provides a text-based risk prevention and control processing method applied to a blockchain system, comprising: acquiring training rule information for the generator and the discriminator used in text-data-based risk prevention and control of a target service, generating a corresponding first smart contract based on the training rule information, and deploying the first smart contract into the blockchain system; when historical text data for the target service is acquired, invoking the first smart contract to input the historical text data into a pre-constructed generator and generate countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold; based on the first smart contract, inputting data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data separately with the discriminator, and training the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data, in combination with a preset loss function, to obtain a trained generator and a trained discriminator; and providing the trained discriminator to the target service based on the first smart contract, so as to perform text-data-based risk prevention and control processing on the target service.
An embodiment of this specification provides a text-based risk prevention and control processing apparatus, comprising: a historical data acquisition module, configured to acquire historical text data for a target service; a countermeasure data generation module, configured to input the historical text data into a pre-constructed generator and generate countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold; a training module, configured to input data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, score the historical text data and the corresponding countermeasure text data separately with the discriminator, and train the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data, in combination with a preset loss function, to obtain a trained generator and a trained discriminator; and a risk prevention and control module, configured to perform text-data-based risk prevention and control processing on the target service with the trained discriminator.
An embodiment of this specification provides a text-based risk prevention and control processing apparatus, the apparatus being an apparatus in a blockchain system and comprising: a contract deployment module, configured to acquire training rule information for the generator and the discriminator used in text-data-based risk prevention and control of a target service, generate a corresponding first smart contract based on the training rule information, and deploy the first smart contract into the blockchain system; a countermeasure text generation module, configured to invoke the first smart contract when historical text data for the target service is acquired, input the historical text data into a pre-constructed generator, and generate countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold; a training module, configured to input, based on the first smart contract, data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, score the historical text data and the corresponding countermeasure text data separately with the discriminator, and train the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data, in combination with a preset loss function, to obtain a trained generator and a trained discriminator; and a discriminator application module, configured to provide the trained discriminator to the target service based on the first smart contract, so as to perform text-data-based risk prevention and control processing on the target service.
An embodiment of this specification provides a text-based risk prevention and control processing device, comprising: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to: acquire historical text data for a target service; input the historical text data into a pre-constructed generator to generate countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold; input data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, score the historical text data and the corresponding countermeasure text data separately with the discriminator, and train the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data, in combination with a preset loss function, to obtain a trained generator and a trained discriminator; and perform text-data-based risk prevention and control processing on the target service with the trained discriminator.
An embodiment of this specification provides a text-based risk prevention and control processing device, the device being a device in a blockchain system and comprising: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to: acquire training rule information for the generator and the discriminator used in text-data-based risk prevention and control of a target service, generate a corresponding first smart contract based on the training rule information, and deploy the first smart contract into the blockchain system; when historical text data for the target service is acquired, invoke the first smart contract to input the historical text data into a pre-constructed generator and generate countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold; based on the first smart contract, input data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, score the historical text data and the corresponding countermeasure text data separately with the discriminator, and train the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data, in combination with a preset loss function, to obtain a trained generator and a trained discriminator; and provide the trained discriminator to the target service based on the first smart contract, so as to perform text-data-based risk prevention and control processing on the target service.
Embodiments of this specification also provide a storage medium for storing computer-executable instructions that, when executed, implement the following process: acquiring historical text data for a target service; inputting the historical text data into a pre-constructed generator to generate countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold; inputting data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data separately with the discriminator, and training the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data, in combination with a preset loss function, to obtain a trained generator and a trained discriminator; and performing text-data-based risk prevention and control processing on the target service with the trained discriminator.
Embodiments of this specification also provide a storage medium for storing computer-executable instructions that, when executed, implement the following process: acquiring training rule information for the generator and the discriminator used in text-data-based risk prevention and control of a target service, generating a corresponding first smart contract based on the training rule information, and deploying the first smart contract into a blockchain system; when historical text data for the target service is acquired, invoking the first smart contract to input the historical text data into a pre-constructed generator and generate countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold; based on the first smart contract, inputting data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data separately with the discriminator, and training the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data, in combination with a preset loss function, to obtain a trained generator and a trained discriminator; and providing the trained discriminator to the target service based on the first smart contract, so as to perform text-data-based risk prevention and control processing on the target service.
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the present specification, and for those skilled in the art, other drawings can be obtained according to the drawings without any creative effort.
FIG. 1 is a diagram illustrating an embodiment of a text-based risk prevention and control processing method according to the present disclosure;
FIG. 2 is another embodiment of a text-based risk prevention and control process of the present disclosure;
FIG. 3A is a block diagram of another embodiment of a text-based risk prevention and control process;
FIG. 3B is a schematic diagram of a text-based risk prevention and control process according to the present disclosure;
FIG. 4 is a block diagram of an embodiment of a text-based risk prevention and control processing apparatus according to the present disclosure;
FIG. 5 is another embodiment of a text-based risk prevention and control processing device of the present disclosure;
FIG. 6 is an embodiment of a text-based risk prevention and control processing device according to the present specification.
Detailed Description
The embodiment of the specification provides a risk prevention and control processing method, device and equipment based on a text.
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step should fall within the scope of protection of the present specification.
Example one
As shown in FIG. 1, an embodiment of this specification provides a text-based risk prevention and control processing method. The method may be executed by a server, which may be the server of a particular service (for example, a transaction service or a financial service); specifically, it may be the server of a payment service, or the server of a finance-related or instant-messaging-related service, and so on. The method may specifically include the following steps:
In step S102, historical text data for the target service is acquired.
The target service may be any of a variety of services, for example an online shopping service, an insurance service, an anti-money-laundering sanctions screening service, a search service, a social service, or a news and information service. It may be set according to the actual situation, which is not limited in the embodiments of this specification. The historical text data may be text-type service data generated while the target service is executed, or text data already contained in the target service. Specific examples are text data obtained by an anti-money-laundering sanctions screening engine, text data obtained by an e-commerce platform or search engine, and text data from a social service or a news and information service (for example, text posted by users in a forum, or the text of news or information published in a news and information service). It may be set according to the actual situation, which is not limited in the embodiments of this specification.
In implementation, in business fields such as anti-money-laundering sanctions screening, theft risk, and content security, there are many text countermeasure (adversarial) scenarios. Black-market operators try to bypass interception by risk prevention and control systems built around text classification algorithms, using countermeasure forms such as keyword rewriting, special character insertion, character repetition, syllable replacement, partial deletion, synonym or near-synonym replacement, and mixed-language writing, so that they can keep conducting transactions or publishing statements. This puts great pressure on the risk prevention and control of the business.
Generally, countermeasure text data can be generated from preset rules, and a risk prevention and control model can then be trained with augmented training. However, this approach generates countermeasure samples under the guidance of manually summarized variation patterns (or expert experience). On the one hand, it consumes considerable human resources; on the other hand, it cannot produce variants that have not yet been discovered or have not yet appeared in the business. Moreover, augmented training simply mixes historical sample data with countermeasure sample data, assigns weights to them, and trains on the mixture, which makes it hard to learn the comparative relationship between a historical sample and its countermeasure sample. Alternatively, countermeasure sample data can be generated by gradient-based perturbation and combined with augmented training. This approach originates from the field of adversarial images: a risk prevention and control model is trained by adjusting parameters along the descending gradient of the loss function, so fine-tuning a sample along the ascending gradient of the loss function lets a slight perturbation strongly affect the model's output, which realizes the attack. However, the text space differs from the continuous pixel space of an image. Because text is discrete, a continuous perturbation in the representation space is hard to map to a perturbation in the text space, or the corresponding perturbation is too large, so the resulting countermeasure samples are of poor effectiveness.
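The discreteness problem described above can be illustrated with a toy sketch. It is not part of the embodiments: the three tokens, their two-dimensional embedding vectors, and the function names are all hypothetical and chosen only to show that a small continuous step in the representation space may map back to the same token, while a larger step jumps to a different token.

```python
import math

# Hypothetical 2-D embeddings for a toy vocabulary (illustrative values only).
EMBEDDINGS = {
    "pay": (1.0, 0.0),
    "pey": (0.8, 0.6),
    "tax": (-1.0, 0.2),
}


def nearest_token(vector):
    """Map a point in the continuous space back to the closest discrete token."""
    return min(EMBEDDINGS, key=lambda token: math.dist(EMBEDDINGS[token], vector))


def perturb_token(token, direction, epsilon):
    """Move the token's embedding by epsilon along direction, then decode it."""
    x, y = EMBEDDINGS[token]
    dx, dy = direction
    return nearest_token((x + epsilon * dx, y + epsilon * dy))


small_step = perturb_token("pay", (0.0, 1.0), 0.1)
large_step = perturb_token("pay", (0.0, 1.0), 0.5)
```

With these hypothetical vectors, the step of 0.1 decodes back to "pay" (the text does not change at all), while the step of 0.5 decodes to "pey" (the text changes abruptly). There is no intermediate, slightly perturbed text.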
In addition, countermeasure sample data can be generated with a general-purpose method such as a generative adversarial network (GAN) to train a risk prevention and control model. However, that approach is mostly used in the field of adversarial images: a generator produces countermeasure samples, and a discriminator is trained on the original samples and the generated samples with the goal of telling the two apart. In a real text service scenario, the original samples carry different labels, and the discriminator must distinguish the categories of the samples. The countermeasure samples needed here are not ones that must be told apart from the original samples, but ones that confuse the discriminator with small changes in order to bypass system control. The approach therefore cannot be applied directly to such services. It is therefore necessary to provide a better technical solution for text attack and defense, so as to strengthen risk control against text attacks. The embodiments of this specification provide an optional processing manner, which may specifically include the following:
When a risk prevention and control model for a certain service (the target service) needs to be trained, historical text data corresponding to the target service can be obtained (for example, text data obtained by an anti-money-laundering sanctions screening engine, text data obtained by an e-commerce platform or search engine, or text data from a social service or a news and information service). In practice, historical text data can be obtained in various ways. For example, the historical text data stored on the server of the target service may be retrieved; or a search engine may be used within a designated local area network to search for text data related to the target service, and the results used as the historical text data of the target service; or the historical text data may be obtained in some other way. This may be set according to the actual situation and is not described further here.
It should be noted that after the historical text data of the target service is acquired, preprocessing such as data cleaning and duplicate removal can be performed on the historical text data, so that the accuracy and the effectiveness of the historical text data are higher.
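The preprocessing mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the embodiments: the specification does not say which cleaning rules are used, so the choices here (Unicode NFKC normalization, replacing control characters with spaces, collapsing whitespace, and case-insensitive deduplication) and the function names `clean_text` and `preprocess` are hypothetical.

```python
import unicodedata


def clean_text(text: str) -> str:
    """Normalize Unicode, replace control characters with spaces, collapse whitespace."""
    normalized = unicodedata.normalize("NFKC", text)
    # Unicode categories starting with "C" are control/format/unassigned characters.
    visible = "".join(
        ch if unicodedata.category(ch)[0] != "C" else " " for ch in normalized
    )
    return " ".join(visible.split())


def preprocess(records: list[str]) -> list[str]:
    """Clean each record, then drop empty records and case-insensitive duplicates."""
    seen = set()
    result = []
    for raw in records:
        cleaned = clean_text(raw)
        if not cleaned:
            continue
        key = cleaned.casefold()
        if key in seen:
            continue
        seen.add(key)
        result.append(cleaned)
    return result


history = preprocess(["  Pay  now ", "pay now", "", "Ｐａｙ now", "a\tb"])
```

Whether deduplication should be case-insensitive depends on the service; for some adversarial variants, case differences may themselves be a signal worth keeping.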
In step S104, the historical text data is input into a pre-constructed generator to generate countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold.
The generator may be composed of an algorithm or a model that generates new data having a certain similarity to given data, for example, the generator may be composed of a neural network model, and the like, and may be specifically set according to actual situations. The similarity threshold may be set according to actual conditions, for example, 99% or 90%.
In implementation, in order to generate more variants of the historical text data, the generator may be built in advance according to the actual situation. For example, an algorithm (such as a neural network algorithm) may be selected in advance and the framework of the generator constructed on it; at this stage the framework contains parameters that are still to be determined. After the historical text data is obtained as described above, it may be input into the pre-constructed generator, which generates new text data for each item of historical text data; this new text data is the countermeasure text data. The countermeasure text data must remain similar to the corresponding historical text data: a similarity threshold may be preset, and the similarity between the countermeasure text data and the historical text data is kept above that threshold. In this way, one or more different items of countermeasure text data may be generated for each item of historical text data.
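The similarity constraint can be illustrated with a small sketch. It deliberately swaps in a simpler technique than the one described: instead of a neural-network generator, it applies random character-level edits (repetition, deletion, special-character insertion) and keeps only candidates whose similarity exceeds the threshold. The similarity measure (`difflib.SequenceMatcher.ratio`), the edit operations, and all function names are assumptions made for illustration.

```python
import random
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; an assumed stand-in for the unspecified measure."""
    return SequenceMatcher(None, a, b).ratio()


def perturb(text: str, rng: random.Random) -> str:
    """Apply one random character-level edit: repeat, delete, or insert a symbol."""
    if not text:
        return text
    i = rng.randrange(len(text))
    op = rng.choice(("repeat", "delete", "insert"))
    if op == "repeat":
        return text[:i] + text[i] + text[i:]
    if op == "delete":
        return text[:i] + text[i + 1:]
    return text[:i] + rng.choice("*._-") + text[i:]


def generate_candidates(text, threshold=0.9, n=5, seed=0, max_tries=200):
    """Return up to n distinct variants whose similarity to text exceeds threshold."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(max_tries):
        if len(candidates) >= n:
            break
        candidate = perturb(text, rng)
        if (candidate != text and candidate not in candidates
                and similarity(text, candidate) > threshold):
            candidates.append(candidate)
    return candidates


original = "transfer funds to account 12345"
variants = generate_candidates(original, threshold=0.9, n=5)
```

A trained generator would choose its edits to change the discriminator's score rather than at random; the filter on `similarity` is the part that corresponds to the preset similarity threshold.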
It should be noted that the goal of the generator may be that, under a certain constraint (usually a distance metric constraint or similarity constraint, that is, the generated countermeasure text data must remain highly similar to the corresponding historical text data), the discriminator's result for the generated countermeasure text data differs as much as possible from its result for the corresponding historical text data. In other words, the countermeasure text data is produced by applying a small perturbation to the corresponding historical text data so as to cause a large difference in the discrimination result.
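One possible way to write this goal down is given below. The notation is an assumption introduced here for illustration and does not appear in the specification: $G_\theta$ is the generator, $D_\phi$ the discriminator's scoring function, $x$ an item of historical text data, $\mathrm{sim}$ the similarity measure, and $\tau$ the preset similarity threshold. The absolute difference is only one plausible choice of discrepancy.

```latex
\max_{\theta}\; \mathbb{E}_{x}\,\bigl|\, D_\phi(x) - D_\phi\bigl(G_\theta(x)\bigr) \bigr|
\quad \text{subject to} \quad
\mathrm{sim}\bigl(x,\, G_\theta(x)\bigr) > \tau
```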
In step S106, data pairs constructed from the historical text data and the corresponding countermeasure text data are input into a pre-constructed discriminator, the historical text data and the corresponding countermeasure text data are scored separately by the discriminator, and the generator and the discriminator are trained by comparing the score of the historical text data with the score of the corresponding countermeasure text data, in combination with a preset loss function, so as to obtain a trained generator and a trained discriminator.
The discriminator may be formed by an algorithm or a model for identifying and judging given data; for example, the discriminator may be formed by a preset classification algorithm. It may be set according to the actual situation.
In implementation, in order to better defend against text attacks, the discriminator may be built in advance according to the actual situation. For example, an algorithm (such as a binary classification algorithm or a multi-class classification algorithm) may be selected in advance and the framework of the discriminator constructed on it; at this stage the framework contains parameters that are still to be determined. After the historical text data and the corresponding countermeasure text data are obtained as described above, each item of countermeasure text data is combined with its corresponding historical text data into a data pair. Each data pair may be input into the constructed discriminator framework, and the discriminator scores the historical text data and the countermeasure text data in the pair separately, yielding a score for each. The score of the historical text data and the score of the countermeasure text data in the pair may then be compared, and the generator and the discriminator are trained based on the comparison result in combination with a preset loss function. The same processing is then repeated with the next data pair to continue training the generator and the discriminator, until a generator and a discriminator that meet the requirements of the target service, that is, the trained generator and the trained discriminator, are obtained.
It should be noted that the required targets of the discriminator may be: on one hand, the discriminator needs to ensure that the categories of the historical text data and the countermeasure text data can be accurately identified; on the other hand, for the historical text data and the corresponding effective countermeasure text data, the discriminator needs to give a scoring conclusion as consistent as possible, so that the discriminator has good recognition capability for the countermeasure text data.
In addition, the above processing introduces a generator to generate countermeasure text data and, in a mode where the generator attacks and the discriminator defends, improves the discriminator's ability to resist text attacks during their mutual game. Comparison scoring is introduced into both the generator and the discriminator, so that the discriminator recognizes variant countermeasure text data and historical text data more consistently.
In step S108, the trained discriminator performs risk prevention and control processing based on text data on the target service.
In implementation, after the trained generator and discriminator are obtained in the above manner, high-quality countermeasure text data for the target service can be generated by the trained generator, and risk prevention and control processing based on text data can be performed on the target service by the trained discriminator. For example, in a scenario where a transaction is performed or a statement is published, text data related to the transaction or to the statement to be published can be input into the trained discriminator. If the text data contains countermeasures such as keyword rewriting, added special characters, character repetition, syllable replacement, partial deletion, synonym or near-synonym replacement, or mixed-language writing, the discriminator can still accurately recognize that the text data is risky and intercept it, so that a notification message indicating that the transaction is risky or that the statement to be published is non-compliant can be output. Corresponding risk prevention and control processing may subsequently be performed based on the notification message; for example, the transaction may be cancelled or terminated, or publication of the statement may be prevented. This may be set according to actual conditions and is not limited in the embodiments of this specification.
The embodiment of this specification provides a text-based risk prevention and control processing method. Historical text data for a target service is acquired and input into a pre-constructed generator to generate countermeasure text data corresponding to the historical text data, where the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold. Data pairs constructed from the historical text data and the corresponding countermeasure text data are input into a pre-constructed discriminator, which scores the historical text data and the corresponding countermeasure text data respectively. The generator and the discriminator are trained by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data, together with a preset loss function, to obtain a trained generator and discriminator. Finally, risk prevention and control processing based on text data is performed on the target service through the trained discriminator. In this way, a generator is introduced to generate countermeasure text data, the discriminator's ability to resist text attacks is improved during the mutual game in which the generator attacks and the discriminator defends, and comparison scoring is introduced into both the generator and the discriminator, so that the discriminator recognizes variant countermeasure text data and historical text data more consistently.
Example two
As shown in fig. 3, an execution subject of the method may be a server, where the server may be a server of a certain service (e.g., a service performing a transaction or a financial service), and specifically, the server may be a server of a payment service, or a server of a service related to financial or instant messaging, for example. The method may specifically comprise the steps of:
in step S202, historical text data for the target service is acquired.
In practical applications, risk prevention and control for a target service may be implemented by a generative adversarial network (GAN), where the GAN may be composed of a generator and a discriminator, the discriminator may be constructed from a preset classification model, and the preset classification model may be a textCNN (text Convolutional Neural Network) model, an LSTM (Long Short-Term Memory) model, a BERT (Bidirectional Encoder Representations from Transformers) model, or the like. Based on this, the training of the GAN can be completed by the following process.
In step S204, model training is performed on the classification model based on the historical text data, and an initialized discriminator is obtained.
In step S206, the historical text data is input into a pre-constructed generator, and countermeasure text data corresponding to the historical text data is generated with a first loss function as an optimization target, the first loss function being determined by a distance metric constraint loss between the historical text data and the corresponding countermeasure text data and a score contrast loss between the historical text data and the corresponding countermeasure text data.
The distance metric constraint loss between the historical text data and the corresponding countermeasure text data can be determined by the distance between them, and this distance can be determined by a similarity algorithm, which can include one or more of a Euclidean distance algorithm, a Pearson correlation coefficient algorithm, a cosine similarity algorithm, and a generalized Jaccard similarity coefficient algorithm. The score contrast loss between the historical text data and the corresponding countermeasure text data can be determined by their score values: the historical text data and the corresponding countermeasure text data can be scored respectively through a plurality of different algorithms or models to obtain corresponding scores, and the score contrast loss can be determined from those scores.
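The four similarity measures named above can be sketched on numeric feature vectors as follows. How text is turned into vectors (for example embeddings or term counts) is not specified in this section, so the vector inputs here are an assumption, and the generalized Jaccard form used is the common sum-of-minima over sum-of-maxima definition for non-negative vectors.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pearson_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either vector is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    dev_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    dev_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (dev_a * dev_b) if dev_a and dev_b else 0.0


def generalized_jaccard(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise minima over sum of maxima (non-negative vectors)."""
    denom = sum(max(x, y) for x, y in zip(a, b))
    return sum(min(x, y) for x, y in zip(a, b)) / denom if denom else 1.0
```

A similarity value can be converted to a distance in several ways (for example `1 - similarity`); which conversion the generator's loss uses is left open by the text.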
In implementation, the historical text data may be input into a pre-constructed generator, and the first loss function may be determined from the distance metric constraint loss between the historical text data and the corresponding countermeasure text data and the score contrast loss between them; the countermeasure text data corresponding to the historical text data can then be generated with the first loss function as the optimization target. The optimization goal of the generator may be twofold: the generated countermeasure text data is constrained relative to the corresponding historical text data (the distance metric constraint, i.e., the generated countermeasure text data is highly similar to the corresponding historical text data), and the difference between the discriminator's results for the generated countermeasure text data and for the corresponding historical text data is as large as possible. In other words, the generated countermeasure text data applies only a small perturbation to the corresponding historical text data yet produces a large difference in the discrimination result. The optimization objective of the generator (i.e., the first loss function) includes the minimum of the difference between the distance between the historical text data and the corresponding countermeasure text data and the weighted square of the difference between the logarithms of their scores; that is, the first loss function is determined by the following equation:
$$\min\left\{\mathrm{distance}(x_{adv}, x_{ori}) - \lambda \cdot \left[\log(p_{adv}) - \log(p_{ori})\right]^{2}\right\}$$
where $x_{adv}$ and $x_{ori}$ respectively represent the countermeasure text data and the corresponding historical text data, $\mathrm{distance}(x_{adv}, x_{ori})$ represents the distance between the historical text data and the corresponding countermeasure text data, $p_{adv}$ and $p_{ori}$ respectively represent the scores (score values) of the countermeasure text data and the corresponding historical text data, and $\lambda$ is a predetermined coefficient or first weight.
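The quantity inside the minimization above can be evaluated for a single data pair as in the following sketch. The function name `generator_objective` and the clipping constant `eps` (added only to avoid taking the logarithm of zero) are assumptions for illustration.

```python
import math


def generator_objective(distance: float, p_adv: float, p_ori: float,
                        lam: float = 1.0, eps: float = 1e-12) -> float:
    """distance(x_adv, x_ori) - lam * [log(p_adv) - log(p_ori)]^2 for one pair."""
    log_gap = math.log(max(p_adv, eps)) - math.log(max(p_ori, eps))
    return distance - lam * log_gap ** 2


# Same small distance; the pair with the larger score gap has the lower (better) value.
print(generator_objective(0.2, p_adv=0.5, p_ori=0.5))
print(generator_objective(0.2, p_adv=0.1, p_ori=0.9))
```

Minimizing this value rewards countermeasure text that stays close to the original while moving the discriminator's score as far as possible.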
In practical applications, a general discriminator is only suitable for discriminating complete generated text data. The optimization target of the generator is a final optimization target and cannot be evaluated in real time at each step of generating the countermeasure text data. In this case, a reinforcement learning model may be used, and the specific processing of step S206 may include the following: the historical text data is input into a pre-constructed generator, and countermeasure text data corresponding to the historical text data is generated by optimizing with Policy Gradient, taking the first loss function as the long-term benefit.
In practice, the first loss function is optimized as the long-term benefit by means of Policy Gradient. The immediate benefit of each step in generating the countermeasure text data can be estimated by Monte Carlo Search; that is, the benefit of the currently generated word/segment is obtained by weighting the long-term benefits of N completed sentences randomly sampled after the current step (N is a natural number greater than or equal to 1).
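The Monte Carlo estimate of a per-step benefit can be sketched as below. Several simplifications are assumptions of this sketch: completions are sampled uniformly from a vocabulary rather than from the generator's own policy, the N rollouts are weighted equally, and `long_term_reward` is a hypothetical callable that scores a completed token sequence (for instance, the negated first-loss value).

```python
import random
from typing import Callable, List, Optional, Sequence


def rollout_reward(prefix: Sequence[str], vocab: Sequence[str], length: int,
                   long_term_reward: Callable[[List[str]], float],
                   n_rollouts: int = 16,
                   rng: Optional[random.Random] = None) -> float:
    """Estimate the benefit of the current partial sequence.

    The prefix is completed n_rollouts times by random sampling, each completed
    sequence is scored with long_term_reward, and the scores are averaged.
    """
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_rollouts):
        completion = list(prefix)
        while len(completion) < length:
            completion.append(rng.choice(vocab))
        total += long_term_reward(completion)
    return total / n_rollouts
```

In a policy-gradient update, an estimate of this kind would scale the log-probability gradient of the token chosen at the current step.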
Based on the initialized discriminator, the processing of step S106 in the first embodiment may include: inputting the historical text data and the corresponding countermeasure text data as data pairs into the initialized discriminator to obtain a trained generator and discriminator.
For the above specific processing procedure, reference may be made to relevant contents in the first embodiment, which are not described herein again.
In step S208, the historical text data and the corresponding countermeasure text data are input as a data pair into a pre-constructed discriminator, the historical text data and the corresponding countermeasure text data are scored by the discriminator, respectively, and the generator and the discriminator are trained by comparing the score values of the historical text data and the corresponding countermeasure text data and a second loss function as an optimization target, resulting in a trained generator and discriminator, the second loss function being determined by the classification loss of the historical text data and the corresponding countermeasure text data and the score comparison loss between the historical text data and the corresponding countermeasure text data.
In implementation, the optimization target of the discriminator consists of two parts. First, the discriminator needs to accurately identify the categories to which the historical text data and the countermeasure text data belong. Second, for the historical text data and the corresponding countermeasure text data, the discriminator needs to give scoring conclusions that are as consistent as possible, so that it has good recognition capability for the countermeasure text data. The optimization target (i.e., the second loss function) of the discriminator includes a classification loss of the historical text data and the corresponding countermeasure text data, which may be determined based on the score value corresponding to the historical text data and the category to which the historical text data belongs, and a score contrast loss between the historical text data and the corresponding countermeasure text data, which may be determined based on the square of the difference between the logarithms of their scores and a preset second weight. In practical applications, the second loss function may be determined by the following equation:
[Equation image BDA0003326962950000091: second loss function]
where $p_i$ is the score value of the i-th historical text data, $y_i$ is the category to which the i-th historical text data belongs, and $\alpha$ is the second weight.
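Because the equation itself survives only as an image reference, the following is a sketch of one loss that is consistent with the verbal description, not a reproduction of the original formula. It assumes a binary category label, binary cross-entropy as the classification loss applied to both texts of a pair, and the squared log-score difference weighted by `alpha` as the score contrast loss; the name `discriminator_loss` and the clipping constant `eps` are illustrative.

```python
import math


def discriminator_loss(p_ori: float, p_adv: float, y: int,
                       alpha: float = 1.0, eps: float = 1e-12) -> float:
    """Classification loss plus alpha-weighted score contrast loss for one pair."""

    def clip(p: float) -> float:
        # Keep scores strictly inside (0, 1) so the logarithms are defined.
        return min(max(p, eps), 1.0 - eps)

    def cross_entropy(p: float) -> float:
        # Assumed form of the classification loss: binary cross-entropy.
        return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

    p_ori, p_adv = clip(p_ori), clip(p_adv)
    classification = cross_entropy(p_ori) + cross_entropy(p_adv)
    contrast = alpha * (math.log(p_adv) - math.log(p_ori)) ** 2
    return classification + contrast
```

With equal scores for both texts the contrast term vanishes, so only the classification term remains; any score gap within a pair is penalized in proportion to `alpha`.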
The generator and the discriminator can be trained by the above processing until the final effect of the discriminator meets the business requirement. After the trained generator and discriminator are obtained, the risk prevention and control process may be performed on the target service, for which reference may be made to the following processes of step S210 and step S212.
In step S210, target text data to be detected of the target service is acquired.
In step S212, the target text data is input to the trained discriminator to obtain a risk identification result corresponding to the target text data.
The embodiment of this specification provides a text-based risk prevention and control processing method. Historical text data for a target service is acquired and input into a pre-constructed generator to generate countermeasure text data corresponding to the historical text data, where the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold. Data pairs constructed from the historical text data and the corresponding countermeasure text data are input into a pre-constructed discriminator, which scores the historical text data and the corresponding countermeasure text data respectively. The generator and the discriminator are trained by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data, together with a preset loss function, to obtain a trained generator and discriminator. Finally, risk prevention and control processing based on text data is performed on the target service through the trained discriminator. In this way, a generator is introduced to generate countermeasure text data, the discriminator's ability to resist text attacks is improved during the mutual game in which the generator attacks and the discriminator defends, and comparison scoring is introduced into both the generator and the discriminator, so that the discriminator recognizes variant countermeasure text data and historical text data more consistently.
Example three
As shown in fig. 3A and fig. 3B, an execution subject of the method may be a blockchain system, and the blockchain system may be composed of a terminal device and/or a server, where the server may be a server of a certain service (such as a transaction service or a financial service), and specifically, the server may be a server of a payment service, or a server of a service related to financial or instant messaging, and the like. The method may specifically comprise the steps of:
in step S302, training rule information of a generator and a discriminator in risk prevention and control based on text data for a target service is obtained, a corresponding first intelligent contract is generated based on the training rule information of the generator and the discriminator, and the first intelligent contract is deployed in a block chain system.
The first intelligent contract (i.e., a smart contract) may be a computer protocol intended to propagate, verify, or execute a contract in an informational manner. It allows trusted interaction without a third party, and the process of such interaction is traceable and irreversible. The first intelligent contract includes the agreements under which the contract participants exercise the agreed rights and fulfil the agreed obligations.
In implementation, in order to improve the traceability of risk prevention and control based on text data for a target service, a specified blockchain system may be created or joined, so that text data in the target service may be detected based on the blockchain system. Specifically, a corresponding application program may be installed in a blockchain node, an input box and/or a selection box for the training rule information of the generator and the discriminator may be provided in the application program, and the corresponding information may be set in the input box and/or the selection box. The blockchain system may then receive the training rule information of the generator and the discriminator. The blockchain system can generate a corresponding first intelligent contract based on the training rule information of the generator and the discriminator and deploy the first intelligent contract into the blockchain system. In this way, the training rule information of the generator and the discriminator and the corresponding first intelligent contract are stored in the blockchain system, other users cannot tamper with them, and the blockchain system trains the generator and the discriminator through the first intelligent contract.
In step S304, when the historical text data for the target service is acquired, a first intelligent contract is invoked, the historical text data is input into a pre-constructed generator, countermeasure text data corresponding to the historical text data is generated, and a similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold.
In implementation, the first intelligent contract may be provided with a rule for generating countermeasure text data corresponding to the historical text data through the generator, and the corresponding processing may be implemented based on the rule information in the first intelligent contract, which may be specifically referred to above, and is not described herein again.
In addition, the discriminator is constructed by a preset classification model, and at this time, the discriminator may be initialized in the following manner, which may specifically include the following: and performing model training on the classification model through historical text data based on a second intelligent contract pre-deployed in the block chain system to obtain an initialized discriminator.
The second intelligent contract can be used for initializing the discriminator. The preset classification model may be a textCNN model, an LSTM model, a BERT model, or the like. For the above specific processing procedure, reference may be made to the above related contents, which are not described herein again.
Further, the processing of step S304 described above may also be implemented by: calling a first intelligent contract, inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data by taking a first loss function as an optimization target, wherein the first loss function is determined by distance measurement constraint loss between the historical text data and the corresponding countermeasure text data and score contrast loss between the historical text data and the corresponding countermeasure text data.
Wherein the first loss function may include a minimum of a difference between a distance between the historical text data and the corresponding countermeasure text data and a square of a difference between logarithms of scores of the historical text data and the corresponding countermeasure text data.
In addition, the processing of calling the first intelligent contract, inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data with the first loss function as an optimization target may include: calling a first intelligent contract, inputting historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data in a mode of optimizing through Policy Gradient with a first loss function as a long-term benefit.
In step S306, based on the first intelligent contract, inputting the data pairs constructed by the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively by the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and a preset loss function, so as to obtain a trained generator and a trained discriminator.
In implementation, the first intelligent contract may further be provided with relevant rule information for training the discriminator, so that the corresponding processing may be implemented based on the rule information in the first intelligent contract. For details, reference may be made to the above related contents, which are not described herein again.
Based on the content of the initialization process performed on the discriminator, the process of step S306 may further include: based on the first intelligent contract, inputting the historical text data and the corresponding countermeasure text data as data pairs into an initialized discriminator to obtain a trained generator and the discriminator.
Further, the processing of step S306 described above may also be implemented by: based on a first intelligent contract, inputting historical text data and corresponding countermeasure text data serving as data pairs into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data through the discriminator respectively, and training the generator and the discriminator by comparing the score values of the historical text data and the corresponding countermeasure text data and taking a second loss function as an optimization target to obtain a trained generator and the discriminator, wherein the second loss function is determined by classification loss of the historical text data and the corresponding countermeasure text data and score comparison loss between the historical text data and the corresponding countermeasure text data.
The classification loss of the historical text data and the corresponding countermeasure text data is determined based on the corresponding score value of the historical text data and the category to which the historical text data belongs, and the score contrast loss between the historical text data and the corresponding countermeasure text data is determined based on the square of the difference between the logarithms of the scores of the historical text data and the corresponding countermeasure text data and a preset second weight.
In step S308, based on the first intelligent contract, the trained discriminator is provided to the target service, so as to perform risk prevention and control processing based on the text data on the target service.
In implementation, the first intelligent contract may be provided with a display policy that determines the privacy information according to the risk level information and the privacy protection level information, and display related rule information of the privacy information based on the determined display policy, so that the corresponding processing may be implemented based on the rule information in the first intelligent contract, which may be referred to in detail for the above related contents, and is not described herein again.
The specific processing in step S304 to step S308 may refer to the relevant contents in the first embodiment and the second embodiment, that is, various processing related to the first embodiment and the second embodiment may be implemented by the first smart contract.
The embodiment of this specification provides a text-based risk prevention and control processing method applied to a blockchain system. Training rule information of a generator and a discriminator used in text-data-based risk prevention and control for a target service is acquired, a corresponding first intelligent contract is generated based on the training rule information, and the first intelligent contract is deployed into the blockchain system. When historical text data for the target service is acquired, the first intelligent contract is called, and the historical text data is input into the pre-constructed generator to generate countermeasure text data corresponding to the historical text data, where the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold. Based on the first intelligent contract, data pairs constructed from the historical text data and the corresponding countermeasure text data are input into the pre-constructed discriminator, which scores the historical text data and the corresponding countermeasure text data respectively. The generator and the discriminator are trained by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data, together with a preset loss function, to obtain the trained generator and discriminator. Finally, based on the first intelligent contract, the trained discriminator is provided to the target service to perform risk prevention and control processing based on text data on the target service. In this way, a generator is introduced to generate countermeasure text data, the discriminator's ability to resist text attacks is improved during the mutual game in which the generator attacks and the discriminator defends, and comparison scoring is introduced into both the generator and the discriminator, so that the discriminator recognizes variant countermeasure text data and historical text data more consistently. The scheme can be applied to business fields such as sanction scanning that face text countermeasures, and has a remarkable effect on the prevention and control of text countermeasure and bypass risks.
Example four
Based on the same idea as the text-based risk prevention and control processing method provided above, the embodiment of this specification further provides a text-based risk prevention and control processing device, as shown in fig. 4.
The text-based risk prevention and control processing device comprises: a historical data acquisition module 401, a confrontation data generation module 402, a training module 403 and a risk prevention and control module 404, wherein:
a historical data acquisition module 401, which acquires historical text data for a target service;
a confrontation data generation module 402, which inputs the historical text data into a pre-constructed generator and generates countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold;
a training module 403, configured to input the data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, score the historical text data and the corresponding countermeasure text data respectively through the discriminator, and train the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data, together with a preset loss function, to obtain the trained generator and discriminator;
and a risk prevention and control module 404, which performs risk prevention and control processing based on text data on the target service through the trained discriminator.
In an embodiment of this specification, the discriminator is constructed by a preset classification model, and the apparatus further includes:
the initialization module is used for carrying out model training on the classification model based on the historical text data to obtain an initialized discriminator;
the training module 403 inputs the historical text data and the corresponding confrontation text data as a data pair into the initialized discriminator to obtain the trained generator and the discriminator.
In an embodiment of the present specification, the preset classification model is a textCNN model, an LSTM model, or a BERT model.
In this embodiment of the present specification, the confrontation data generation module 402 inputs the historical text data into a pre-constructed generator, and generates countermeasure text data corresponding to the historical text data with a first loss function as an optimization target, where the first loss function is determined by a distance metric constraint loss between the historical text data and the corresponding countermeasure text data, and a score contrast loss between the historical text data and the corresponding countermeasure text data.
In an embodiment of the present specification, the first loss function includes a minimum value of a difference between a distance between the history text data and the corresponding countermeasure text data and a square of a difference between logarithms of scores of the history text data and the corresponding countermeasure text data.
In this embodiment of the present specification, the confrontation data generation module 402 inputs the historical text data into a pre-constructed generator, and generates countermeasure text data corresponding to the historical text data by optimizing with Policy Gradient, taking the first loss function as the long-term benefit.
In this embodiment of the present specification, the training module 403 inputs the historical text data and the corresponding countermeasure text data as a data pair into a pre-constructed discriminator, scores the historical text data and the corresponding countermeasure text data respectively through the discriminator, and trains the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and using a second loss function as an optimization target, so as to obtain the generator and the discriminator after training, where the second loss function is determined by the classification loss of the historical text data and the corresponding countermeasure text data and the score comparison loss between the historical text data and the corresponding countermeasure text data.
In this specification embodiment, the classification loss of the history text data and the corresponding countermeasure text data is determined based on the score value corresponding to the history text data and the category to which the history text data belongs, and the score contrast loss between the history text data and the corresponding countermeasure text data is determined based on the square of the difference between the logarithms of the scores of the history text data and the corresponding countermeasure text data and a preset second weight.
In this embodiment, the risk prevention and control module 404 includes:
the target data acquisition unit is used for acquiring target text data to be detected of the target service;
and the risk prevention and control unit is used for inputting the target text data into the trained discriminator to obtain a risk identification result corresponding to the target text data.
The embodiments of this specification provide a text-based risk prevention and control processing device. Historical text data for a target service is acquired and input into a pre-constructed generator to generate corresponding countermeasure text data, whose similarity to the historical text data is higher than a preset similarity threshold. Data pairs constructed from the historical text data and the corresponding countermeasure text data are input into a pre-constructed discriminator, which scores the historical text data and the corresponding countermeasure text data respectively. The generator and the discriminator are trained by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data under a preset loss function, yielding the trained generator and discriminator. Finally, risk prevention and control processing based on text data is performed on the target service through the trained discriminator. Introducing the generator to produce countermeasure text data sets up a game in which the generator attacks and the discriminator defends, which improves the discriminator's ability to resist text attacks. Introducing comparative scoring into the generator and the discriminator makes the discriminator's recognition of mutated countermeasure text data more consistent with its recognition of the corresponding historical text data.
EXAMPLE five
Based on the same idea, the embodiments of the present specification further provide a text-based risk prevention and control processing device, which is a device in a blockchain system, as shown in fig. 5.
The text-based risk prevention and control processing device comprises: a contract deployment module 501, a confrontation text generation module 502, a training module 503, and a discriminator application module 504, wherein:
the contract deployment module 501 is configured to acquire training rule information of a generator and a discriminator in risk control based on text data for a target service, generate a corresponding first intelligent contract based on the training rule information of the generator and the discriminator, and deploy the first intelligent contract to the block chain system;
the countermeasure text generation module 502 is used for calling the first intelligent contract when historical text data for a target service is acquired, inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold;
a training module 503, configured to input the data pairs constructed by the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator based on the first intelligent contract, score the historical text data and the corresponding countermeasure text data through the discriminator, and train the generator and the discriminator by comparing the score value of the historical text data and the score value of the corresponding countermeasure text data with a preset loss function to obtain the generator and the discriminator after training;
and the discriminator application module 504 is used for providing the trained discriminator for the target service based on the first intelligent contract so as to perform risk prevention and control processing based on text data on the target service.
In an embodiment of this specification, the discriminator is constructed by a preset classification model, and the apparatus further includes:
the initialization module is used for carrying out model training on the classification model through the historical text data based on a second intelligent contract which is pre-deployed in the block chain system to obtain an initialized discriminator;
the training module 503, based on the first intelligent contract, inputs the historical text data and the corresponding confrontation text data as a data pair into the initialized discriminator to obtain the trained generator and the discriminator.
The embodiments of this specification provide a text-based risk prevention and control processing device. Training rule information of a generator and a discriminator used in text-based risk prevention and control for a target service is acquired, a corresponding first intelligent contract is generated based on that training rule information, and the first intelligent contract is deployed in the blockchain system. When historical text data for the target service is acquired, the first intelligent contract is called, and the historical text data is input into the pre-constructed generator to generate corresponding countermeasure text data, whose similarity to the historical text data is higher than a preset similarity threshold. Based on the first intelligent contract, data pairs constructed from the historical text data and the corresponding countermeasure text data are input into the pre-constructed discriminator, which scores the historical text data and the corresponding countermeasure text data respectively, and the generator and the discriminator are trained by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data under a preset loss function, yielding the trained generator and discriminator. Finally, the trained discriminator is provided to the target service based on the first intelligent contract so as to perform risk prevention and control processing based on text data on the target service. Introducing the generator to produce countermeasure text data sets up a game in which the generator attacks and the discriminator defends, which improves the discriminator's ability to resist text attacks. Introducing comparative scoring into the generator and the discriminator makes the discriminator's recognition of mutated countermeasure text data more consistent with its recognition of the corresponding historical text data. The scheme can be applied to business fields such as text confrontation in sanctions scanning, where it has a marked effect in preventing and controlling the risk of text-based bypass attacks.
EXAMPLE six
Based on the same idea as the text-based risk prevention and control processing apparatus provided above, the embodiments of the present specification further provide a text-based risk prevention and control processing device, as shown in fig. 6.
The text-based risk prevention and control processing device may be a server provided in the above embodiment, or a device in a blockchain system.
The text-based risk prevention and control processing device may vary considerably depending on its configuration or performance, and may include one or more processors 601 and a memory 602, in which one or more applications or data may be stored. The memory 602 may be transient or persistent storage. An application program stored in the memory 602 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the text-based risk prevention and control processing device. Further, the processor 601 may be configured to communicate with the memory 602 and to execute, on the text-based risk prevention and control processing device, the series of computer-executable instructions in the memory 602. The text-based risk prevention and control processing device may also include one or more power supplies 603, one or more wired or wireless network interfaces 604, one or more input-output interfaces 605, and one or more keyboards 606.
In particular, in this embodiment, the text-based risk prevention and control processing device includes a memory and one or more programs, wherein the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer-executable instructions for the text-based risk prevention and control processing device, and the one or more programs, configured to be executed by the one or more processors, include computer-executable instructions for:
acquiring historical text data aiming at a target service;
inputting the historical text data into a pre-constructed generator, and generating confrontation text data corresponding to the historical text data, wherein the similarity between the confrontation text data and the historical text data is higher than a preset similarity threshold;
inputting the historical text data and the corresponding data pairs constructed by the countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator through comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and a preset loss function to obtain the trained generator and the trained discriminator;
and carrying out risk prevention and control processing based on text data on the target service through the trained discriminator.
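The similarity constraint in the second step above (countermeasure text must stay above a preset similarity threshold relative to the historical text) can be illustrated with a short sketch. The choice of `difflib.SequenceMatcher` as the similarity measure and the threshold value 0.8 are assumptions for the example; the specification does not fix a particular measure.

```python
from difflib import SequenceMatcher
from typing import List

def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1] (an assumed similarity measure)."""
    return SequenceMatcher(None, a, b).ratio()

def keep_similar(historical: str, candidates: List[str],
                 threshold: float = 0.8) -> List[str]:
    """Keep only candidate texts whose similarity to the historical text exceeds the threshold."""
    return [c for c in candidates if similarity(historical, c) > threshold]
```

A visually close variant such as "transfer to acrne" survives the filter for the historical text "transfer to acme", while an unrelated string does not.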
In an embodiment of this specification, the discriminator is constructed by a preset classification model, and further includes:
performing model training on the classification model based on the historical text data to obtain an initialized discriminator;
the inputting of the data pairs constructed from the historical text data and the corresponding confrontation text data into a pre-constructed discriminator to obtain the trained generator and the trained discriminator includes:
and inputting the historical text data and the corresponding confrontation text data as data pairs into the initialized discriminator to obtain the trained generator and the discriminator.
In an embodiment of the present specification, the preset classification model is a textCNN model, an LSTM model, or a BERT model.
In this embodiment of the present specification, the inputting the historical text data into a pre-constructed generator to generate confrontation text data corresponding to the historical text data includes:
inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data by taking a first loss function as an optimization target, wherein the first loss function is determined by a distance-metric constraint loss between the historical text data and the corresponding countermeasure text data and by the score contrast loss between the historical text data and the corresponding countermeasure text data.
In an embodiment of the present specification, the first loss function takes the minimum value of the difference between the distance between the historical text data and the corresponding countermeasure text data and the square of the difference between the logarithms of the scores of the historical text data and the corresponding countermeasure text data.
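A minimal sketch of the quantity minimized by the first loss function is given below. The normalized edit distance used as the distance metric, the optional `weight` parameter, and the function names are assumptions for illustration; the embodiment only states that a distance and a squared log-score difference are combined.

```python
import math

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row (assumed choice of distance metric)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def first_loss(hist: str, adv: str, score_hist: float, score_adv: float,
               weight: float = 1.0, eps: float = 1e-12) -> float:
    """Distance term minus the (hypothetically weighted) squared log-score difference."""
    distance = edit_distance(hist, adv) / max(len(hist), len(adv), 1)
    contrast = (math.log(max(score_hist, eps)) - math.log(max(score_adv, eps))) ** 2
    return distance - weight * contrast
```

Under this form the generator is rewarded for countermeasure text that stays close to the historical text while driving the discriminator's two scores apart.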
In this embodiment of the present specification, the inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data with a first loss function as an optimization target includes:
and inputting the historical text data into a pre-constructed generator, and generating the countermeasure text data corresponding to the historical text data through Policy Gradient optimization, with the first loss function taken as the long-term return.
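The Policy Gradient optimization mentioned above can be illustrated with a toy REINFORCE loop. This sketch is deliberately simplified and is not the generator of the embodiment: it assumes a single decision among a few candidate rewrites, a fixed reward per candidate (for example, the negative of the first loss of the resulting text), and hypothetical learning-rate and baseline settings.

```python
import math
import random
from typing import List

def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce(rewards: List[float], steps: int = 2000,
              lr: float = 0.1, seed: int = 0) -> List[float]:
    """Learn a softmax policy over candidate rewrites; rewards[a] is the return of action a."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    baseline = 0.0  # running average of returns, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        advantage = rewards[action] - baseline
        baseline += 0.05 * (rewards[action] - baseline)
        # Gradient of log pi(action) w.r.t. the logits is one_hot(action) - probs.
        for k in range(len(logits)):
            grad = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * advantage * grad
    return softmax(logits)
```

After training, the policy concentrates its probability on the candidate with the highest return, which is the behaviour the generator relies on because sampling discrete text is not directly differentiable.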
In this embodiment of the present specification, the inputting of the historical text data and the corresponding countermeasure text data as data pairs into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data by the discriminator, and training the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data and a preset loss function to obtain the generator and the discriminator after training includes:
inputting the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator as a data pair, scoring the historical text data and the corresponding countermeasure text data through the discriminator, respectively, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and taking a second loss function as an optimization target to obtain the trained generator and the discriminator, wherein the second loss function is determined by the classification loss of the historical text data and the corresponding countermeasure text data and the score comparison loss between the historical text data and the corresponding countermeasure text data.
In this specification embodiment, the classification loss of the history text data and the corresponding countermeasure text data is determined based on the score value corresponding to the history text data and the category to which the history text data belongs, and the score contrast loss between the history text data and the corresponding countermeasure text data is determined based on the square of the difference between the logarithms of the scores of the history text data and the corresponding countermeasure text data and a preset second weight.
In an embodiment of this specification, the performing, by the trained discriminator, risk prevention and control processing based on text data on the target service includes:
acquiring target text data to be detected of the target service;
and inputting the target text data into the trained discriminator to obtain a risk identification result corresponding to the target text data.
In particular, in this embodiment, the text-based risk prevention and control processing device includes a memory and one or more programs, wherein the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer-executable instructions for the text-based risk prevention and control processing device, and the one or more programs, configured to be executed by the one or more processors, include computer-executable instructions for:
acquiring training rule information of a generator and a discriminator in risk prevention and control based on text data of a target service, generating a corresponding first intelligent contract based on the training rule information of the generator and the discriminator, and deploying the first intelligent contract into the block chain system;
when historical text data aiming at a target service are acquired, calling the first intelligent contract, inputting the historical text data into a pre-constructed generator, and generating confrontation text data corresponding to the historical text data, wherein the similarity between the confrontation text data and the historical text data is higher than a preset similarity threshold;
inputting the historical text data and the corresponding data pairs constructed by the countermeasure text data into a pre-constructed discriminator based on the first intelligent contract, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator through comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and a preset loss function to obtain the generator and the discriminator after training;
and providing the trained discriminator for the target business based on the first intelligent contract so as to perform risk prevention and control processing based on text data on the target business.
In an embodiment of this specification, the discriminator is constructed by a preset classification model, and further includes:
model training is carried out on the classification model through the historical text data based on a second intelligent contract which is pre-deployed in the block chain system, and an initialized discriminator is obtained;
the inputting, based on the first intelligent contract, of the data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator to obtain the trained generator and the trained discriminator includes:
and inputting the historical text data and the corresponding countermeasure text data into the initialized discriminator as a data pair based on the first intelligent contract to obtain the trained generator and the discriminator.
The embodiments of this specification provide a text-based risk prevention and control processing device. Historical text data for a target service is acquired and input into a pre-constructed generator to generate corresponding countermeasure text data, whose similarity to the historical text data is higher than a preset similarity threshold. Data pairs constructed from the historical text data and the corresponding countermeasure text data are input into a pre-constructed discriminator, which scores the historical text data and the corresponding countermeasure text data respectively. The generator and the discriminator are trained by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data under a preset loss function, yielding the trained generator and discriminator. Finally, risk prevention and control processing based on text data is performed on the target service through the trained discriminator. Introducing the generator to produce countermeasure text data sets up a game in which the generator attacks and the discriminator defends, which improves the discriminator's ability to resist text attacks. Introducing comparative scoring into the generator and the discriminator makes the discriminator's recognition of mutated countermeasure text data more consistent with its recognition of the corresponding historical text data.
EXAMPLE seven
Further, based on the methods shown in fig. 1 to fig. 3B, one or more embodiments of the present specification further provide a storage medium for storing computer-executable instruction information. In a specific embodiment, the storage medium may be a USB flash drive, an optical disk, a hard disk, or the like, and the computer-executable instruction information stored in the storage medium, when executed by a processor, implements the following process:
acquiring historical text data aiming at a target service;
inputting the historical text data into a pre-constructed generator, and generating confrontation text data corresponding to the historical text data, wherein the similarity between the confrontation text data and the historical text data is higher than a preset similarity threshold;
inputting the historical text data and the corresponding data pairs constructed by the countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator through comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and a preset loss function to obtain the trained generator and the trained discriminator;
and carrying out risk prevention and control processing based on text data on the target service through the trained discriminator.
In an embodiment of this specification, the discriminator is constructed by a preset classification model, and further includes:
performing model training on the classification model based on the historical text data to obtain an initialized discriminator;
the inputting of the data pairs constructed from the historical text data and the corresponding confrontation text data into a pre-constructed discriminator to obtain the trained generator and the trained discriminator includes:
and inputting the historical text data and the corresponding confrontation text data as data pairs into the initialized discriminator to obtain the trained generator and the discriminator.
In an embodiment of the present specification, the preset classification model is a textCNN model, an LSTM model, or a BERT model.
In this embodiment of the present specification, the inputting the historical text data into a pre-constructed generator to generate confrontation text data corresponding to the historical text data includes:
inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data by taking a first loss function as an optimization target, wherein the first loss function is determined by a distance-metric constraint loss between the historical text data and the corresponding countermeasure text data and by the score contrast loss between the historical text data and the corresponding countermeasure text data.
In an embodiment of the present specification, the first loss function takes the minimum value of the difference between the distance between the historical text data and the corresponding countermeasure text data and the square of the difference between the logarithms of the scores of the historical text data and the corresponding countermeasure text data.
In this embodiment of the present specification, the inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data with a first loss function as an optimization target includes:
and inputting the historical text data into a pre-constructed generator, and generating the countermeasure text data corresponding to the historical text data through Policy Gradient optimization, with the first loss function taken as the long-term return.
In this embodiment of the present specification, the inputting of the historical text data and the corresponding countermeasure text data as data pairs into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data by the discriminator, and training the generator and the discriminator by comparing the score of the historical text data with the score of the corresponding countermeasure text data and a preset loss function to obtain the generator and the discriminator after training includes:
inputting the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator as a data pair, scoring the historical text data and the corresponding countermeasure text data through the discriminator, respectively, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and taking a second loss function as an optimization target to obtain the trained generator and the discriminator, wherein the second loss function is determined by the classification loss of the historical text data and the corresponding countermeasure text data and the score comparison loss between the historical text data and the corresponding countermeasure text data.
In this specification embodiment, the classification loss of the history text data and the corresponding countermeasure text data is determined based on the score value corresponding to the history text data and the category to which the history text data belongs, and the score contrast loss between the history text data and the corresponding countermeasure text data is determined based on the square of the difference between the logarithms of the scores of the history text data and the corresponding countermeasure text data and a preset second weight.
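For reference, the two loss functions described in words above may be written compactly as follows. The notation is an assumption introduced here and does not appear in the specification: x denotes the historical text data, x' the corresponding countermeasure text data produced by the generator, D(·) the score given by the discriminator, y the category of x, d(·,·) the distance metric, L_cls the classification loss, and λ₂ the preset second weight.

```latex
\begin{align*}
L_1 &= d(x, x') - \bigl(\log D(x) - \log D(x')\bigr)^2, \\
L_2 &= L_{\mathrm{cls}}\bigl(D(x), y\bigr) + \lambda_2 \bigl(\log D(x) - \log D(x')\bigr)^2 .
\end{align*}
```

Read this way, the generator minimizes L_1 and is therefore pushed to keep x' close to x while widening the gap between the two scores, whereas the discriminator minimizes L_2 and is pushed to classify x correctly while closing that gap.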
In an embodiment of this specification, the performing, by the trained discriminator, risk prevention and control processing based on text data on the target service includes:
acquiring target text data to be detected of the target service;
and inputting the target text data into the trained discriminator to obtain a risk identification result corresponding to the target text data.
In addition, in another specific embodiment, the storage medium may be a USB flash drive, an optical disk, a hard disk, or the like, and the computer-executable instruction information stored in the storage medium, when executed by a processor, implements the following process:
acquiring training rule information of a generator and a discriminator in risk prevention and control based on text data of a target service, generating a corresponding first intelligent contract based on the training rule information of the generator and the discriminator, and deploying the first intelligent contract into the block chain system;
when historical text data aiming at a target service are acquired, calling the first intelligent contract, inputting the historical text data into a pre-constructed generator, and generating confrontation text data corresponding to the historical text data, wherein the similarity between the confrontation text data and the historical text data is higher than a preset similarity threshold;
inputting the historical text data and the corresponding data pairs constructed by the countermeasure text data into a pre-constructed discriminator based on the first intelligent contract, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator through comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and a preset loss function to obtain the generator and the discriminator after training;
and providing the trained discriminator for the target business based on the first intelligent contract so as to perform risk prevention and control processing based on text data on the target business.
In an embodiment of this specification, the discriminator is constructed by a preset classification model, and the method further includes:
model training is carried out on the classification model through the historical text data based on a second intelligent contract which is pre-deployed in the block chain system, and an initialized discriminator is obtained;
the inputting, based on the first intelligent contract, of the data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator to obtain the trained generator and the trained discriminator includes:
and inputting the historical text data and the corresponding countermeasure text data into the initialized discriminator as a data pair based on the first intelligent contract to obtain the trained generator and the discriminator.
The embodiments of this specification provide a storage medium. Historical text data for a target service is acquired and input into a pre-constructed generator to generate corresponding confrontation text data, whose similarity to the historical text data is higher than a preset similarity threshold. Data pairs constructed from the historical text data and the corresponding confrontation text data are input into a pre-constructed discriminator, which scores the historical text data and the corresponding confrontation text data respectively. The generator and the discriminator are trained by comparing the score value of the historical text data with the score value of the corresponding confrontation text data under a preset loss function, yielding the trained generator and discriminator. Finally, risk prevention and control processing based on text data is performed on the target service through the trained discriminator. Introducing the generator to produce confrontation text data sets up a game in which the generator attacks and the discriminator defends, which improves the discriminator's ability to resist text attacks. Introducing comparative scoring into the generator and the discriminator makes the discriminator's recognition of mutated confrontation text data more consistent with its recognition of the corresponding historical text data.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). However, as technology has advanced, many of today's method-flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be realized by a physical hardware module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer programs a digital system onto a single PLD without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, this programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained simply by lightly logic-programming the method flow in one of the hardware description languages described above and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be implemented by logically programming the method steps such that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be considered a hardware component, and the means included therein for performing the various functions may also be considered structures within the hardware component. The means for performing the functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and the units are described separately. Of course, when implementing one or more embodiments of the present description, the functions of the various units may be implemented in one or more pieces of software and/or hardware.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present description are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
One or more embodiments of the present description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner. The same or similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant points, reference may be made to the corresponding parts of the description of the method embodiment.
The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.

Claims (17)

1. A text-based risk prevention and control processing method, the method comprising:
acquiring historical text data for a target service;
inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold;
inputting data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and by using a preset loss function, to obtain the trained generator and the trained discriminator;
and performing text-data-based risk prevention and control processing on the target service through the trained discriminator.
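The generation and pairing steps of claim 1 can be illustrated with a minimal sketch. It assumes a character-level similarity from Python's `difflib` as a stand-in for whatever similarity measure the method actually uses; `similarity`, `build_pairs`, `toy_generator`, and the threshold value 0.8 are hypothetical names and values chosen for illustration and are not taken from the specification.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (a stand-in metric, assumed here)."""
    return SequenceMatcher(None, a, b).ratio()


def build_pairs(historical, generate, threshold=0.8):
    """Pair each historical text with its generated variant,
    keeping only variants whose similarity exceeds the threshold."""
    pairs = []
    for text in historical:
        candidate = generate(text)
        if similarity(text, candidate) > threshold:
            pairs.append((text, candidate))
    return pairs


def toy_generator(text: str) -> str:
    # Hypothetical perturbation: replace the letter "o" with the digit "0".
    return text.replace("o", "0")


pairs = build_pairs(["transfer money to this account now"], toy_generator)
```

Pairs that pass the threshold would then be scored by the discriminator; candidates that drift too far from the original text are dropped before training.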
2. The method of claim 1, wherein the discriminator is constructed from a preset classification model, the method further comprising:
performing model training on the classification model based on the historical text data to obtain an initialized discriminator;
wherein inputting the data pairs constructed from the historical text data and the corresponding countermeasure text data into the pre-constructed discriminator to obtain the trained generator and the trained discriminator comprises:
inputting the historical text data and the corresponding countermeasure text data as data pairs into the initialized discriminator to obtain the trained generator and the trained discriminator.
3. The method of claim 2, the preset classification model being a textCNN model, an LSTM model, or a BERT model.
4. The method of claim 3, wherein inputting the historical text data into a pre-constructed generator and generating countermeasure text data corresponding to the historical text data comprises:
inputting the historical text data into the pre-constructed generator, and generating the countermeasure text data corresponding to the historical text data with a first loss function as an optimization target, wherein the first loss function is determined by a distance-metric constraint loss between the historical text data and the corresponding countermeasure text data and a score contrast loss between the historical text data and the corresponding countermeasure text data.
5. The method of claim 4, the first loss function comprising a minimum of a difference between a distance between the historical text data and the corresponding countermeasure text data and a square of a difference between logarithms of scores of the historical text data and the corresponding countermeasure text data.
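One possible reading of the first loss function in claim 5 can be written out as a formula. The symbols are assumptions introduced here for illustration: $x$ is a historical text, $x' = G(x)$ is the countermeasure text produced by the generator $G$, $d(\cdot,\cdot)$ is the distance metric, and $D(\cdot)$ is the score assigned by the discriminator. Any weighting coefficients are omitted.

```latex
\mathcal{L}_{1} \;=\; \min_{G}\,\Big[\, d\big(x, x'\big) \;-\; \big(\log D(x) - \log D(x')\big)^{2} \Big],
\qquad x' = G(x)
```

Under this reading, the generator is rewarded for keeping $x'$ close to $x$ while pushing the two scores apart.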
6. The method of claim 4, wherein inputting the historical text data into a pre-constructed generator and generating countermeasure text data corresponding to the historical text data with a first loss function as an optimization target comprises:
inputting the historical text data into the pre-constructed generator, and generating the countermeasure text data corresponding to the historical text data by Policy Gradient optimization, with the first loss function serving as the long-term return.
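Claim 6 optimizes the generator by policy gradient. The sketch below shows the basic REINFORCE update on a toy softmax policy; it is not the patent's generator. The action set, the `rewards` values (standing in for a return derived from the first loss function, for instance its negation), and all function names are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(rewards, steps=200, lr=0.5, seed=0):
    """Train a softmax policy over discrete actions with the REINFORCE update."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        # Gradient of log pi(action) w.r.t. the logits is one_hot(action) - probs.
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * reward * grad
    return softmax(logits)


# Hypothetical returns for two candidate edits; the second edit is the better one.
policy = reinforce(rewards=[0.0, 1.0])
```

After training, the policy places more probability on the action with the higher return, which is the behavior a policy-gradient generator relies on when the return cannot be differentiated through discrete text.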
7. The method of claim 1, wherein inputting the data pairs constructed from the historical text data and the corresponding countermeasure text data into the pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and by using a preset loss function, to obtain the trained generator and the trained discriminator, comprises:
inputting the historical text data and the corresponding countermeasure text data into the pre-constructed discriminator as a data pair, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data, with a second loss function as an optimization target, to obtain the trained generator and the trained discriminator, wherein the second loss function is determined by the classification loss of the historical text data and the corresponding countermeasure text data and the score contrast loss between the historical text data and the corresponding countermeasure text data.
8. The method of claim 7, wherein the classification loss of the historical text data and the corresponding countermeasure text data is determined based on the score value corresponding to the historical text data and the category to which the historical text data belongs, and the score contrast loss between the historical text data and the corresponding countermeasure text data is determined based on a preset second weight and a sum of squares of differences between logarithms of the scores of the historical text data and the corresponding countermeasure text data.
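The second loss function of claims 7 and 8 can be sketched as follows. This is an interpretation of the translated wording, not the patent's definition: the classification loss is assumed to be binary cross-entropy on the historical text's score, and the score contrast loss is read as a preset weight multiplied by the sum of squared log-score differences. The function names and the weight value 0.5 are hypothetical.

```python
import math


def _clamp(p, eps=1e-12):
    """Keep a score strictly inside (0, 1) so the logarithms stay finite."""
    return min(max(p, eps), 1.0 - eps)


def classification_loss(score, label):
    """Binary cross-entropy between the historical text's score and its category (0 or 1)."""
    s = _clamp(score)
    return -(label * math.log(s) + (1 - label) * math.log(1.0 - s))


def score_contrast_loss(score_pairs, weight):
    """Preset weight times the sum of squared differences between log-scores."""
    return weight * sum(
        (math.log(_clamp(s)) - math.log(_clamp(s_adv))) ** 2
        for s, s_adv in score_pairs
    )


def second_loss(batch, weight=0.5):
    """batch: list of (historical_score, countermeasure_score, label) triples."""
    cls = sum(classification_loss(s, y) for s, _, y in batch)
    contrast = score_contrast_loss([(s, s_adv) for s, s_adv, _ in batch], weight)
    return cls + contrast


stable = second_loss([(0.9, 0.9, 1)])  # scores agree: only the classification term remains
fooled = second_loss([(0.9, 0.3, 1)])  # scores diverge: the contrast term adds a penalty
```

Minimizing the contrast term pushes the discriminator to score a text and its countermeasure variant alike, which is the opposite of what the generator is rewarded for.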
9. The method of claim 1, wherein performing text-data-based risk prevention and control processing on the target service through the trained discriminator comprises:
acquiring target text data to be detected of the target service;
and inputting the target text data into the trained discriminator to obtain a risk identification result corresponding to the target text data.
10. A text-based risk prevention and control processing method, applied to a block chain system, the method comprising:
acquiring training rule information of a generator and a discriminator in risk prevention and control based on text data of a target service, generating a corresponding first intelligent contract based on the training rule information of the generator and the discriminator, and deploying the first intelligent contract into the block chain system;
when historical text data for a target service is acquired, calling the first intelligent contract, inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold;
inputting, based on the first intelligent contract, data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and by using a preset loss function, to obtain the trained generator and the trained discriminator;
and providing the trained discriminator for the target business based on the first intelligent contract so as to perform risk prevention and control processing based on text data on the target business.
11. The method of claim 10, wherein the discriminator is constructed from a preset classification model, the method further comprising:
performing model training on the classification model with the historical text data, based on a second intelligent contract pre-deployed in the block chain system, to obtain an initialized discriminator;
wherein inputting, based on the first intelligent contract, the data pairs constructed from the historical text data and the corresponding countermeasure text data into the pre-constructed discriminator to obtain the trained generator and the trained discriminator comprises:
and inputting the historical text data and the corresponding countermeasure text data into the initialized discriminator as a data pair based on the first intelligent contract to obtain the trained generator and the discriminator.
12. A text-based risk prevention processing apparatus, the apparatus comprising:
the historical data acquisition module is used for acquiring historical text data for a target service;
the countermeasure data generation module is used for inputting the historical text data into a pre-constructed generator and generating countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold;
the training module is used for inputting data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and by using a preset loss function, to obtain the trained generator and the trained discriminator;
and the risk prevention and control module is used for performing risk prevention and control processing based on text data on the target service through the trained discriminator.
13. A text-based risk prevention processing device, the device being a device in a blockchain system, the device comprising:
the contract deployment module is used for acquiring training rule information of a generator and a discriminator in risk control based on text data of a target service, generating a corresponding first intelligent contract based on the training rule information of the generator and the discriminator, and deploying the first intelligent contract into the block chain system;
the countermeasure text generation module is used for calling the first intelligent contract when historical text data for the target service is acquired, inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold;
the training module is used for inputting, based on the first intelligent contract, data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and by using a preset loss function, to obtain the trained generator and the trained discriminator;
and the discriminator application module is used for providing the trained discriminator for the target service based on the first intelligent contract so as to perform risk prevention and control processing based on text data on the target service.
14. A text-based risk prevention processing device, the text-based risk prevention processing device comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
acquiring historical text data for a target service;
inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold;
inputting data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and by using a preset loss function, to obtain the trained generator and the trained discriminator;
and carrying out risk prevention and control processing based on text data on the target service through the trained discriminator.
15. A text-based risk prevention processing device, the device being a device in a blockchain system, the text-based risk prevention processing device comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
acquiring training rule information of a generator and a discriminator in risk prevention and control based on text data of a target service, generating a corresponding first intelligent contract based on the training rule information of the generator and the discriminator, and deploying the first intelligent contract into the block chain system;
when historical text data for a target service is acquired, calling the first intelligent contract, inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold;
inputting, based on the first intelligent contract, data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and by using a preset loss function, to obtain the trained generator and the trained discriminator;
and providing the trained discriminator for the target service based on the first intelligent contract so as to perform risk prevention and control processing based on text data on the target service.
16. A storage medium for storing computer-executable instructions, which when executed by a processor implement the following:
acquiring historical text data for a target service;
inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold;
inputting data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and by using a preset loss function, to obtain the trained generator and the trained discriminator;
and carrying out risk prevention and control processing based on text data on the target service through the trained discriminator.
17. A storage medium for storing computer-executable instructions, which when executed by a processor implement the following:
acquiring training rule information of a generator and a discriminator in risk prevention and control based on text data of a target service, generating a corresponding first intelligent contract based on the training rule information of the generator and the discriminator, and deploying the first intelligent contract into a block chain system;
when historical text data for a target service is acquired, calling the first intelligent contract, inputting the historical text data into a pre-constructed generator, and generating countermeasure text data corresponding to the historical text data, wherein the similarity between the countermeasure text data and the historical text data is higher than a preset similarity threshold;
inputting, based on the first intelligent contract, data pairs constructed from the historical text data and the corresponding countermeasure text data into a pre-constructed discriminator, scoring the historical text data and the corresponding countermeasure text data respectively through the discriminator, and training the generator and the discriminator by comparing the score value of the historical text data with the score value of the corresponding countermeasure text data and by using a preset loss function, to obtain the trained generator and the trained discriminator;
and providing the trained discriminator for the target service based on the first intelligent contract so as to perform risk prevention and control processing based on text data on the target service.
CN202111266143.3A 2021-10-28 2021-10-28 Text-based risk prevention and control processing method, device and equipment Active CN113961704B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111266143.3A CN113961704B (en) 2021-10-28 2021-10-28 Text-based risk prevention and control processing method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111266143.3A CN113961704B (en) 2021-10-28 2021-10-28 Text-based risk prevention and control processing method, device and equipment

Publications (2)

Publication Number Publication Date
CN113961704A (en) 2022-01-21
CN113961704B (en) 2024-06-14

Family

ID=79468168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111266143.3A Active CN113961704B (en) 2021-10-28 2021-10-28 Text-based risk prevention and control processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN113961704B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220201008A1 (en) * 2020-12-21 2022-06-23 Citrix Systems, Inc. Multimodal modelling for systems using distance metric learning
CN117349403A (en) * 2023-09-26 2024-01-05 北京慧博科技有限公司 Short message marketing method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101843066B1 (en) * 2017-08-23 2018-05-15 주식회사 뷰노 Method for classifying data via data augmentation of the data for machine-learning and apparatus using the same
CN109922155A (en) * 2019-03-18 2019-06-21 众安信息技术服务有限公司 The method and device of intelligent agent is realized in block chain network
CN112966112A (en) * 2021-03-25 2021-06-15 支付宝(杭州)信息技术有限公司 Text classification model training and text classification method and device based on counterstudy

Also Published As

Publication number Publication date
CN113961704B (en) 2024-06-14

Similar Documents

Publication Publication Date Title
Crothers et al. Machine-generated text: A comprehensive survey of threat models and detection methods
Zhang et al. Explainable artificial intelligence applications in cyber security: State-of-the-art in research
US11537852B2 (en) Evolving graph convolutional networks for dynamic graphs
CN113361658B (en) Method, device and equipment for training graph model based on privacy protection
CN111506708A (en) Text auditing method, device, equipment and medium
Chai et al. An explainable multi-modal hierarchical attention model for developing phishing threat intelligence
Noviandy et al. Credit card fraud detection for contemporary financial management using xgboost-driven machine learning and data augmentation techniques
Bian et al. Image-based scam detection method using an attention capsule network
CN113221747B (en) Privacy data processing method, device and equipment based on privacy protection
CN113961704B (en) Text-based risk prevention and control processing method, device and equipment
Ra et al. DeepAnti-PhishNet: Applying deep neural networks for phishing email detection
CN111930623A (en) Test case construction method and device and electronic equipment
CN114429222A (en) Model training method, device and equipment
Mambina et al. Classifying swahili smishing attacks for mobile money users: A machine-learning approach
CN113837638A (en) Method, device and equipment for determining dialect
Naiem et al. Enhancing the efficiency of gaussian naïve bayes machine learning classifier in the detection of ddos in cloud computing
CN110263817B (en) Risk grade classification method and device based on user account
CN113221717B (en) Model construction method, device and equipment based on privacy protection
Soni et al. Learning-Based Model for Phishing Attack Detection
Zhao et al. Natural backdoor attacks on deep neural networks via raindrops
CN110705622A (en) Decision-making method and system and electronic equipment
Papaioannou et al. Risk-based user authentication for mobile passenger ID devices for land and sea border control
Luo et al. Ai-powered fraud detection in decentralized finance: A project life cycle perspective
Linh et al. Real-time phishing detection using deep learning methods by extensions
CN115210722A (en) Method and system for graph computation using hybrid inference

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant