CN117787922B - Method, system, equipment and medium for processing money-back service based on distillation learning and automatic learning - Google Patents

Method, system, equipment and medium for processing money-back service based on distillation learning and automatic learning Download PDF

Info

Publication number
CN117787922B
CN117787922B CN202410211499.4A CN202410211499A CN117787922B CN 117787922 B CN117787922 B CN 117787922B CN 202410211499 A CN202410211499 A CN 202410211499A CN 117787922 B CN117787922 B CN 117787922B
Authority
CN
China
Prior art keywords
model
mobilebert
learning
user
dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410211499.4A
Other languages
Chinese (zh)
Other versions
CN117787922A (en
Inventor
伍思文
洪建帮
陈春旺
刘婷
林诗杰
曹磊
鲍晓莹
翁志鹏
夏骏
丁有韬
罗卓尔
孙国为
金龙
凌凯文
戴剑芸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank Of East Asia China Co ltd
Original Assignee
Bank Of East Asia China Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank Of East Asia China Co ltd filed Critical Bank Of East Asia China Co ltd
Priority to CN202410211499.4A priority Critical patent/CN117787922B/en
Publication of CN117787922A publication Critical patent/CN117787922A/en
Application granted granted Critical
Publication of CN117787922B publication Critical patent/CN117787922B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a method, a system, equipment and a medium for processing a money back-washing service based on distillation learning and automatic learning, which relate to the technical field of money back-washing early warning, and comprise the following steps: performing PEP list filtering according to the target list filtering model to obtain a filtered target list; inputting the filtered target list and the corresponding transaction information thereof into a blacklist service system for user screening to obtain a blacklist user; checking the blacklist user after screening through a target checking and matching model, and screening users with matching results of user matching information corresponding to the blacklist user lower than a preset matching threshold; and outputting the screened user verification result to a service processing system. The money laundering model system based on distillation learning and automatic learning has great improvement on reducing risk, reducing invalid alarm, assisting in checking and improving business efficiency.

Description

Method, system, equipment and medium for processing money-back service based on distillation learning and automatic learning
Technical Field
The application relates to the technical field of money laundering early warning, in particular to a money laundering business processing method, a system, equipment and a medium based on distillation learning and automatic learning.
Background
At present, the money back-flushing business in banking business needs to carry out money back-flushing early warning scanning for retail, public and internet customers when relevant business is carried out. In the related art, batch and real-time scanning is performed through a blacklist service system Black NAME SCREENING (BLN), suspicious clients are identified by comparison with sanctions such as road penetration, red traffic and the like and supervision lists, then an alarm is sent out, and after manual checking, a result is pushed to a downstream system for business use. However, the BLN system and its algorithm predict that there are very many false alarms, resulting in huge workload of subsequent manager checking, high manpower cost and low overall operation efficiency.
Disclosure of Invention
The application aims to provide a method, a system, equipment and a medium for processing money back washing business based on distillation learning and automatic learning, so as to improve the early warning accuracy of money back washing.
In a first aspect, the present invention provides a method for processing money laundering service based on distillation learning and automatic learning, the method comprising:
performing PEP list filtering according to the target list filtering model to obtain a filtered target list; the target list filtering model is used for predicting the PEP text through the distillation deep learning NLP model;
inputting the filtered target list and the transaction information corresponding to the target list into a blacklist service system for user screening to obtain blacklist users;
Checking the blacklist user after screening through a target checking and matching model, and screening users with matching results of user matching information corresponding to the blacklist user lower than a preset matching threshold;
And outputting the screened user verification result to a service processing system.
In an alternative embodiment, the targeted list filtering model includes a MobileBERT-multi-dimension model that improves a transducer-based MobileBERT model of distilled learning natural language processing;
the MobileBERT-multi-dimension model includes a multi-dimensional embedded layer; wherein the MobileBERT-multi-dimension model includes a first MobileBERT network of a first number of layers in a first dimension, a second MobileBERT network of a second number of layers in a second dimension, and a third MobileBERT network of a third number of layers in a third dimension;
the MobileBERT-multi-dimension model is formed by splicing MobileBERT networks with different dimensions, and the input and output of each module of IB-BERT and MobileBERT-multi-dimension are kept consistent.
In an alternative embodiment, the MobileBERT-multi-dimension model includes a multi-dimensional embedded layer MobileBERT network combination of 64-dimensional 8 layers, 128-dimensional 8 layers, and 256-dimensional 8 layers.
In an alternative embodiment, the target verification matching model is a hybrid architecture MobileViT model of a convolutional neural network and a transducer network model;
The method further comprises the steps of:
taking the comparison result of the confirmed photo and the confirmed age as a picture label, and training an initial MobileViT model based on a training sample until the model converges;
And identifying the user list of which the user images are not matched with the ages in the target list through a trained target MobileViT model.
In an alternative embodiment, the method further comprises:
identifying an image reasoning result through a MobileViT model, and deriving an image variable as one of the characteristics of the random forest model;
when the variable is derived, setting the matching result of the variable as a preset code according to a sponsored rule;
Splicing variables corresponding to the user information according to the preset codes to obtain target feature codes;
training and predicting a random forest model through the target feature codes, and automatically learning by using a silent feature code library based on the prediction result of the random forest model.
In an alternative embodiment, the method further comprises:
Automatically storing target feature codes which are inconsistent in the judgment of the prediction result and the checking result of the random forest model into a silent feature code library; the target feature codes are obtained by coding various user information through a preset rule coder;
And when the silencing duration of the target feature code in the silencing feature code library exceeds a preset silencing duration, the target feature code is released from the silencing feature code library until the target feature code is inconsistent with the checking result again, and then the target feature code is stored in the silencing feature code library.
In an alternative embodiment, the length of silencing is determined by:
Wait Days=A+Max ((B-Min (last days,C)–pass alerts amount/D),E)
Wherein Wait Days is the number of Days that silencing is required; last days is the last silencing ex-warehouse distance today; PASS ALERTS am count is the number of alarms with pass as the result of the checked and model prediction is consistent; A. b, C, D, E is a constant.
In a second aspect, the present invention provides a money laundering service processing system based on distillation learning and automatic learning, the system comprising:
The list filtering module is used for performing PEP list filtering according to the target list filtering model to obtain a filtered target list; the target list filtering model is used for predicting the PEP text through the distillation deep learning NLP model;
The blacklist user screening module is used for inputting the filtered target list and the transaction information corresponding to the target list into the blacklist service system for user screening to obtain blacklist users;
The verification matching module is used for verifying the blacklist users after screening through the target verification matching model, and screening users with matching results of the user matching information corresponding to the blacklist users lower than a preset matching threshold;
And the service processing module is used for outputting the screened user verification result to the service processing system.
In a third aspect, the present invention provides an electronic device comprising a processor and a memory storing computer executable instructions executable by the processor to implement the method of distillation learning and auto-learning based money laundering business of any of the previous embodiments.
In a fourth aspect, the present invention provides a computer readable storage medium storing computer executable instructions that, when invoked and executed by a processor, cause the processor to implement the method for processing money laundering based on distillation learning and automatic learning according to any of the previous embodiments.
According to the distillation learning and automatic learning-based money back-flushing service processing method, system, equipment and medium provided by the application, PEP list filtering is carried out through the target list filtering module (an NLP model of an improved distillation learning algorithm), the number of reduced invalid alarms is remarkable, and the accuracy of the verification effect is high when manual spot check is carried out; the accuracy of the CV lightweight model and the automatic learning mode-silencing feature code library which use the target verification matching model after segmentation according to the threshold value is greatly improved. Therefore, the money laundering model system based on distillation learning and automatic learning has great improvement on the aspects of reducing risk, reducing invalid alarms, assisting in checking and improving business efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for processing money laundering business based on distillation learning and automatic learning according to an embodiment of the present application;
Fig. 2 is a network structure diagram of MobileBERT according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a silent feature library according to an embodiment of the present application;
Fig. 4 is a schematic flow chart in practical application according to an embodiment of the present application;
FIG. 5 is a block diagram of a money laundering service processing system based on distillation learning and automatic learning according to an embodiment of the present application;
Fig. 6 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The embodiment of the application provides a money laundering service processing method based on distillation learning and automatic learning, which is shown in fig. 1 and mainly comprises the following steps:
Step S110, PEP list filtering is carried out according to the target list filtering model, and a filtered target list is obtained; the target list filtering model is used for predicting the PEP text through the distillation deep learning NLP model.
In one embodiment, the target list filtering model is a natural language processing (Natural Language Processing, NLP) model that learns through knowledge distillation, and the text information of the PEP list is subjected to definition prediction through a distillation deep learning NLP model, so that PEP (Person of Exceptional Potential, i.e. "people with special potential") list filtering is performed.
The above-described target list filtering model includes a MobileBERT-multi-dimension model that improves a transducer-based MobileBERT model of distillation learning natural language processing; the MobileBERT-multi-dimension model includes a multi-dimensional embedded layer; wherein the MobileBERT-multi-dimension model includes a first MobileBERT network of a first number of layers in a first dimension, a second MobileBERT network of a second number of layers in a second dimension, and a third MobileBERT network of a third number of layers in a third dimension;
The MobileBERT-multi-dimension model is implemented by splicing MobileBERT networks with different dimensions, and the input and output of each module of IB-BERT and MobileBERT-multi-dimension are kept consistent.
In an alternative embodiment, the MobileBERT-multi-dimension model includes a multi-dimensional embedded layer MobileBERT network combination of 64-dimensional 8 layers, 128-dimensional 8 layers, and 256-dimensional 8 layers.
In a specific operation, the following steps may be adopted:
And marking the predefined pep name list as1 and marking the undefined name list as 0 through pep text in the history alarm checking case confirmed list and pep text confirmed by the compliance department. Each PEP defines a token sequence as text. Improved distillation learning NLP model based on transducer MobileBERT: mobileBERT-Multi-dimension. 1) Knowledge distillation definition: the teacher model of the larger network is used for accurate learning, and knowledge is transmitted to the student model of the smaller network through the teacher model for learning. The relative magnitude of the probabilities of the various classes in the teacher model result implicitly includes knowledge, also known as the mapping from input vectors to output vectors, and the knowledge in the teacher model includes correct information and also includes the relative relationships between error information.
2) Using a soft target (soft targets): one way to migrate the generalization ability of a large model to a small model is to train the small model with the class probabilities generated by the large model as soft targets, replacing hard targets as training targets. And wherein the degree to which the teacher probability target contains knowledge can be adjusted by adjusting the temperature (T), the greater the T, the more knowledge the probability tag contains. For example, the true labels of alarm A, B, C are pass, pass, block, block, i.e., 0,1, and 0.1, 0.3, 0.7, 0.9, predicted by the teacher model. Therefore, the original label is converted into a teacher label which can distinguish the same kind of different alarms, and the student model is used for learning.
3) The network structure of Mobile-Bert is shown in fig. 2, which is a block structure diagram of Bidirectional Encoder Representations from Transformers (Bert), inverted-Bottleneck BERT (IB-Bert) and MobileBERT, respectively, and table 1 is an overall network parameter design. IB-BERT and MobileBERT have the same number of layers as bert_l, but the use of each layer MobileBERT is reduced by about eight times the width Embedding, thus providing about twenty times the parameter reduction compared to bert_l, while maintaining considerable accuracy.
Network structure of Table 1 MobileBERT
4) However, the structure of each MobileBERT of the network is consistent, and for texts with different lengths, the information capturing capability of texts with different categories is relatively single, so that the MobileBERT network is modified according to the embodiment of the application, and the parameters of the modified network are shown in table 2:
TABLE 2 MobileBERT network structure for each dimension
By constructing the multi-scale embedding (embedded layer) MobileBERT. The original 128-dimensional 24-layer stacked network was modified to a multi-dimensional embedding MobileBERT network combination of 64-dimensional 8-layers, 128-dimensional 8-layers, 256-dimensional 8-layers total 24-layers, and the network structure is as shown in table 2. The method realizes diversification and enrichment of the original network, so as to better extract and fuse multi-granularity characteristics of different texts, and more accurately extract real definitions and details in the PEP text, thereby improving the richness and robustness of the algorithm.
MobileBERT networks of different dimensions are spliced, upper block 256 dimension, middle block 128 dimension, lower block 64 dimension, and multi-dimension transformers are stacked as shown in table 3:
TABLE 3 MobileBERT-Multi-dimension Overall network architecture
Through the network structure, the enrichment of global and local codes of the text can be realized, so that the learning capacity of the network is improved. By keeping the block inputs and outputs of IB-BERT and MobileBERT-multi-dimension 512, similar to original MobileBERT, feature map distillation can be performed at each layer, thereby maintaining its efficient knowledge migration network.
The comparison results of the scene model are shown in Table 4, and the accuracy of the scene model reaches 95.3% by using MobileBERT-multi-dimension model, which is close to the effect of Bert-base, and the amount of parameters used is only 30% of the original parameters. The effect is obviously improved compared with the original MobileBERT, and the parameter difference is not large; the effect of improving the model accuracy is obvious under the condition of limited computing resources.
Table 4 PEP text predicts model comparisons
Step S120, the filtered target list and the corresponding transaction information are input into a blacklist service system for user screening, and blacklist users are obtained.
The filtered target list is a money laundering suspicious user list, and the corresponding transaction information can comprise trade financing, core deposit and withdrawal, foreign currency settlement, foreign exchange payment platform, second generation payment, network loan, credit card approval, network transaction and the like.
And (3) carrying out batch and real-time scanning on the filtered target list through a blacklist service system Black NAME SCREENING (BLN), identifying suspicious clients by comparing the sanctions such as road penetration, red traffic and the like and monitoring lists, sending an alarm, and pushing results to a downstream system for business use after manual checking.
And step S130, checking the blacklist user after screening through the target checking and matching model, and screening the users with the matching result of the user matching information corresponding to the blacklist user lower than a preset matching threshold.
In one embodiment, the target verification matching model is a hybrid architecture MobileViT model of a convolutional neural network and a transducer network model, and the MobileViT model is used for user image recognition, so as to recognize the age of the user in the user identity information and the stored photo information, so as to determine whether the age and the photo are consistent. And then taking the image recognition result as a characteristic of the input subsequent random forest model.
In one embodiment, when judging whether the age and the image are consistent through MobileViT models, firstly, using the comparison result of the confirmed photo and the age as a picture label, and training the initial MobileViT model based on training samples until the model converges; and further identifying the user list of which the user images are not matched with the ages in the target list through the trained target MobileViT model.
Further, recognizing an image reasoning result through the MobileViT model, and deriving an image variable as one of the characteristics of the random forest model; the image reasoning result is used for identifying whether the user image in the target list accords with the age matching. When the variable is derived, setting the matching result of the variable as a preset code according to a sponsored rule; splicing variables corresponding to the user information according to preset codes to obtain target feature codes; training and predicting a random forest model through target feature codes, and automatically learning by using a silent feature code library based on the prediction result of the random forest model.
In practical application, the method of the application is applied to the back-money-removing business processing of banking business. When the inventor further performs alarm analysis on the filtered list, the difficulty of improvement is found that the service has high requirement on the model, and reasonable alarm cannot be missed, namely the AUC is required to be extremely high, but the judgment by the manager has certain subjectivity and the sample has certain noise. The machine learning model is difficult to perfect, the data of the bank is very thin, the data volume similar to the Internet does not exist, the automatic training optimization mode cannot be used due to strict bank requirements, and under the scene, the automatic training optimization mode is a great challenge for modeling.
Through trial and error and trial innovation of a scheme, the application introduces an automatic learning mode-silent feature coding library by constructing a traditional machine learning model. The alert analysis module employs rules based on business approval from which refined rule variables are derived. The judgment rules applied by the business personnel in the investigation work are summarized through researching the investigation operation manual of the business party and communicating with the business party. The method mainly comprises the following steps:
1. And establishing a machine learning hit model to output the alarm hit probability. The direct pressure drop model predicts a low hit case. The derived variables generated using the above method are trained using a conventional machine learning model random forest. After repeated tuning, the AUC can reach more than 0.92.
2. The subthreshold values are cut using a model. After the model cuts the threshold, the accuracy is as high as 99.8% at below the threshold score, but there are still few alarms that are judged by the model to be pass but the sponsor to be block. The strict requirements of the service require the model to be optimized further.
3. A rule encoder is used. When the variable is derived, the matching result of the variable is set to 0,1 and 1 according to the sponsor rule. And splicing all the variables to obtain the rule encoder. Such as feature one: name match 0, feature two: sex 1, feature three: age 1, feature four: nationality 0. The resulting feature encoder is 0110. By this, the model uses 20 variables in total, and a 20-bit regular feature encoder is made.
4. Automatic learning mode-silent feature coding library.
1) And when the model is used, comparing the checking result of the manager or the spot check. The rule codes inconsistent with the administrative confirmation judgment are automatically put into the silent feature code library, as shown in fig. 3. That is, when the alarm feature hits the silent rule code, the model score will be scored as 1 and output to the manager for major verification.
2) When the silencing rules code is stored for more than a certain number of days, such as 30 days, the feature encoder automatically releases the silencing feature code from the silencing feature code library, and when the subsequent alarms with the same features appear, the codes are not forcedly scored as 1, and the codes are placed into the silencing feature code library again until the manager confirms that the verification is inconsistent.
In an alternative embodiment, the method further comprises:
Automatically storing target feature codes with inconsistent prediction results and checking results of random forest models into a silent feature code library; the target feature codes are obtained by coding various user information through a preset rule coder; and when the silencing time of the target feature code in the silencing feature code library exceeds the preset silencing time, the target feature code is released from the silencing feature code library until the target feature code is inconsistent with the checking result, and then the target feature code is stored in the silencing feature code library.
In an alternative embodiment, the period of silence is determined by:
Wait Days=A+Max ((B-Min (last days,C)–pass alerts amount/D),E)
Wherein Wait Days is the number of Days that silencing is required; last days is the last silencing ex-warehouse distance today; PASS ALERTS am count is the number of alarms with pass as the result of the checked and model prediction is consistent; A. b, C, D, E is a constant.
In a specific example, the silencing period may be:
Wait Days=10+Max ((60-Min (last days,30)–pass alerts amount/100),0)
wait Days: days of silencing required; last days: the last silent ex-warehouse distance is counted today; PASS ALERTS amounts; the sponsored audit and model prediction agree to the number of alarms for pass.
And step S140, outputting the screened user verification result to the service processing system.
The user checking result after screening is the result after the alarm analysis, further carries out manual initial check and manual review, and then enters a downstream system (namely a service processing system) to carry out subsequent service processing.
Optionally, if the user verification result after screening is that the user belongs to the blacklist user, the downstream service processing system processes a money back flushing alarm or other monitoring systems aiming at the determined blacklist user; if the filtered user verification result is that the user does not belong to the blacklist user, the downstream business processing system processes other transaction business aiming at the conventional user.
For the method of the present application, the machine learning model and the deep learning model used in the scheme may be replaced by similar algorithms, for example, the machine learning model lightgbm used in the present example may be replaced by models such as catboost and adaboost, or some parameters may be adjusted, and similar results may also be obtained. The deep learning model may also be approximated using other transducer models or non-transducer models, such as MobileBERT may be substituted for Roberta, electric, GPT, distill-Bert, bert-Large, mobileBERT tiny, etc., as examples only and not as specific limitations.
In summary, referring to fig. 4, in the conventional back money laundering manner, two processing logics of AI preprocessing and AI alarm analysis are added, wherein the AI preprocessing uses a distillation deep learning algorithm model, and the AI alarm analysis uses a rule engine, a machine learning algorithm model and a deep learning algorithm model. The application upgrades the list matching into the user portrait matching, and combines the actual business situation to carry out fine adjustment, so that the client and the blacklist information are more comprehensively compared, and the final alarm accuracy is improved.
And (3) using an AI preprocessed pep list filter to innovate an NLP model of a distillation learning algorithm. Finally, combining business rules and understanding, PEP definition of 40% alarms is predicted to be inconsistent with PEP, the number of reduced invalid alarms is obvious, and the accuracy of the verification effect is high when manual spot check is performed. The CV lightweight model and the automatic learning mode-silencing feature encoding library using AI alarm analysis cut out alarms by threshold of about 20%. The accuracy reaches 99.95 percent. Overall, the optimization rate of the project is about 60%.
The money laundering model system based on distillation learning and automatic learning has remarkable effects in reducing risks, reducing invalid alarms, assisting in checking, improving business efficiency and the like. The advanced enabling and the use of the AI technology in the service are promoted, and an important step is provided in the field of money-back-washing alarm models without the same scene in the industry.
Based on the method embodiment, the embodiment of the application also provides a money laundering service processing system based on distillation learning and automatic learning, which is shown in fig. 5 and mainly comprises the following parts:
The list filtering module 510 is configured to perform PEP list filtering according to the target list filtering model, to obtain a filtered target list; the target list filtering model is used for predicting the PEP text through the distillation deep learning NLP model;
The blacklist user screening module 520 is configured to input the filtered target list and the transaction information corresponding to the target list into the blacklist service system for user screening, so as to obtain a blacklist user;
The verification matching module 530 is configured to verify the blacklist user after screening through the target verification matching model, and screen the user whose matching result of the user matching information corresponding to the blacklist user is lower than a preset matching threshold;
and the service processing module 540 is configured to output the filtered user verification result to the service processing system.
In a possible embodiment, the target list filtering model includes a MobileBERT-multi-dimension model that improves a transform-based MobileBERT model of distilled learning natural language processing;
The MobileBERT-multi-dimension model includes a multi-dimensional embedded layer; wherein the MobileBERT-multi-dimension model includes a first MobileBERT network of a first number of layers in a first dimension, a second MobileBERT network of a second number of layers in a second dimension, and a third MobileBERT network of a third number of layers in a third dimension;
The MobileBERT-multi-dimension model is implemented by splicing MobileBERT networks with different dimensions, and the input and output of each module of IB-BERT and MobileBERT-multi-dimension are kept consistent.
In a possible embodiment, the MobileBERT-multi-dimension model includes a multi-dimensional embedded layer MobileBERT network combination of 64-dimensional 8-layers, 128-dimensional 8-layers, and 256-dimensional 8-layers.
In a possible implementation, the target verification matching model is a hybrid architecture MobileViT model of a convolutional neural network and a transducer network model;
The device further comprises:
taking the comparison result of the confirmed photo and the confirmed age as a picture label, and training an initial MobileViT model based on a training sample until the model converges;
And identifying the user list of which the user images are not matched with the ages in the target list through the trained target MobileViT model.
In a possible embodiment, the above device further comprises:
identifying an image reasoning result through a MobileViT model, and deriving an image variable as one of the characteristics of the random forest model;
when the variable is derived, setting the matching result of the variable as a preset code according to a sponsored rule;
Splicing variables corresponding to the user information according to preset codes to obtain target feature codes;
training and predicting a random forest model through target feature codes, and automatically learning by using a silent feature code library based on the prediction result of the random forest model.
In a possible embodiment, the above device further comprises:
automatically storing target feature codes with inconsistent prediction results and checking results of random forest models into a silent feature code library; the target feature codes are obtained by coding various user information through a preset rule coder;
and when the silencing time of the target feature code in the silencing feature code library exceeds the preset silencing time, the target feature code is released from the silencing feature code library until the target feature code is inconsistent with the checking result, and then the target feature code is stored in the silencing feature code library.
In a possible embodiment, the silencing period is determined by:
Wait Days=A+Max ((B-Min (last days,C)–pass alerts amount/D),E)
Wherein Wait Days is the number of Days that silencing is required; last days is the last silencing ex-warehouse distance today; PASS ALERTS am count is the number of alarms with pass as the result of the checked and model prediction is consistent; A. b, C, D, E is a constant.
The embodiment of the application provides a back-money-washing business processing system based on distillation learning and automatic learning, which has the same implementation principle and technical effects as those of the embodiment of the method, and for brief description, reference can be made to corresponding contents in the embodiment of the back-money-washing business processing method based on distillation learning and automatic learning, where the embodiment of the back-money-washing business processing system based on distillation learning and automatic learning is not mentioned.
The embodiment of the present application further provides an electronic device, as shown in fig. 6, which is a schematic structural diagram of the electronic device, where the electronic device 100 includes a processor 61 and a memory 60, and the memory 60 stores computer executable instructions that can be executed by the processor 61, and the processor 61 executes the computer executable instructions to implement any of the foregoing money laundering service processing methods based on distillation learning and automatic learning.
In the embodiment shown in fig. 6, the electronic device further comprises a bus 62 and a communication interface 63, wherein the processor 61, the communication interface 63 and the memory 60 are connected by means of the bus 62.
The memory 60 may include a high-speed random access memory (RAM, random Access Memory), and may further include a non-volatile memory (non-volatile memory), such as at least one disk memory. The communication connection between the system network element and at least one other network element is achieved via at least one communication interface 63 (which may be wired or wireless), and may use the internet, a wide area network, a local network, a metropolitan area network, etc. Bus 62 may be an ISA (Industry Standard Architecture ) bus, a PCI (PERIPHERAL COMPONENT INTERCONNECT, peripheral component interconnect standard) bus, or EISA (Extended Industry Standard Architecture ) bus, among others. The bus 62 may be classified as an address bus, a data bus, a control bus, or the like. For ease of illustration, only one bi-directional arrow is shown in FIG. 6, but not only one bus or type of bus.
The processor 61 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 61 or by instructions in the form of software. The processor 61 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), and the like; but may also be a digital signal Processor (DIGITAL SIGNAL Processor, DSP), application Specific Integrated Circuit (ASIC), field-Programmable gate array (FPGA) or other Programmable logic device, discrete gate or transistor logic device, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly in the execution of a hardware decoding processor, or in the execution of a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in the memory, and the processor 61 reads the information in the memory, and combines the hardware to complete the steps of the method for processing the money back washing business based on distillation learning and automatic learning of the previous embodiment.
The embodiment of the application also provides a computer readable storage medium, which stores computer executable instructions that, when being called and executed by a processor, cause the processor to implement the above-mentioned money laundering service processing method based on distillation learning and automatic learning, and the detailed implementation of the method can be found in the foregoing embodiment and will not be described herein.
The computer program product of the distillation learning and automatic learning-based money laundering service processing method, system, equipment and medium provided by the embodiment of the application comprises a computer readable storage medium storing program codes, and the instructions included in the program codes can be used for executing the method in the previous method embodiment, and specific implementation can be referred to the method embodiment and will not be repeated here.
The relative steps, numerical expressions and numerical values of the components and steps set forth in these embodiments do not limit the scope of the present application unless it is specifically stated otherwise.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer readable storage medium executable by a processor. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In the description of the present application, it should be noted that the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
In the description of the present application, it should also be noted that, unless explicitly specified and limited otherwise, the terms "connected," "connected," and "connected" are to be construed broadly, and may be either fixedly connected, detachably connected, or integrally connected, for example; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present application will be understood in specific cases by those of ordinary skill in the art.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (9)

1. A method for processing money laundering business based on distillation learning and automatic learning, which is characterized by comprising the following steps:
performing PEP list filtering according to the target list filtering model to obtain a filtered target list; the target list filtering model is used for predicting the PEP text through the distillation deep learning NLP model;
inputting the filtered target list and the corresponding transaction information thereof into a blacklist service system for user screening to obtain a blacklist user;
Checking the blacklist user after screening through a target checking and matching model, and screening users with matching results of user matching information corresponding to the blacklist user lower than a preset matching threshold;
Outputting the screened user verification result to a service processing system;
The target list filtering model includes a MobileBERT-multi-dimension model that improves a transducer-based MobileBERT model of distillation learning natural language processing;
the MobileBERT-multi-dimension model includes a multi-dimensional embedded layer; wherein the MobileBERT-multi-dimension model includes a first MobileBERT network of a first number of layers in a first dimension, a second MobileBERT network of a second number of layers in a second dimension, and a third MobileBERT network of a third number of layers in a third dimension;
the MobileBERT-multi-dimension model is formed by splicing MobileBERT networks with different dimensions, and the input and output of each module of IB-BERT and MobileBERT-multi-dimension are kept consistent.
2. The method for processing the money laundering service based on distillation learning and automatic learning according to claim 1, wherein the MobileBERT-multi-dimension model comprises a multi-dimensional embedded layer MobileBERT network combination of 64-dimensional 8 layers, 128-dimensional 8 layers and 256-dimensional 8 layers.
3. The method for processing the money laundering service based on distillation learning and automatic learning according to claim 1, wherein the target verification matching model is a hybrid architecture MobileViT model of a convolutional neural network and a Transformer network model;
The method further comprises the steps of:
taking the comparison result of the confirmed photo and the confirmed age as a picture label, and training an initial MobileViT model based on a training sample until the model converges;
and identifying a user list of which the user images are inconsistent with the age matching in the screened blacklist users through a trained target MobileViT model.
4. The method for processing money laundering service based on distillation learning and automatic learning according to claim 3, wherein the method further comprises:
identifying an image reasoning result through a MobileViT model, and deriving an image variable as one of the characteristics of the random forest model;
when the variable is derived, setting the matching result of the variable as a preset code according to a sponsored rule;
Splicing variables corresponding to the user information according to the preset codes to obtain target feature codes;
training and predicting a random forest model through the target feature codes, and automatically learning by using a silent feature code library based on the prediction result of the random forest model.
5. The method for processing money laundering service based on distillation learning and automatic learning according to claim 4, wherein the method further comprises:
Automatically storing target feature codes with inconsistent prediction results and checking results of random forest models into a silent feature code library; the target feature codes are obtained by coding various user information through a preset rule coder;
And when the silencing duration of the target feature code in the silencing feature code library exceeds a preset silencing duration, the target feature code is released from the silencing feature code library until the target feature code is inconsistent with the checking result again, and then the target feature code is stored in the silencing feature code library.
6. The method for processing money laundering service based on distillation learning and automatic learning according to claim 5, wherein the silencing period is determined by the following formula:
Wait Days=A+Max ((B-Min (last days,C)–pass alerts amount/D),E)
Wherein Wait Days is the number of Days that silencing is required; last days is the last silencing ex-warehouse distance today; PASS ALERTS am count is the number of alarms with pass as the result of the checked and model prediction is consistent; A. b, C, D, E is a constant.
7. A money laundering service processing system based on distillation learning and automatic learning, the system comprising:
The list filtering module is used for performing PEP list filtering according to the target list filtering model to obtain a filtered target list; the target list filtering model is used for predicting the PEP text through the distillation deep learning NLP model;
The blacklist user screening module is used for inputting the filtered target list and the corresponding transaction information thereof into the blacklist service system for user screening to obtain blacklist users;
The verification matching module is used for verifying the blacklist users after screening through the target verification matching model, and screening users with matching results of the user matching information corresponding to the blacklist users lower than a preset matching threshold;
the service processing module is used for outputting the screened user verification result to the service processing system;
The target list filtering model includes a MobileBERT-multi-dimension model that improves a transducer-based MobileBERT model of distillation learning natural language processing;
the MobileBERT-multi-dimension model includes a multi-dimensional embedded layer; wherein the MobileBERT-multi-dimension model includes a first MobileBERT network of a first number of layers in a first dimension, a second MobileBERT network of a second number of layers in a second dimension, and a third MobileBERT network of a third number of layers in a third dimension;
the MobileBERT-multi-dimension model is formed by splicing MobileBERT networks with different dimensions, and the input and output of each module of IB-BERT and MobileBERT-multi-dimension are kept consistent.
8. An electronic device comprising a processor and a memory, the memory storing computer executable instructions executable by the processor, the processor executing the computer executable instructions to implement the distillation learning and auto-learning based money laundering business processing method of any of claims 1 to 6.
9. A computer readable storage medium storing computer executable instructions which, when invoked and executed by a processor, cause the processor to implement the distillation learning and auto-learning based money laundering service processing method of any one of claims 1 to 6.
CN202410211499.4A 2024-02-27 2024-02-27 Method, system, equipment and medium for processing money-back service based on distillation learning and automatic learning Active CN117787922B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410211499.4A CN117787922B (en) 2024-02-27 2024-02-27 Method, system, equipment and medium for processing money-back service based on distillation learning and automatic learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410211499.4A CN117787922B (en) 2024-02-27 2024-02-27 Method, system, equipment and medium for processing money-back service based on distillation learning and automatic learning

Publications (2)

Publication Number Publication Date
CN117787922A CN117787922A (en) 2024-03-29
CN117787922B true CN117787922B (en) 2024-05-31

Family

ID=90396684

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410211499.4A Active CN117787922B (en) 2024-02-27 2024-02-27 Method, system, equipment and medium for processing money-back service based on distillation learning and automatic learning

Country Status (1)

Country Link
CN (1) CN117787922B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919454A (en) * 2019-02-20 2019-06-21 中国银行股份有限公司 Anti money washing monitoring method and system
CN111127200A (en) * 2019-11-25 2020-05-08 中国建设银行股份有限公司 Method and device for monitoring suspicious transactions of anti-money laundering
CN111145026A (en) * 2019-12-30 2020-05-12 第四范式(北京)技术有限公司 Anti-money laundering model training method and device
CN111768305A (en) * 2020-06-24 2020-10-13 中国工商银行股份有限公司 Anti-money laundering identification method and device
CN112330512A (en) * 2020-11-27 2021-02-05 新华智云科技有限公司 Prediction method, system, equipment and storage medium of knowledge distillation learning model
CN114298287A (en) * 2022-01-11 2022-04-08 平安科技(深圳)有限公司 Knowledge distillation-based prediction method and device, electronic equipment and storage medium
WO2022126683A1 (en) * 2020-12-15 2022-06-23 之江实验室 Method and platform for automatically compressing multi-task-oriented pre-training language model
CN114880347A (en) * 2022-04-27 2022-08-09 北京理工大学 Method for converting natural language into SQL statement based on deep learning
KR20220160996A (en) * 2021-05-28 2022-12-06 주식회사 유스비 Face Authentication based Non-face-to-face Authentication and blacklist person identification system for Anti-Money Laundering and method of the same
CN116805162A (en) * 2023-04-27 2023-09-26 光控特斯联(重庆)信息技术有限公司 Transformer model training method based on self-supervision learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11544471B2 (en) * 2020-10-14 2023-01-03 Feedzai—Consultadoria e Inovação Tecnológica, S.A. Weakly supervised multi-task learning for concept-based explainability
US20230196024A1 (en) * 2021-12-21 2023-06-22 Genesys Cloud Services, Inc. Systems and methods relating to knowledge distillation in natural language processing models
US20240005648A1 (en) * 2022-06-29 2024-01-04 Objectvideo Labs, Llc Selective knowledge distillation

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919454A (en) * 2019-02-20 2019-06-21 中国银行股份有限公司 Anti money washing monitoring method and system
CN111127200A (en) * 2019-11-25 2020-05-08 中国建设银行股份有限公司 Method and device for monitoring suspicious transactions of anti-money laundering
CN111145026A (en) * 2019-12-30 2020-05-12 第四范式(北京)技术有限公司 Anti-money laundering model training method and device
CN111768305A (en) * 2020-06-24 2020-10-13 中国工商银行股份有限公司 Anti-money laundering identification method and device
CN112330512A (en) * 2020-11-27 2021-02-05 新华智云科技有限公司 Prediction method, system, equipment and storage medium of knowledge distillation learning model
WO2022126683A1 (en) * 2020-12-15 2022-06-23 之江实验室 Method and platform for automatically compressing multi-task-oriented pre-training language model
KR20220160996A (en) * 2021-05-28 2022-12-06 주식회사 유스비 Face Authentication based Non-face-to-face Authentication and blacklist person identification system for Anti-Money Laundering and method of the same
CN114298287A (en) * 2022-01-11 2022-04-08 平安科技(深圳)有限公司 Knowledge distillation-based prediction method and device, electronic equipment and storage medium
CN114880347A (en) * 2022-04-27 2022-08-09 北京理工大学 Method for converting natural language into SQL statement based on deep learning
CN116805162A (en) * 2023-04-27 2023-09-26 光控特斯联(重庆)信息技术有限公司 Transformer model training method based on self-supervision learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于知识蒸馏的轻量型浮游植物检测网络;张彤彤;董军宇;赵浩然;李琼;孙鑫;;应用科学学报;20200530(03);全文 *

Also Published As

Publication number Publication date
CN117787922A (en) 2024-03-29

Similar Documents

Publication Publication Date Title
CN111695033B (en) Enterprise public opinion analysis method, enterprise public opinion analysis device, electronic equipment and medium
CN110009174B (en) Risk recognition model training method and device and server
CN108550065B (en) Comment data processing method, device and equipment
CN110163242B (en) Risk identification method and device and server
CN111177367B (en) Case classification method, classification model training method and related products
CN111444873A (en) Method and device for detecting authenticity of person in video, electronic device and storage medium
US20220358493A1 (en) Data acquisition method and apparatus for analyzing cryptocurrency transaction
JP2024507626A (en) Method and system for safely deploying artificial intelligence models
CN111753496B (en) Industry category identification method and device, computer equipment and readable storage medium
CN113806548A (en) Petition factor extraction method and system based on deep learning model
CN113159796A (en) Trade contract verification method and device
CN117558270B (en) Voice recognition method and device and keyword detection model training method and device
CN111259216B (en) Information identification method, device and equipment
CN114119191A (en) Wind control method, overdue prediction method, model training method and related equipment
CN117787922B (en) Method, system, equipment and medium for processing money-back service based on distillation learning and automatic learning
CN117195319A (en) Verification method and device for electronic part of file, electronic equipment and medium
CN116467930A (en) Transformer-based structured data general modeling method
CN115713399B (en) User credit evaluation system combined with third-party data source
CN115907968A (en) Wind control rejection inference method and device based on pedestrian credit
CN113706258B (en) Product recommendation method, device, equipment and storage medium based on combined model
CN114048330A (en) Risk conduction probability knowledge graph generation method, device, equipment and storage medium
CN114579876A (en) False information detection method, device, equipment and medium
CN116092094A (en) Image text recognition method and device, computer readable medium and electronic equipment
CN113657986A (en) Hybrid neural network-based enterprise illegal funding risk prediction method
CN112632219A (en) Method and device for intercepting junk short messages

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant