CN113554099A - Method and device for identifying abnormal commercial tenant - Google Patents

Info

Publication number
CN113554099A
Authority
CN
China
Prior art keywords
merchant
transaction
transaction data
network model
identified
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110849076.1A
Other languages
Chinese (zh)
Inventor
郑策
万高峰
刘清
刘阳
顾小微
任雅楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unionpay Co Ltd
Original Assignee
China Unionpay Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Unionpay Co Ltd
Priority to CN202110849076.1A
Publication of CN113554099A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An embodiment of the invention provides a method and a device for identifying an abnormal merchant. The method comprises: acquiring merchant transaction data to be identified, which is obtained from the merchant's transaction data within a preset time; determining, from the merchant transaction data to be identified, a feature vector containing a preset number of feature fields, where the preset number of feature fields is determined according to how important each feature field in the merchant transaction data to be identified is for identifying an abnormal merchant; and inputting the feature vector into a graph convolution network model to determine whether the merchant is an abnormal merchant. The graph convolution network model includes a transaction weight parameter, which represents the degree of interconnection between merchant nodes in the model and is determined by the number of transactions and the transaction weights of users between merchants in historical merchant transaction data. The accuracy of abnormal transaction data detection can thereby be improved.

Description

Method and device for identifying abnormal merchants
Technical Field
The present application relates to the field of network technologies, and in particular, to a method and an apparatus for identifying an abnormal merchant.
Background
With the continuous innovation and rapid development of financial services, payment systems built on the clearing function of a business system have gradually matured, and their coverage and influence keep growing. At the same time, illegal transactions aimed at illicit profit have begun to appear in large numbers and are becoming more complex and larger in scale, which seriously affects the stability of the financial environment. To keep financial transactions safe and healthy, transactions involving violations must be analyzed and handled.
In the prior art, a deep learning or machine learning model is generally trained on transaction data, and the trained model is then used to detect abnormal transaction data in production. However, transaction data is non-Euclidean data without a fixed topological structure, which greatly limits the application of conventional deep learning models such as convolutional neural networks and other neural network models. Likewise, traditional machine learning models, such as logistic models and decision trees, cannot detect and identify abnormal transactions effectively, because the transaction data lacks an accurate shallow feature representation.
Therefore, there is a need for a method and apparatus for identifying abnormal merchants, which is used to improve the accuracy of detecting abnormal transaction data.
Disclosure of Invention
The embodiment of the invention provides a method and a device for identifying abnormal merchants, which are used for improving the accuracy of abnormal transaction data detection.
In a first aspect, an embodiment of the present invention provides a method for identifying an abnormal merchant, where the method includes:
acquiring merchant transaction data to be identified, wherein the merchant transaction data to be identified is obtained according to transaction data of merchants within preset time;
determining a feature vector containing a preset number of feature fields from the merchant transaction data to be identified, wherein the preset number of feature fields is determined according to the importance of each feature field in the merchant transaction data to be identified for identifying an abnormal merchant;
inputting the feature vector into a graph convolution network model and determining whether the merchant is an abnormal merchant, wherein the graph convolution network model comprises transaction weight parameters, the transaction weight parameters represent the degree of interconnection between merchant nodes in the graph convolution network model, and the transaction weight parameters are determined by the number of transactions and the transaction weights of users between merchants in historical merchant transaction data.
In the method, the merchant transaction data to be identified comprises summarized data of the merchant's transactions within the preset time, and also comprises user behavior information, such as transaction frequency information. The merchant transaction data to be identified therefore carries information in both the merchant dimension and the user dimension, which improves identification accuracy. A feature screening model determines the feature fields of the merchant transaction data to be identified and the importance of each, and the preset number of fields with the greatest importance are taken as the feature fields. The feature vector can therefore accurately represent the likelihood that the merchant transaction data to be identified is abnormal merchant transaction data. The graph convolution network model comprises transaction weight parameters, which represent the degree of interconnection between merchant nodes in the model; this degree of interconnection is determined by the number of transactions and the transaction weights of users between the merchant nodes. In this way, the graph convolution network model can produce the identification result from information in two dimensions, the merchant dimension and the user dimension.
Optionally, after the feature vector is input into the graph convolution network model to determine the recognition result of the transaction data of the merchant to be recognized, the method further includes:
verifying the identification result and the real result of the transaction data of the merchant to be identified;
and if the identification result is different from the real result, updating the feature screening model and the graph convolution network model according to the transaction data of the merchant to be identified containing the real result.
In the method, the real result may be the real result of the merchant transaction data to be identified as determined by manual sampling, or it may be a consensus judgment obtained from a plurality of models for identifying abnormal merchant transaction data. For example, at intervals (periodic or aperiodic, as required), part of the merchant transaction data to be identified may be extracted and input into each of a plurality of models for identifying abnormal merchant transaction data, yielding a plurality of judgment results. The proportion of judgment results that indicate normal merchant transaction data, and the proportion that indicate abnormal merchant transaction data, are then computed over all judgment results; when a proportion is greater than a certain value, the corresponding judgment is taken as the real result and is used to verify the identification result obtained from the graph convolution network model. If the real result differs from the identification result, the feature screening model and the graph convolution network model are updated according to the merchant transaction data to be identified that contains the real result. The feature screening model and the graph convolution network model can thus adapt automatically to structural changes in production merchant transaction data, maintain high accuracy over a long period, and remain effective longer.
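The consensus check described above can be sketched as follows. This is only an illustration under assumptions: the function names `consensus_label` and `needs_update`, the label strings, and the 0.6 threshold are hypothetical, since the source speaks only of "a certain proportion value".

```python
from collections import Counter

def consensus_label(votes, min_share=0.6):
    """Return the label whose share of all votes exceeds min_share, else None.

    votes: judgments from several independent models, e.g. "normal"/"abnormal".
    min_share: hypothetical threshold; the source does not fix a value.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > min_share else None

def needs_update(predicted, votes, min_share=0.6):
    """True when a consensus exists and it disagrees with the model's result."""
    truth = consensus_label(votes, min_share)
    return truth is not None and truth != predicted

# Three auxiliary models vote; the graph convolution network said "normal".
print(needs_update("normal", ["abnormal", "abnormal", "normal"]))
```

When no label reaches the threshold, the sketch returns `None` and triggers no update; how such undecided cases are handled is not specified in the source.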
Optionally, determining a feature vector containing a preset number of feature fields from the merchant transaction data to be identified includes: determining the feature vector containing the preset number of feature fields from the merchant transaction data to be identified through a feature screening model; the feature screening model is used to determine the importance of each feature field in the merchant transaction data to be identified and to output the feature fields whose importance ranks in the top N; N is the preset number.
In the method, the feature fields of the merchant transaction data to be identified whose importance ranks in the top N are selected, where N is the preset number. The resulting feature vector can therefore represent more accurately the likelihood that the merchant transaction data to be identified is abnormal, so that the identification result is obtained more accurately.
Optionally, the feature screening model is a random forest algorithm or a distributed gradient boosting library. It is used to obtain each field of the merchant transaction data to be identified and the importance of each field, sort the fields by importance from largest to smallest, and select the preset number of fields from the front of that ranking as the feature fields of the merchant transaction data to be identified.
In the method, the random forest algorithm or the distributed gradient boosting library selects the preset number of feature fields with the greatest importance, and the feature vector is determined from those feature fields. The resulting feature vector can therefore accurately represent the likelihood that the merchant transaction data to be identified is abnormal, so that the identification result is obtained more accurately.
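The top-N selection step can be sketched as below. The importance scores are assumed to come from an already trained tree ensemble; the field names, the scores, and the helper names `select_top_n` and `to_feature_vector` are hypothetical and serve only to illustrate the ranking.

```python
def select_top_n(importances, n):
    """Return the n field names with the highest importance, largest first."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]

def to_feature_vector(record, fields):
    """Project one merchant record onto the selected fields, in ranked order."""
    return [float(record[f]) for f in fields]

# Hypothetical importance scores for four summary fields.
importances = {"txn_count": 0.41, "avg_amount": 0.27,
               "night_ratio": 0.22, "terminal_count": 0.10}
fields = select_top_n(importances, n=2)

# Hypothetical merchant record summarized over the preset time window.
record = {"txn_count": 120, "avg_amount": 35.5,
          "night_ratio": 0.4, "terminal_count": 3}
vector = to_feature_vector(record, fields)
print(fields, vector)
```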
Optionally, the graph convolution network model satisfies the following condition:
[Formula image BDA0003181693540000041; not reproduced in this text]
where $H^{(l)}$ is the feature of the $l$-th hidden layer of the graph convolution network model; the three symbols that appear only as images in the source (BDA0003181693540000042 to BDA0003181693540000044) denote, in order, the degree matrix of the graph convolution network model, the augmented matrix of the transaction weight matrix composed of the transaction weight parameters, and the augmented matrix of the adjacency matrix between the merchant nodes; $\theta^{(l)}$ is the convolution kernel parameter matrix of the $l$-th layer of the graph convolution network model; and $\sigma$ is an activation function.
In the method, the augmented matrix of the transaction weight matrix is added to the calculation formula of the graph convolution network model; that is, user behavior information is added. This avoids the inaccurate identification results that arise when the model overfits because it relies only on the merchant dimension.
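Because the formula image is not available in this text, the following is only a sketch under stated assumptions, not the claimed formula. It uses the widely known symmetric-normalized propagation rule for graph convolution networks and assumes that the augmented transaction weight matrix scales the augmented adjacency matrix element-wise. The name `weighted_gcn_layer`, the self-loop augmentation (adding the identity matrix), and the ReLU activation are illustrative choices.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def weighted_gcn_layer(adj, weights, h, theta):
    """One propagation step: ReLU(D^-1/2 (W~ * A~) D^-1/2 H Theta).

    ASSUMPTION: W~ and A~ are combined element-wise and weights are
    non-negative; the source's formula may combine them differently.
    """
    n = len(adj)
    # Augment both matrices with self-loops (add the identity), then combine.
    s = [[(adj[i][j] + (i == j)) * (weights[i][j] + (i == j)) for j in range(n)]
         for i in range(n)]
    # Inverse square root of each node's degree in the weighted, augmented graph.
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in s]
    norm = [[inv_sqrt[i] * s[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]
    z = matmul(matmul(norm, h), theta)
    return [[max(0.0, v) for v in row] for row in z]  # ReLU activation

# Two merchants linked by shared users, one input feature, one output feature.
out = weighted_gcn_layer(adj=[[0, 1], [1, 0]], weights=[[0, 1.0], [1.0, 0]],
                         h=[[1.0], [3.0]], theta=[[1.0]])
print(out)
```

In this toy graph each merchant's output is the average of its own feature and its neighbor's, which shows how the link weight lets user behavior between merchants influence the merchant representation.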
Optionally, during training of the graph convolution network model, the convolution kernel parameter matrix and the transaction weight matrix are trained alternately for any sample.
In the method, alternately training the convolution kernel parameter matrix and the transaction weight matrix can improve training accuracy.
Optionally, the transaction weight parameter satisfies:
[Formula image BDA0003181693540000045; not reproduced in this text]
where the symbols, which appear only as images in the source (BDA0003181693540000046 to BDA0003181693540000053), denote, in order:
the numbers of transactions that the C users shared by merchant i and merchant j have made with merchant i;
the numbers of transactions that the same C users have made with merchant j, where k indexes the users (k = 1, 2, …, C);
the weights, in the graph convolution network model, of the transactions between the C users and merchant i;
the weights, in the graph convolution network model, of the transactions between the C users and merchant j;
the number of transactions between the user and merchant i;
the number of transactions between the user and merchant j;
the transaction weight vector between the user and merchant i;
the transaction weight vector between the user and merchant j.
In the method, the connection between two merchants is established through the behavior information of a plurality of users, represented by their numbers of transactions, and a weight is set for the transactions between each user and each merchant. The user behavior information between merchants is thereby calculated more accurately, which improves the accuracy of the identification result for the merchant transaction data to be identified. Here, the weight of the transactions between a user and a merchant may be given an initial value that is preset randomly by the Xavier method (a widely used neural network initialization method) or by a Gaussian function, so that an accurate weight is obtained during model training.
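As a rough illustration of how such a link weight could be assembled, the sketch below assumes one plausible form: for each shared user, multiply the weighted transaction count with merchant i by the weighted transaction count with merchant j, and sum over users. This form, the helper names, and the choice of `fan_in = C`, `fan_out = 1` for the Xavier range are assumptions, because the formula image is not available in this text.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng):
    """Draw one value from the Xavier (Glorot) uniform range."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit)

def merchant_link_weight(counts_i, counts_j, w_i, w_j):
    """ASSUMED form: sum over users k of (n_ik * w_ik) * (n_jk * w_jk).

    counts_i[k], counts_j[k]: transactions of user k with merchants i and j.
    w_i[k], w_j[k]: trainable weights of those user-merchant transactions.
    """
    return sum(ni * wi * nj * wj
               for ni, nj, wi, wj in zip(counts_i, counts_j, w_i, w_j))

rng = random.Random(0)
C = 3  # number of users considered between the two merchants
w_i = [xavier_uniform(C, 1, rng) for _ in range(C)]  # initial weights, merchant i
w_j = [xavier_uniform(C, 1, rng) for _ in range(C)]  # initial weights, merchant j
print(merchant_link_weight([2, 0, 1], [1, 3, 1], w_i, w_j))
```

Under this assumed form, a user who transacted with only one of the two merchants contributes nothing to the link, so the weight reflects only behavior the merchants share.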
Optionally, the feature screening model and the graph convolution network model are obtained as follows: acquiring a sample data set, wherein the sample data set comprises merchant transaction data with normal labels, merchant transaction data with abnormal labels, and unlabeled merchant transaction data; training an initial feature screening model according to the sample data set to obtain the feature screening model; for any sample in the sample data set, obtaining a feature vector of the sample through the feature screening model; and inputting the feature vector of the sample into an initial graph convolution network model for training to obtain the graph convolution network model, wherein the merchant nodes in the initial graph convolution network model are determined according to the sample data set.
In the method, the sample data set comprises merchant transaction data with normal labels, merchant transaction data with abnormal labels, and unlabeled merchant transaction data. Semi-supervised training is thereby realized: the model learns the internal structure of the unlabeled merchant transaction data and adjusts its parameters using the labeled merchant transaction data, which can improve the accuracy of the feature screening model and the graph convolution network model.
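A common way to realize this kind of semi-supervised training is to let every sample take part in the forward pass while computing the loss only over labeled samples. The sketch below illustrates that idea; the source does not state which loss is used, so binary cross-entropy, the label encoding, and the name `masked_cross_entropy` are assumptions.

```python
import math

def masked_cross_entropy(probs, labels, eps=1e-12):
    """Average binary cross-entropy over labeled samples only.

    probs: predicted probability that each merchant is abnormal.
    labels: 1 = abnormal, 0 = normal, None = unlabeled (excluded from the loss).
    """
    terms = []
    for p, y in zip(probs, labels):
        if y is None:
            continue  # unlabeled samples shape the graph but not the loss
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        terms.append(-math.log(p) if y == 1 else -math.log(1.0 - p))
    return sum(terms) / len(terms) if terms else 0.0

# The second merchant is unlabeled, so its prediction is ignored by the loss.
loss = masked_cross_entropy([0.5, 0.9, 0.5], [1, None, 0])
print(loss)
```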
Optionally, training an initial feature screening model according to the sample data set includes:
inputting the sample data set into the initial feature screening model, determining the fields contained in each piece of transaction data in the sample data set, and determining the importance of each field in each piece of transaction data, wherein the importance represents how strongly the corresponding field influences whether the transaction data is abnormal merchant transaction data;
sorting the fields by importance from largest to smallest, and selecting the preset number of fields from the front of that ranking as the feature fields of each merchant transaction data;
determining a feature vector according to the feature fields of each merchant transaction data;
inputting the feature vector of each merchant transaction data into the initial graph convolution network model to obtain the identification result of each merchant transaction data;
and optimizing the initial feature screening model according to the identification result of each merchant transaction data.
In the method, the fields contained in the transaction data and the importance of each field are obtained through the initial feature screening model, the preset number of fields with the greatest importance are determined as the feature fields, and the feature vector is obtained from those feature fields. When transaction data to be identified is subsequently processed, the trained feature screening model can therefore produce a feature vector that accurately represents how abnormal the transaction data is.
Optionally, training an initial graph convolution network model according to the sample data set includes:
inputting the feature vector of each merchant transaction data in the sample data set into the initial graph convolution network model to obtain the identification result of each merchant transaction data, wherein the initial graph convolution network model uses the feature vector of each merchant transaction data to represent a merchant node, and the connecting line between merchant nodes represents user behavior information;
and optimizing the convolution kernel parameter matrix and the transaction weight parameters in the initial graph convolution network model according to the identification result of each merchant transaction data.
In the method, transaction weight parameters are added to the initial graph convolution network model to represent user behavior information. The weight of the transactions between a user and a merchant is given an initial value that is preset randomly by the Xavier method or by a Gaussian function and is adjusted during subsequent training. This avoids the loss of accuracy caused by overfitting when the initial graph convolution network model is trained only on merchant-dimension data.
Optionally, the method further includes: and updating the feature screening model and the graph convolution network model.
In the method, merchant transaction data from subsequent identification can be collected and used as samples to update the feature screening model and the graph convolution network model. The identification accuracy of the feature screening model and the graph convolution network model is thereby improved.
In a second aspect, an embodiment of the present invention provides an apparatus for identifying an abnormal merchant, where the apparatus includes:
an acquisition module, configured to acquire merchant transaction data to be identified, wherein the merchant transaction data to be identified is obtained according to transaction data of merchants within a preset time;
a processing module, configured to determine a feature vector containing a preset number of feature fields from the merchant transaction data to be identified, wherein the preset number of feature fields is determined according to the importance of each feature field in the merchant transaction data to be identified for identifying an abnormal merchant;
the processing module is further configured to input the feature vector into a graph convolution network model and determine whether the merchant is an abnormal merchant, wherein the graph convolution network model comprises transaction weight parameters, the transaction weight parameters represent the degree of interconnection between merchant nodes in the graph convolution network model, and the transaction weight parameters are determined by the number of transactions and the transaction weights of users between merchants in historical merchant transaction data.
In a third aspect, an embodiment of the present application further provides a computing device, including: a memory for storing a program; a processor for calling the program stored in said memory and executing the method as described in the various possible designs of the first aspect according to the obtained program.
In a fourth aspect, embodiments of the present application further provide a computer-readable non-transitory storage medium including a computer-readable program which, when read and executed by a computer, causes the computer to perform the method as described in the various possible designs of the first aspect.
These and other implementations of the present application will be more readily understood from the following description of the embodiments.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of an architecture for identifying an abnormal merchant according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an architecture for identifying an abnormal merchant according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method for identifying an abnormal merchant according to an embodiment of the present invention;
Fig. 4(a) is a schematic diagram of a merchant node connection according to an embodiment of the present invention;
Fig. 4(b) is a schematic diagram of a merchant node connection according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of a method for identifying an abnormal merchant according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of an apparatus for identifying an abnormal merchant according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 shows a system architecture for identifying an abnormal merchant according to an embodiment of the present invention. The system architecture includes a feature screening model 102 and a graph convolution network model 103, both obtained by training on a sample data set that includes merchant transaction data with normal labels, merchant transaction data with abnormal labels, and unlabeled merchant transaction data. That is, through semi-supervised training, the models learn the internal structure of the unlabeled merchant transaction data and adjust their parameters using the labeled merchant transaction data, which can improve the accuracy of the feature screening model 102 and the graph convolution network model 103. Once the trained feature screening model 102 and graph convolution network model 103 are available, the data acquisition unit 101 obtains the merchant transaction data to be identified; the data is cleaned and input into the feature screening model 102 to obtain the feature vector, and the feature vector is input into the graph convolution network model 103 to obtain the identification result.
As shown in Fig. 2, the system architecture further includes a feedback mechanism 204. The feedback mechanism 204 obtains the identification result that the data acquisition unit 201, the feature screening model 202, and the graph convolution network model 203 produce for the merchant transaction data to be identified, verifies that identification result against the real result of the merchant transaction data to be identified, and trains the feature screening model 202 and the graph convolution network model 203 according to the outcome of the verification, so that their parameters are continuously updated. The real result of the merchant transaction data to be identified may be determined by manual sampling, or it may be a consensus judgment obtained from a plurality of models for identifying abnormal transaction data. For example, at intervals (periodic or aperiodic, as required), part of the merchant transaction data to be identified may be extracted and input into each of a plurality of models for identifying abnormal transaction data, yielding a plurality of judgment results; the proportion of judgment results that indicate normal transaction data, and the proportion that indicate abnormal transaction data, are computed over all judgment results, and when a proportion is greater than a certain value, the corresponding judgment is taken as the real result. The method for obtaining the real result is not specifically limited here.
Based on this, an embodiment of the present application provides a flow of a method for identifying an abnormal merchant, as shown in fig. 3, including:
Step 301, acquiring merchant transaction data to be identified, wherein the merchant transaction data to be identified is obtained according to transaction data of merchants within a preset time;
Step 302, determining a feature vector containing a preset number of feature fields from the merchant transaction data to be identified, wherein the preset number of feature fields is determined according to the importance of each feature field in the merchant transaction data to be identified for identifying an abnormal merchant;
Step 303, inputting the feature vector into a graph convolution network model and determining whether the merchant is an abnormal merchant, wherein the graph convolution network model includes transaction weight parameters, the transaction weight parameters represent the degree of interconnection between merchant nodes in the graph convolution network model, and the transaction weight parameters are determined by the number of transactions and the transaction weights of users between merchants in historical merchant transaction data.
In the method, the merchant transaction data to be identified comprises merchant behavior information and user behavior information. The merchant transaction data to be identified therefore carries information in both the merchant dimension and the user dimension, which improves identification accuracy. A feature screening model determines the feature fields of the merchant transaction data to be identified and the importance of each, and the preset number of fields with the greatest importance are taken as the feature fields. The feature vector can therefore accurately represent the likelihood that the merchant transaction data to be identified is abnormal transaction data. The graph convolution network model comprises transaction weight parameters, which represent the degree of interconnection between merchant nodes in the model; this degree of interconnection is determined by the number of transactions and the transaction weights of users between the merchant nodes. In this way, the graph convolution network model can produce the identification result from information in two dimensions, the merchant dimension and the user dimension.
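Step 301 can be pictured as summarizing raw transactions per merchant over the preset time window. The sketch below is illustrative only: the record layout (`merchant`, `user`, `amount`, `time`), the summary fields, and the function name `summarize_merchants` are hypothetical, since the source does not list the actual fields.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def summarize_merchants(transactions, window_end, window):
    """Aggregate transactions in (window_end - window, window_end] per merchant."""
    start = window_end - window
    acc = defaultdict(lambda: {"txn_count": 0, "total_amount": 0.0, "users": set()})
    for t in transactions:
        if start < t["time"] <= window_end:
            s = acc[t["merchant"]]
            s["txn_count"] += 1              # merchant-dimension information
            s["total_amount"] += t["amount"]
            s["users"].add(t["user"])        # user-dimension information
    return {m: {"txn_count": s["txn_count"],
                "total_amount": s["total_amount"],
                "user_count": len(s["users"])}
            for m, s in acc.items()}

# Hypothetical raw transactions; the last one falls outside the 7-day window.
txs = [
    {"merchant": "m1", "user": "u1", "amount": 10.0, "time": datetime(2021, 7, 1, 12)},
    {"merchant": "m1", "user": "u2", "amount": 5.0, "time": datetime(2021, 7, 2, 12)},
    {"merchant": "m2", "user": "u1", "amount": 7.5, "time": datetime(2021, 7, 2, 13)},
    {"merchant": "m1", "user": "u1", "amount": 99.0, "time": datetime(2021, 5, 1)},
]
summary = summarize_merchants(txs, datetime(2021, 7, 3), timedelta(days=7))
print(summary)
```

The per-merchant summaries produced this way would then be passed to the feature screening step (step 302) and the graph convolution network model (step 303).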
An embodiment of the present application provides a method for verifying the identification result. After the feature vector is input into the graph convolution network model to obtain the identification result of the merchant transaction data to be identified, the method further includes: checking the identification result against the real result of the merchant transaction data to be identified; and, if the identification result differs from the real result, updating the feature screening model and the graph convolution network model according to the merchant transaction data to be identified that contains the real result. That is, when the feature screening model and the graph convolution network model are used in production, their parameters can be updated as needed, so that the models can learn changes in the data structure, which maintains identification accuracy and extends their service life. The real result indicates whether the merchant transaction data to be identified is actually normal transaction data or abnormal transaction data.
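The verification step can be illustrated with a minimal Python sketch. It only shows the comparison between identification results and real results and the collection of the mismatched samples that would drive a model update; the function name `collect_update_samples`, the merchant identifiers, and the label strings are hypothetical and do not come from the application.

```python
from typing import Dict, List, Tuple


def collect_update_samples(
    predicted: Dict[str, str], actual: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return (merchant_id, real_label) pairs whose identification result was wrong."""
    return [
        (merchant, actual[merchant])
        for merchant in sorted(predicted)
        if merchant in actual and predicted[merchant] != actual[merchant]
    ]


# Hypothetical identification results and manually verified real results.
predicted = {"m1": "normal", "m2": "abnormal", "m3": "normal"}
actual = {"m1": "normal", "m2": "normal", "m3": "abnormal"}

# Only the mismatched merchants are kept; in this sketch they would be the
# samples used to update the feature screening model and the graph model.
to_update = collect_update_samples(predicted, actual)
print(to_update)
```

Whether the update is triggered per sample or in batches is a design choice that the text leaves open; the sketch simply gathers the mismatches.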
An embodiment of the present application provides a method for obtaining the feature vector. Determining a feature vector containing a preset number of feature fields from the merchant transaction data to be identified includes:
determining, through a feature screening model, a feature vector containing a preset number of feature fields from the merchant transaction data to be identified; the feature screening model is used to determine the importance of each feature field in the merchant transaction data to be identified and to output the feature fields whose importance ranks in the top N, where N is the preset number. That is, the feature vector containing the preset number of feature fields can be determined through the feature screening model: the model obtains each field in the merchant transaction data to be identified, sorts the fields by their importance, and selects the top N feature fields. This ensures that the feature vector represents, to the greatest extent, the likelihood that the merchant transaction data to be identified is abnormal.
An embodiment of the present application provides a feature screening model. The feature screening model is a random forest algorithm or a distributed gradient boosting library, which is used to obtain each field of the merchant transaction data to be identified and the importance of each field, sort the fields by importance from largest to smallest, and select a preset number of fields from the front of that ordering as the feature fields of the merchant transaction data to be identified. The random forest algorithm and the distributed gradient boosting library are only examples; the feature screening model may also be any other algorithm capable of producing the feature vector, and is not specifically limited. Once the preset number of most important fields has been selected as the feature fields, the feature vector is determined from them. The feature vector can thus represent, to a large extent, the likelihood that the merchant transaction data to be identified is abnormal transaction data, which improves identification accuracy.
An embodiment of the present application provides a formula for the graph convolution network model. The graph convolution network model satisfies:
Figure BDA0003181693540000111
where $H^{(l)}$ is the hidden-layer feature of layer $l$ of the graph convolution network model,
Figure BDA0003181693540000112
is the augmented matrix of the degree matrix of the graph convolution network model,
Figure BDA0003181693540000113
is the augmented matrix of the transaction weight matrix composed of the transaction weight parameters,
Figure BDA0003181693540000114
is the augmented matrix of the adjacency matrix between merchant nodes, $\Theta^{(l)}$ is the convolution kernel parameter matrix of layer $l$ of the graph convolution network model, and $\sigma$ is the activation function. That is, an augmented matrix of the transaction weight matrix is added to the formula of the graph convolution network model; in other words, transaction weights that represent user behavior information are introduced, so that the graph convolution network model identifies abnormal transaction data based on both the merchant dimension and the user dimension. This improves identification accuracy and avoids the inaccurate results caused by overfitting when the formula relies on the merchant dimension alone.
An embodiment of the present application provides a model training method: during training of the graph convolution network model, the convolution kernel parameter matrix and the transaction weight matrix are trained alternately for any sample. Here,
Figure BDA0003181693540000115
can be rewritten as
Figure BDA0003181693540000116
as the convolution layer formula. Therefore, the convolution kernel weight $\Theta$ can be fixed first, and the alternating direction method of multipliers (ADMM) from optimization theory can be used to optimize
Figure BDA0003181693540000117
and
Figure BDA0003181693540000118
then fixing the transaction weight
Figure BDA0003181693540000119
and optimizing $\Theta$, alternating until the model converges or the number of iterations reaches a specified limit.
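The alternating scheme (fix one group of parameters, optimize the other, then swap, until convergence or an iteration limit) can be sketched on a toy problem. The Python example below is not ADMM and not the training procedure of the application: it uses plain alternating minimization with closed-form updates on a two-variable objective. The scalars `q` and `theta` stand in for the transaction weights and the convolution kernel weights, and the objective, `target`, and `lam` are assumptions chosen only so that each sub-step has an exact solution.

```python
def loss(q, theta, target=2.0, lam=0.1):
    """Toy objective: fit q * theta to a target, with a small L2 penalty."""
    return (q * theta - target) ** 2 + lam * (q * q + theta * theta)


def alternate(q=1.0, theta=1.0, target=2.0, lam=0.1, max_iters=100, tol=1e-10):
    """Alternating minimization: update q with theta fixed, then theta with q fixed."""
    history = [loss(q, theta, target, lam)]
    for _ in range(max_iters):
        # Step 1: fix theta and minimize over q (set d(loss)/dq = 0).
        q = target * theta / (theta * theta + lam)
        # Step 2: fix q and minimize over theta (set d(loss)/dtheta = 0).
        theta = target * q / (q * q + lam)
        history.append(loss(q, theta, target, lam))
        # Stop when the improvement is negligible (convergence) ...
        if history[-2] - history[-1] < tol:
            break
        # ... otherwise continue until max_iters is reached.
    return q, theta, history


q, theta, history = alternate()
print(len(history) - 1, "iterations, final loss", history[-1])
```

Because each sub-step exactly minimizes the objective over one variable, the recorded loss never increases, which is the property that makes the two stopping conditions named in the text (convergence or a fixed number of iterations) meaningful.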
An embodiment of the present application provides a formula for determining the transaction weight parameter. The transaction weight parameter satisfies:
Figure BDA00031816935400001110
where
Figure BDA0003181693540000121
represents the number of transactions with merchant i made by the C users shared between merchant i and merchant j,
Figure BDA0003181693540000122
represents the number of transactions with merchant j made by the C users shared between merchant i and merchant j, where k indexes the users (k = 1, 2, …, C),
Figure BDA0003181693540000123
represents the weights of the transactions between the C users and merchant i in the graph convolution network model,
Figure BDA0003181693540000124
represents the weights of the transactions between the C users and merchant j in the graph convolution network model;
Figure BDA0003181693540000125
representing the number of transactions between the user and merchant i,
Figure BDA0003181693540000126
representing the number of transactions between the user and merchant j,
Figure BDA0003181693540000127
representing the transaction weight vector between the user and merchant i,
Figure BDA0003181693540000128
representing the transaction weight vector between the user and merchant j. That is, the transaction weight parameter can be determined from the numbers of transactions that the same users have made with each of the two merchants and from the weights of the transactions between those users and the merchants. In other words, a connection between two merchants is established through the behavior information of the multiple users they share, expressed as transaction counts, and a weight is assigned to the transactions between each user and each merchant.
For example, as shown in fig. 4(a) and 4(b), suppose no transaction weight parameter is set, merchants i and j are interconnected only through two transactions with user a, and merchants m and n are interconnected through a total of 12 transactions of users b, c, and d. In this scenario, the values of $A_{ij}$ and $A_{mn}$ in the adjacency matrix of the graph convolution network model are both 1, so the degree of association between merchant nodes lacks discrimination and the identification accuracy for the merchant transaction data to be identified is low. Adding the transaction weight parameter makes it possible to represent each user's behavior information (which merchants the user transacted with and how many transactions occurred), and the transaction weight matrix can be obtained from the above formula:
Figure BDA0003181693540000129
As a result, the user behavior information between merchants is computed more accurately, which improves the accuracy of the identification result for the merchant transaction data to be identified. Here, the weights of the transactions between users and merchants can be given initial values, which may be preset at random by an Xavier function (a widely used neural network initialization method) or a Gaussian function, so that accurate weights are obtained during model training.
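The loss of discrimination in the binary adjacency matrix can be reproduced with a small Python sketch. It does not implement the transaction weight formula of the application (which appears only as a figure); it merely contrasts the 0/1 adjacency entry with a raw count of transactions made by shared users. The per-user split of the 12 transactions, the tuple layout `(user, merchant, count)`, and the function name `merchant_links` are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# (user, merchant, number of transactions); the split per user is hypothetical.
transactions = [
    ("a", "i", 1), ("a", "j", 1),
    ("b", "m", 2), ("b", "n", 2),
    ("c", "m", 3), ("c", "n", 1),
    ("d", "m", 1), ("d", "n", 3),
]


def merchant_links(transactions):
    """Return the binary adjacency and the shared-user transaction count per merchant pair."""
    per_user = defaultdict(dict)
    for user, merchant, count in transactions:
        per_user[user][merchant] = per_user[user].get(merchant, 0) + count

    adjacency = {}
    shared_counts = defaultdict(int)
    for merchants in per_user.values():
        # Every pair of merchants visited by the same user becomes connected.
        for x, y in combinations(sorted(merchants), 2):
            adjacency[(x, y)] = 1
            shared_counts[(x, y)] += merchants[x] + merchants[y]
    return adjacency, dict(shared_counts)


adjacency, shared_counts = merchant_links(transactions)
print(adjacency)       # both pairs look identical in the binary adjacency
print(shared_counts)   # the transaction counts tell the two pairs apart
```

In this sketch both merchant pairs receive the same adjacency value, while the counts differ by a factor of six; a weighted edge of some kind is what lets the model tell such pairs apart.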
Based on the above method, an embodiment of the present application provides a way of determining the formula of the graph convolution network model. Taking transaction flow data and a graph database as an example, for a graph $G(V, E)$ formed by merchant nodes and their related transactions, let the number of all merchant nodes be $N \in \mathbb{Z}^{+}$. The input of each merchant node is the $F$ feature fields selected by the feature screening model, $x_i \in \mathbb{R}^{1 \times F}$ $(i = 1, 2, \ldots, N)$, so the node feature matrix of the whole graph convolution network model can be expressed as $X = [x_1, x_2, \ldots, x_N]$, $X \in \mathbb{R}^{N \times F}$. $A \in \mathbb{R}^{N \times N}$ is the adjacency matrix between merchant nodes; its element $A_{ij}$ represents the connection between any two nodes $V_i$ and $V_j$: $A_{ij} = 1$ if they are connected, and $A_{ij} = 0$ otherwise. The diagonal matrix $D \in \mathbb{R}^{N \times N}$ is the degree matrix, whose diagonal elements
Figure BDA0003181693540000131
represent the degree of each merchant node, that is, the number of nodes connected to that node. The basic expression of the convolution layer in the graph convolution network model is:
Figure BDA0003181693540000132
wherein
Figure BDA0003181693540000133
$I_N$ is the identity matrix of order $N$, a compensation term that adds the self-connection weights of the nodes to the adjacency matrix. $H^{(l)}$ is the hidden feature of layer $l$ of the graph convolution network model, $H^{(0)}$ is the initial node feature matrix $X$, $\Theta$ is the convolution kernel parameter matrix, and an activation function $\sigma$ is applied at the end.
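As an illustration of the basic convolution layer described above, the following minimal Python sketch assumes that the basic expression is the widely used graph convolution propagation rule $H^{(l+1)} = \sigma(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}H^{(l)}\Theta^{(l)})$ with $\tilde{A} = A + I_N$. The original formula image is not reproduced here, so this form, the choice of ReLU as the activation, and the helper names `matmul` and `gcn_layer` are assumptions.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, h, theta):
    """One graph convolution layer: ReLU(D~^-1/2 (A + I) D~^-1/2 H Theta)."""
    n = len(adj)
    # Augmented adjacency: add self-connections via the identity matrix.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Degree of each node in the augmented graph.
    deg = [sum(row) for row in a_tilde]
    # Symmetric normalization of the augmented adjacency.
    norm = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    z = matmul(matmul(norm, h), theta)
    # ReLU activation (an assumed choice for sigma).
    return [[max(0.0, v) for v in row] for row in z]


# Three merchant nodes in a chain, two input features, one output feature.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
theta = [[0.5], [0.25]]
hidden = gcn_layer(adj, features, theta)
print(hidden)
```

Stacking two or three such layers and replacing the final activation with Softmax would give the shallow architecture that the text describes later.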
Formulas (2) and (3) above give the transaction weight $Q_{ij}$ between merchant i and merchant j, and hence the transaction weight matrix $Q$. After the augmented matrix of the transaction weight matrix
Figure BDA0003181693540000134
is introduced into formula (4), the expression of the convolution layer of the graph convolution network model becomes:
Figure BDA0003181693540000135
In the above formula, $\odot$ is the Hadamard product operation, that is, element-wise multiplication of two matrices of the same dimensions. Since the non-zero entries defined in the adjacency matrix $A$ are all 1,
Figure BDA0003181693540000136
is equivalent to
Figure BDA0003181693540000137
In the pre-training of the graph-convolution network model, the number of transactions between the user and the merchant has been determined from an initial data set, i.e.
Figure BDA0003181693540000138
are fixed values. When the network parameters of the graph convolution network model are trained with stochastic gradient descent (SGD), only
Figure BDA0003181693540000139
has a gradient that affects the parameter update, and the formula can thus be expressed as:
Figure BDA0003181693540000141
which gives the final expression of the convolution layer of the graph convolution network model:
Figure BDA0003181693540000142
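The simplification used in this derivation, namely that the Hadamard product of a weight matrix with a 0/1 adjacency matrix leaves the weight matrix unchanged when the weights are non-zero only where nodes are connected, can be checked with a short Python sketch. The matrices `a_tilde` and `q_tilde` below are hypothetical values chosen for illustration and are not taken from the application.

```python
def hadamard(a, b):
    """Element-wise product of two matrices with the same dimensions."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Augmented adjacency matrix: 1 on the diagonal and for connected merchant pairs.
a_tilde = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]

# Hypothetical augmented transaction weights, non-zero only where a_tilde is 1.
q_tilde = [
    [1.0, 0.4, 0.0],
    [0.4, 1.0, 2.5],
    [0.0, 2.5, 1.0],
]

product = hadamard(q_tilde, a_tilde)
print(product == q_tilde)
```

The equality depends on the weight matrix having zeros wherever the adjacency matrix has zeros; a weight between unconnected merchants would be masked out by the product.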
An embodiment of the present application provides a method for training the model for identifying abnormal transaction data, in which the feature screening model and the graph convolution network model are obtained as follows:
acquiring a sample data set, where the sample data set includes merchant transaction data labeled normal, merchant transaction data labeled abnormal, and unlabeled merchant transaction data;
training an initial feature screening model according to the sample data set to obtain the feature screening model;
for any sample in the sample data set, obtaining the feature vector of the sample through the feature screening model; inputting the feature vector of the sample into an initial graph convolution network model for training to obtain the graph convolution network model; the merchant nodes in the initial graph convolution network model are determined according to the sample data set. That is, the initial feature screening model and the initial graph convolution network model are trained on a sample data set that includes merchant transaction data labeled normal, merchant transaction data labeled abnormal, and unlabeled merchant transaction data. Because the two initial models are trained in a semi-supervised manner, the resulting feature screening model and graph convolution network model can learn the internal structure of merchant transaction data and can also adjust their parameters using the labeled merchant transaction data, which improves the accuracy of both models.
An embodiment of the present application provides a method for training the model for identifying abnormal transaction data, in which training the initial feature screening model according to the sample data set includes: inputting the sample data set into the initial feature screening model, determining the fields contained in each merchant transaction data in the sample data set, and determining the importance of each field in each merchant transaction data, where the importance represents how strongly the corresponding field influences whether the transaction data is abnormal merchant transaction data; sorting the fields by importance from largest to smallest, and selecting a preset number of fields from the front of that ordering as the feature fields of each merchant transaction data; determining a feature vector from the feature fields of each merchant transaction data; inputting the feature vector of each merchant transaction data into the initial graph convolution network model to obtain the identification result of each transaction data; and optimizing the initial feature screening model according to the identification result of each transaction data.
For example, suppose the transaction data contains the fields transaction amount, transaction type, transaction time, and transaction mode, and the initial feature screening model yields the following importance values: transaction time 4, transaction mode 3, transaction amount 7, and transaction type 5. Sorted by importance, the values are 7, 5, 4, and 3, so the fields are ordered as transaction amount, transaction type, transaction time, and transaction mode. With a preset number of 3, the first three fields (transaction amount, transaction type, and transaction time) are selected as the feature fields, and the feature vector is built from them. The feature vector is input into the initial graph convolution network model to obtain an identification result, and the initial feature screening model is optimized according to that result, so that the feature vectors it subsequently produces for merchant transaction data are more accurate.
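The selection step in this example can be written as a short Python sketch. The importance values are the ones given in the example; the function name `select_top_fields`, the field identifiers, and the sample record are hypothetical, and in practice the importance values would come from the feature screening model rather than a hand-written dictionary.

```python
def select_top_fields(importance, n):
    """Sort fields by importance (largest first) and keep the first n field names."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Importance values from the example in the text.
importance = {
    "transaction_time": 4,
    "transaction_mode": 3,
    "transaction_amount": 7,
    "transaction_type": 5,
}

selected = select_top_fields(importance, 3)

# Hypothetical, already numerically encoded transaction record.
record = {
    "transaction_time": 13.5,
    "transaction_mode": 2.0,
    "transaction_amount": 980.0,
    "transaction_type": 1.0,
}
feature_vector = [record[name] for name in selected]
print(selected, feature_vector)
```

The feature vector keeps the fields in importance order, so the least important field (transaction mode here) is the one dropped when the preset number is 3.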
Optionally, training an initial graph convolution network model according to the sample data set includes:
inputting the feature vector of each merchant transaction data in the sample data set into the initial graph convolution network model to obtain the identification result of each merchant transaction data, where the initial graph convolution network model uses the feature vector of each merchant transaction data to represent the connecting lines between merchant nodes, and the connecting lines represent user behavior information; and optimizing the convolution kernel parameter matrix and the transaction weight parameters in the initial graph convolution network model according to the identification result of each merchant transaction data. Here, as shown in fig. 4(b), the behavior information of users b, c, and d is represented by the connecting lines between merchant nodes, and the graph convolution network model updates its network parameters using the cross-entropy loss function over the nodes that correspond to labeled merchant transaction data. The model for identifying abnormal merchant transaction data involves a convolution kernel parameter matrix $\Theta$ and two transaction weight matrices
Figure BDA0003181693540000151
and
Figure BDA0003181693540000152
the layer-wise feature propagation formula in formula (5) can be expressed as
Figure BDA0003181693540000153
During training, the convolution kernel weight $\Theta$ is fixed first, and the alternating direction method of multipliers (ADMM) from optimization theory is used to optimize
Figure BDA0003181693540000161
and
Figure BDA0003181693540000162
then fixing the transaction weight
Figure BDA0003181693540000163
and optimizing $\Theta$, alternating until the model converges or the number of iterations reaches a specified limit. The weights of the transactions between users and merchants are initially preset at random by an Xavier or Gaussian function and are adjusted during subsequent training. This avoids the loss of accuracy caused by overfitting when the initial graph convolution network model is trained only on merchant-dimension data. In addition, in the pre-training stage, the input sample data set of the model for identifying abnormal transaction data is graph-structured data accumulated earlier, consisting of manually verified and labeled abnormal merchants, normal merchants, and transaction flows. Unlike a conventional convolutional neural network, a graph convolution network does not need a very deep network structure, nor pooling layers or fully connected layers; after two to three graph convolution layers, the last layer directly outputs the probability distribution vector over node classes using Softmax. Identification efficiency can thus also be improved.
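The output and loss described above (a Softmax over node classes, with the cross-entropy loss computed only on nodes that carry labels) can be sketched in Python as follows. This is a minimal illustration of semi-supervised loss masking under assumed inputs: the function names, the two-class layout, and the use of `None` to mark unlabeled merchant nodes are assumptions, not details from the application.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def masked_cross_entropy(node_logits, labels):
    """Average cross-entropy over labeled nodes; labels[i] is a class index or None."""
    losses = [
        -math.log(softmax(logits)[label])
        for logits, label in zip(node_logits, labels)
        if label is not None
    ]
    if not losses:
        raise ValueError("at least one labeled node is required")
    return sum(losses) / len(losses)


# Final-layer scores for three merchant nodes; classes: 0 = normal, 1 = abnormal.
node_logits = [[2.0, -1.0], [0.0, 0.0], [-0.5, 1.5]]
labels = [0, None, 1]  # the second node is unlabeled and does not contribute

probabilities = [softmax(logits) for logits in node_logits]
loss_value = masked_cross_entropy(node_logits, labels)
print(probabilities, loss_value)
```

Unlabeled nodes still take part in the graph convolution through their edges; they are only excluded from the loss, which is what makes the training semi-supervised.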
Based on the above method flow, an embodiment of the present application provides a flow of a method for identifying an abnormal merchant, as shown in fig. 5, including:
Step 501, obtaining a sample data set.
Step 502, inputting the sample data set into an initial feature screening model, and obtaining the feature vector corresponding to the transaction data of each merchant in the sample data set.
Step 503, inputting each feature vector into an initial graph convolution network model to obtain an identification result.
Step 504, optimizing the initial feature screening model and the initial graph convolution network model according to the identification result to obtain the feature screening model and the graph convolution network model.
Step 505, acquiring the merchant transaction data to be identified in production.
Step 506, cleaning the merchant transaction data to be identified and then inputting it into the feature screening model to obtain a feature vector.
Step 507, inputting the feature vector of the merchant transaction data to be identified into the graph convolution network model to obtain an identification result.
Step 508, updating the feature screening model and the graph convolution network model according to the identification result and the real result.
It should be noted that the above flow is not the only possible one; step 508 may or may not be executed.
Based on the same concept, an embodiment of the present invention provides an apparatus for identifying an abnormal merchant, and fig. 6 is a schematic view of the apparatus for identifying an abnormal merchant provided in the embodiment of the present application, as shown in fig. 6, including:
the acquiring module 601 is configured to acquire merchant transaction data to be identified, where the merchant transaction data to be identified is obtained according to transaction data of merchants occurring within a preset time;
a processing module 602, configured to determine, from the merchant transaction data to be identified, a feature vector containing a preset number of feature fields; the preset number of feature fields is determined according to how important each feature field in the merchant transaction data to be identified is for identifying abnormal merchants;
the processing module 602 is further configured to input the feature vector into a graph convolution network model and determine whether the merchant corresponding to the merchant transaction data to be identified is an abnormal merchant, where the graph convolution network model includes transaction weight parameters, the transaction weight parameters are used to represent the degree of interconnection between merchant nodes in the graph convolution network model, and the transaction weight parameters are determined by the number of transactions and the transaction weights of users between merchants in historical merchant transaction data.
Optionally, the processing module 602 is further configured to verify the identification result and the actual result of the to-be-identified merchant transaction data; and if the identification result is different from the real result, updating the feature screening model and the graph convolution network model according to the transaction data of the merchant to be identified containing the real result.
Optionally, the processing module 602 is specifically configured to determine, through a feature screening model, a feature vector containing a preset number of feature fields from the merchant transaction data to be identified; the feature screening model is used to determine the importance of each feature field in the merchant transaction data to be identified and to output the feature fields whose importance ranks in the top N, where N is the preset number. Optionally, the feature screening model is a random forest algorithm or a distributed gradient boosting library, which is used to obtain each field of the merchant transaction data to be identified and the importance of each field, sort the fields by importance from largest to smallest, and select a preset number of fields from the front of that ordering as the feature fields of the merchant transaction data to be identified.
Optionally, the convolution layer in the graph convolution network model satisfies:
Figure BDA0003181693540000181
where $H^{(l)}$ is the hidden-layer feature of layer $l$ of the graph convolution network model,
Figure BDA0003181693540000182
is the augmented matrix of the degree matrix of the graph convolution network model,
Figure BDA0003181693540000183
is the augmented matrix of the transaction weight matrix composed of the transaction weight parameters,
Figure BDA0003181693540000184
is the augmented matrix of the adjacency matrix between merchant nodes, $\Theta^{(l)}$ is the convolution kernel parameter matrix of layer $l$ of the graph convolution network model, and $\sigma$ is the activation function. Optionally, during training of the graph convolution network model, the convolution kernel parameter matrix and the transaction weight matrix are trained alternately for any sample.
Optionally, the transaction weight parameter satisfies:
Figure BDA0003181693540000185
where
Figure BDA0003181693540000186
represents the number of transactions with merchant i made by the C users shared between merchant i and merchant j,
Figure BDA0003181693540000187
represents the number of transactions with merchant j made by the C users shared between merchant i and merchant j, where k indexes the users (k = 1, 2, …, C),
Figure BDA0003181693540000188
represents the weights of the transactions between the C users and merchant i in the graph convolution network model,
Figure BDA0003181693540000189
represents the weights of the transactions between the C users and merchant j in the graph convolution network model;
Figure BDA00031816935400001810
representing the number of transactions between the user and merchant i,
Figure BDA00031816935400001811
representing the number of transactions between the user and merchant j,
Figure BDA00031816935400001812
representing the transaction weight vector between the user and merchant i,
Figure BDA00031816935400001813
representing the transaction weight vector between the user and merchant j.
Optionally, the obtaining module 601 is specifically configured to,
acquiring a sample data set, where the sample data set includes merchant transaction data labeled normal, merchant transaction data labeled abnormal, and unlabeled merchant transaction data;
training an initial feature screening model according to the sample data set to obtain the feature screening model;
obtaining a feature vector of a sample through the feature screening model according to any sample in the sample data set; inputting the characteristic vector of the sample into an initial graph convolution network model for training to obtain the graph convolution network model; and the merchant node in the initial graph convolution network model is determined according to the sample data set.
Optionally, the processing module 602 is further configured to update the feature screening model and the graph convolution network model.
Optionally, the processing module 602 is specifically configured to input the sample data set into the initial feature screening model, determine the fields contained in each transaction data in the sample data set, and determine the importance of each field in each transaction data, where the importance represents how strongly the corresponding field influences whether the transaction data is abnormal transaction data; sort the fields by importance from largest to smallest, and select a preset number of fields from the front of that ordering as the feature fields of each transaction data; determine a feature vector from the feature fields of each transaction data; input the feature vector of each transaction data into the initial graph convolution network model to obtain the identification result of each transaction data; and optimize the initial feature screening model according to the identification result of each transaction data.
Optionally, the processing module 602 is specifically configured to input a feature vector of each transaction data in the sample data set into the initial graph-convolution network model, and obtain an identification result of each transaction data, where the initial graph-convolution network model represents a connection line between a merchant node and a merchant node by using the feature vector of each transaction data, and the connection line represents user behavior information; and optimizing a convolution kernel parameter matrix and transaction weight parameters in the initial graph convolution network model according to the identification result of each transaction datum.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A method of identifying an anomalous merchant, the method comprising:
acquiring merchant transaction data to be identified, wherein the merchant transaction data to be identified is obtained from transaction data of a merchant within a preset time;
determining, from the merchant transaction data to be identified, a feature vector containing a preset number of feature fields; the preset number of feature fields are determined according to the importance of each feature field in the merchant transaction data to be identified for identifying abnormal merchants;
inputting the feature vector into a graph convolution network model, and determining whether the merchant corresponding to the merchant transaction data to be identified is an abnormal merchant, wherein the graph convolution network model comprises transaction weight parameters, the transaction weight parameters represent the degree of interconnection between merchant nodes in the graph convolution network model, and the transaction weight parameters are determined from the transaction counts and transaction weights of users across merchants in historical merchant transaction data.
2. The method of claim 1, wherein determining a feature vector containing a preset number of feature fields from the merchant transaction data to be identified comprises:
determining, through a feature screening model, a feature vector containing a preset number of feature fields from the merchant transaction data to be identified; the feature screening model is used for determining the importance of each feature field in the merchant transaction data to be identified and outputting the feature fields of the merchant transaction data to be identified whose importance ranks in the top N; and N is the preset number.
3. The method of claim 1, wherein the convolution layers in the graph convolution network model satisfy:
Figure FDA0003181693530000011
wherein $H^{(l)}$ is the $l$-th hidden-layer feature of the graph convolution network model; the symbol shown in Figure FDA0003181693530000012 is the augmented matrix of the degree matrix of the graph convolution network model; the symbol shown in Figure FDA0003181693530000013 is the augmented matrix of the transaction weight matrix composed of the transaction weight parameters; the symbol shown in Figure FDA0003181693530000014 is the augmented matrix of the adjacency matrix between the merchant nodes; $\theta^{(l)}$ is the parameter matrix of the convolution kernel of the $l$-th layer of the graph convolution network model; and $\sigma$ is the activation function.
4. The method of claim 3, wherein, for any sample, the graph convolution network model is trained by alternately training the convolution kernel parameter matrix and the transaction weight matrix.
5. The method of claim 1, wherein the transaction weight parameter satisfies:
Figure FDA0003181693530000021
wherein $\alpha_k^{(i)}$ represents the number of transactions that user $k$ has made with merchant $i$, and $\alpha_k^{(j)}$ represents the number of transactions that user $k$ has made with merchant $j$ ($k = 1, 2, \ldots, C$); $\beta_k^{(i)}$ represents the weight value of the transactions between user $k$ and merchant $i$ in the graph convolution network model, and $\beta_k^{(j)}$ represents the weight value of the transactions between user $k$ and merchant $j$ in the graph convolution network model; $U_{ij}^{(i)}$ represents the number of transactions between the user and merchant $i$, and $U_{ij}^{(j)}$ represents the number of transactions between the user and merchant $j$; $V_{ij}^{(i)}$ represents the transaction weight vector between the user and merchant $i$, and $V_{ij}^{(j)}$ represents the transaction weight vector between the user and merchant $j$.
6. The method of claim 2, wherein the feature screening model and the graph convolution network model are obtained by:
acquiring a sample data set, wherein the sample data set comprises merchant transaction data labeled normal, merchant transaction data labeled abnormal, and unlabeled merchant transaction data;
training an initial feature screening model on the sample data set to obtain the feature screening model;
for any sample in the sample data set, obtaining a feature vector of the sample through the feature screening model; inputting the feature vector of the sample into an initial graph convolution network model for training to obtain the graph convolution network model; the merchant nodes in the initial graph convolution network model are determined according to the sample data set;
initial feature screening model the initial graph convolution network model.
7. The method as recited in claim 1, further comprising:
and updating the feature screening model and the graph convolution network model.
8. An apparatus for identifying anomalous merchants, the apparatus comprising:
an acquisition module, configured to acquire merchant transaction data to be identified, wherein the merchant transaction data to be identified is obtained from transaction data of a merchant within a preset time;
a processing module, configured to determine, from the merchant transaction data to be identified, a feature vector containing a preset number of feature fields; the preset number of feature fields are determined according to the importance of each feature field in the merchant transaction data to be identified for identifying abnormal merchants;
the processing module being further configured to input the feature vector into a graph convolution network model and determine whether the merchant is an abnormal merchant, wherein the graph convolution network model comprises transaction weight parameters, the transaction weight parameters represent the degree of interconnection between merchant nodes in the graph convolution network model, and the transaction weight parameters are determined from the transaction counts and transaction weights of users across merchants in historical merchant transaction data.
9. A computer-readable storage medium, characterized in that it stores a program which, when run on a computer, causes the computer to carry out the method of any one of claims 1 to 7.
10. A computer device, comprising:
a memory for storing a computer program;
a processor for calling a computer program stored in said memory to execute the method of any of claims 1 to 7 in accordance with the obtained program.
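For illustration only, the graph convolution layer described in claim 3 might be sketched as follows. The claim's formula is available only as an image, so the way the matrices are combined here is an assumption, not the claimed formula: the augmented matrices are taken as "matrix plus identity", the augmented transaction weight matrix is applied element-wise to the augmented adjacency matrix, the degree matrix is computed from the augmented adjacency matrix, and symmetric normalization in the style of a standard graph convolution network is used. The names `augment`, `matmul`, `relu`, and `gcn_layer` are hypothetical.

```python
import math


def augment(matrix):
    """Return matrix + I (assumed meaning of 'augmented matrix': add self-loops)."""
    n = len(matrix)
    return [[matrix[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def relu(x):
    return max(0.0, x)


def gcn_layer(adjacency, tx_weight, hidden, theta, activation=relu):
    """One propagation step: activation(D^-1/2 (W~ * A~) D^-1/2 H theta).

    ASSUMPTION: W~ is combined with A~ element-wise and the degree matrix is
    derived from A~; the patent's exact formula is not reproduced in the text.
    """
    n = len(adjacency)
    a_tilde = augment(adjacency)
    w_tilde = augment(tx_weight)
    # Scale each edge (and self-loop) by its transaction weight.
    weighted = [[a_tilde[i][j] * w_tilde[i][j] for j in range(n)]
                for i in range(n)]
    # Node degrees of the augmented adjacency matrix.
    degree = [sum(row) for row in a_tilde]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    normalized = [[inv_sqrt[i] * weighted[i][j] * inv_sqrt[j] for j in range(n)]
                  for i in range(n)]
    z = matmul(matmul(normalized, hidden), theta)
    return [[activation(v) for v in row] for row in z]


# Toy graph: merchants 0 and 1 share users (transaction weight 0.5),
# merchant 2 is isolated. All numbers are made up for illustration.
adjacency = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
tx_weight = [[0.0, 0.5, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, 0.0]]
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 2.0]]   # per-merchant feature vectors
theta = [[1.0], [0.5]]                           # layer parameter matrix
out = gcn_layer(adjacency, tx_weight, hidden, theta)
```

In this sketch a larger transaction weight between two merchants lets more of one merchant's features flow into the other's hidden representation, which is one way to read the claim's statement that the transaction weight parameters represent the degree of interconnection between merchant nodes.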
CN202110849076.1A 2021-07-27 2021-07-27 Method and device for identifying abnormal commercial tenant Pending CN113554099A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110849076.1A CN113554099A (en) 2021-07-27 2021-07-27 Method and device for identifying abnormal commercial tenant

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110849076.1A CN113554099A (en) 2021-07-27 2021-07-27 Method and device for identifying abnormal commercial tenant

Publications (1)

Publication Number Publication Date
CN113554099A true CN113554099A (en) 2021-10-26

Family

ID=78132911

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110849076.1A Pending CN113554099A (en) 2021-07-27 2021-07-27 Method and device for identifying abnormal commercial tenant

Country Status (1)

Country Link
CN (1) CN113554099A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114757304A (en) * 2022-06-10 2022-07-15 北京芯盾时代科技有限公司 Data identification method, device, equipment and storage medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108921566A (en) * 2018-05-03 2018-11-30 阿里巴巴集团控股有限公司 A kind of wash sale recognition methods and device based on graph structure model
CN109299954A (en) * 2018-08-22 2019-02-01 ***股份有限公司 A kind of recognition methods of violation trade company and device
CN109961296A (en) * 2017-12-25 2019-07-02 腾讯科技(深圳)有限公司 Merchant type recognition methods and device
CN110334130A (en) * 2019-07-09 2019-10-15 北京万维星辰科技有限公司 A kind of method for detecting abnormality of transaction data, medium, device and calculate equipment
CN110473083A (en) * 2019-07-08 2019-11-19 阿里巴巴集团控股有限公司 Tree-shaped adventure account recognition methods, device, server and storage medium
CN110852755A (en) * 2019-11-06 2020-02-28 支付宝(杭州)信息技术有限公司 User identity identification method and device for transaction scene
CN111292195A (en) * 2020-02-28 2020-06-16 中国工商银行股份有限公司 Risk account identification method and device
CN111882446A (en) * 2020-07-28 2020-11-03 哈尔滨工业大学(威海) Abnormal account detection method based on graph convolution network
CN112037038A (en) * 2020-09-02 2020-12-04 中国银行股份有限公司 Bank credit risk prediction method and device
CN112966728A (en) * 2021-02-26 2021-06-15 ***股份有限公司 Transaction monitoring method and device
US20210182859A1 (en) * 2019-12-17 2021-06-17 Accenture Global Solutions Limited System And Method For Modifying An Existing Anti-Money Laundering Rule By Reducing False Alerts
CN113011979A (en) * 2021-03-29 2021-06-22 ***股份有限公司 Transaction detection method, training method and device of model and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination