CN102622552A - Detection method and detection system for fraud access to business to business (B2B) platform based on data mining

Publication number
CN102622552A
Authority
CN
China
Prior art keywords
information
data
client
swindle
early warning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012101056128A
Other languages
Chinese (zh)
Inventor
魏宝军
佘华
蒋巧娜
黄建鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Focus Technology Co Ltd
Original Assignee
Focus Technology Co Ltd

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a system, based on data mining, for detecting fraudulent access to a business-to-business (B2B) platform. The detection method divides client information into static information and dynamic information, examines the static information with association analysis, examines the dynamic information with a logistic regression classification model, and combines the results of the two data mining methods into an early-warning score. Clients whose early-warning score exceeds a threshold are graded; clients that fall into a specific grade are judged to be fraudulent visitors and are added to a blacklist database of fraud clients. The detection system comprises a client information processor, a fraud analysis processor and a front-end display processor. Drawing on the characteristics of B2B e-commerce platforms and on multi-dimensional data such as client information and client access behavior, the method and system model fraudulent access with data mining techniques, and thereby address the industry problem that fraudulent access is hard to detect when transaction behavior cannot be monitored.

Description

A data-mining-based method and system for detecting fraudulent access to a B2B platform
Technical field
The present invention relates generally to the field of business-to-business (B2B) e-commerce, and in particular to a data-mining-based method and system for detecting fraudulent access to a B2B platform.
Background art
With the development of the market economy, market fraud has become a major problem that most industries must face, and it is especially prominent in banking, insurance, securities, telecommunications, manufacturing and e-commerce. In each of these industries, attempts have been made to model fraud, with some success.
In the insurance industry, insurance fraud occurs repeatedly. Representative literature includes "Game model analysis of insurance fraud with the reinsurer sharing the claim settlement cost", published in 2006 in "Operations Research and Management Science". Taking claim fraud that exaggerates the insured loss as its example, the paper builds an insurance fraud game model in which the reinsurer shares the cost of claim investigation, and uses it to analyze and explain this ex post moral hazard in the claim settlement process. According to the paper's conclusions, the game model has a perfect Bayesian Nash equilibrium of a certain form; when the settlement cost borne by the insurer varies within a certain range, the model has a pooling equilibrium of a certain form, and the players may alternate between separating and pooling strategies. Models in the insurance field, however, depend on insurance financial data such as loss costs, settlement costs and premiums, and the requirements on that data are strict; without it, the quality of the model cannot be guaranteed.
For telecommunications clients who fraudulently default on their bills, domestic scholars have built effective fraud models on the basis of data mining. Published work includes "Research and application of a telecom fee fraud model based on Bayesian networks", published in 2008 in "Computer Applications". This technique combines the experience of telecom business staff with sample data and proposes a Bayesian-network-based method for modeling telecom fee fraud. Experiments show that fee fraud prediction based on online analytical processing (OLAP) and a Bayesian network performs relatively well and is an effective tool for client fraud analysis. However, building such a model depends on client financial data such as the user's charge type, billing cycle and call charges; without this data, a suitable model cannot be built.
The literature above shows that existing fraud models are built mainly on clients' transactions and fund flows, and cannot be built without accurate financial and transaction data.
B2B e-commerce platforms fall into two main types:
(1) B2B e-commerce platforms centered on information services, which have no online transaction function.
(2) B2B e-commerce platforms centered on online transaction services, which have an online transaction function.
The latter is a new model that has emerged only in the last two years. It deals mainly in small-value wholesale and retail, with a relatively high purchase frequency, and the transactions concluded on such platforms are comparatively small.
At present, however, most transactions concluded over the internet still rely on information-service-centered B2B e-commerce platforms. Such a platform gathers suppliers' products, faces purchasers, provides information exchange and trade services, and creates business opportunities for both suppliers and purchasers; it embodies a business-to-business marketing relationship. Compared with the industries discussed above, the biggest difference is that an information-service-centered platform cannot monitor clients' transaction behavior. There is therefore no record of actual transactions, and important information such as transaction quantities, transaction amounts and financial data cannot be obtained from the transaction side, which makes fraud analysis and identification very challenging.
Yet fraud is rampant in information-service-centered B2B e-commerce. Both suppliers and purchasers may commit fraud, and the transactions this kind of e-commerce facilitates tend to involve large quantities and large amounts, so once fraud occurs, the loss to users is far greater than in industries that keep detailed transaction records. It is therefore particularly important to build a fraud detection system suited to the characteristics of information-service-centered B2B e-commerce platforms.
Some information-service-centered B2B e-commerce providers have of course begun to study fraud. Their models rely mainly on simple fraud indicators, such as a mismatch between an account's registered address and the location of the IP address it uses, or an overlap between company information and that of blacklisted clients of the platform. Such detection methods, however, cover a narrow range, and their results are therefore unstable.
How to fully exploit the characteristics of information-service-centered B2B e-commerce platforms to obtain enough usable information for effective detection of fraudulent access is therefore a problem that urgently needs solving.
Summary of the invention
The purpose of this invention is to provide a data-mining-based method and system for detecting fraudulent access to a B2B platform, so as to solve the problem that, on an information-service-centered B2B e-commerce platform lacking transaction data and related financial data, it is difficult to detect fraudulent access by suppliers and purchasers effectively.
To solve the above problem, the present invention adopts the following technical solution:
A data-mining-based method for detecting fraudulent access to a B2B platform divides client information into static information and dynamic information. Static information is examined with association analysis, and dynamic information is examined with a logistic regression classification model. The results of the two data mining methods are combined into an early-warning score; clients whose score exceeds a threshold are graded, and clients assigned to a specific grade are judged to be fraudulent visitors and added to the fraud client blacklist database. A client's static information comprises the client account, enterprise name, enterprise location, telephone number, email address, product keywords and related complaint information; dynamic information comprises the client's click behavior, search behavior, mail sending and receiving behavior, and information posting behavior. The early-warning score is obtained as follows: the results of the two data mining methods are each assigned a different weight and combined to give the final early-warning score. The fraud client blacklist is continuously updated with the results of the detection process.
A data-mining-based system for detecting fraudulent access to a B2B platform comprises a client information processor, a fraud analysis processor and a front-end display processor. The client information processor collects clients' profile information and website behavior and stores them in the corresponding storage unit for extraction by the fraud analysis processor. The fraud analysis processor extracts the client information and website behavior data, combines the data mining results into an early-warning score, and grades clients according to that score. The front-end display processor deploys the information of clients whose early-warning score exceeds the threshold to the service provider's internal information system, where the relevant departments can consult it. The fraud analysis processor further comprises a data extraction unit, an association analysis unit, a logistic regression unit, a comprehensive calculation unit and a data transmission unit. The association analysis unit obtains client information through the data extraction unit and performs association analysis across multiple dimensions. The logistic regression unit uses the logistic regression classification model obtained by training to examine clients' access behavior. The comprehensive calculation unit assigns different weights to the association analysis result for static information and the logistic regression result for dynamic information, and combines them into the final early-warning score.
Client information is divided into static information and dynamic information. Static information is the client's profile information in each dimension, including the client account, enterprise name, enterprise location, telephone number, email address, product keywords, related complaint information and so on. Dynamic information is mainly the client's access behavior data, including click behavior, search behavior, mail sending and receiving behavior, information posting behavior and so on.
Static information is examined with association analysis in the following steps:
Extraction of static information. Read information such as each client's login IP, telephone number, email address and complaints to prepare the base data for association mining.
Data cleaning and preprocessing. Preprocess the data and normalize it into a uniform format to facilitate applying the association analysis model.
Setting the support and confidence thresholds. Set the support and confidence thresholds for each dimension according to the actual characteristics of that dimension's data. Confidence measures the accuracy of an association rule, that is, its strength. Support measures the importance of an association rule: it reflects how common the association is and therefore how representative the rule is.
Association detection. On a B2B e-commerce platform, client information and client behavior data are massive. Association analysis is applied across multiple dimensions to discover abnormal association patterns.
Result comparison. Compare the association mining results with the support and confidence thresholds, and filter out and save the results that exceed the thresholds.
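The support and confidence measures used in the steps above can be illustrated with a minimal sketch. It assumes each account is represented as a set of "dimension=value" items; the sample records, the item naming and the threshold values are hypothetical and are not taken from the patent.

```python
from typing import FrozenSet, List

def support(transactions: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions: List[FrozenSet[str]],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    return support(transactions, antecedent | consequent) / base

# Hypothetical account records: one set of "dimension=value" items per account.
accounts = [
    frozenset({"ip=1.2.3.4", "phone=555-0100", "flag=blacklisted"}),
    frozenset({"ip=1.2.3.4", "phone=555-0100", "flag=blacklisted"}),
    frozenset({"ip=1.2.3.4", "phone=555-0199"}),
    frozenset({"ip=5.6.7.8", "phone=555-0123"}),
]

# Assumed thresholds; the patent sets them per dimension but gives no values.
MIN_SUPPORT, MIN_CONFIDENCE = 0.3, 0.6

rule_lhs = frozenset({"ip=1.2.3.4"})
rule_rhs = frozenset({"flag=blacklisted"})
s = support(accounts, rule_lhs | rule_rhs)
c = confidence(accounts, rule_lhs, rule_rhs)
keep = s > MIN_SUPPORT and c > MIN_CONFIDENCE  # result comparison step
```

Only rules for which `keep` is true would be saved for the later combined calculation.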
Dynamic information is examined with a logistic regression classification model in the following steps:
Extraction of dynamic information. Read data such as clients' searches, clicks, mail sending and receiving, and information posting to prepare the base data for data mining.
Data cleaning and preprocessing. Preprocess the data, including cleaning, filling missing values and removing outliers.
Construction of user behavior and user information indicators. Using the extracted dynamic information and the user behavior and user information indicators designed in advance, instantiate the candidate variables needed for modeling.
Anomaly detection with the classification model. Apply the logistic regression classification model to clients' behavior and information to uncover anomalies in the website behavior of fraudulent clients.
Result comparison. After the logistic regression classification model produces a fraud suspicion degree, compare it with the threshold, and filter out and save the results that exceed the threshold. The fraud suspicion degree is the value computed by the logistic regression classification model, that is, the client's fraud probability, and is used to judge how likely the client is to commit fraud.
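As a rough sketch of the last two steps, the fraud suspicion degree can be computed as the logistic function of a weighted sum of behavior indicators and then compared with a threshold. The indicator names, weights, bias and threshold below are hypothetical placeholders; in practice they would come from a trained model.

```python
import math
from typing import Dict, List, Tuple

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def suspicion_degree(indicators: Dict[str, float],
                     weights: Dict[str, float],
                     bias: float) -> float:
    """Fraud probability from a linear combination of behavior indicators."""
    z = bias + sum(w * indicators.get(name, 0.0) for name, w in weights.items())
    return sigmoid(z)

def flag_clients(clients: Dict[str, Dict[str, float]],
                 weights: Dict[str, float],
                 bias: float,
                 threshold: float) -> List[Tuple[str, float]]:
    """Keep only the clients whose suspicion degree exceeds the threshold."""
    scored = ((cid, suspicion_degree(ind, weights, bias)) for cid, ind in clients.items())
    return [(cid, p) for cid, p in scored if p > threshold]

# Hypothetical coefficients and behavior indicators (not from the patent).
WEIGHTS = {"mails_sent_per_day": 0.08, "searches_per_day": 0.01, "posts_per_day": 0.05}
BIAS = -4.0
clients = {
    "acct_a": {"mails_sent_per_day": 60, "searches_per_day": 20, "posts_per_day": 10},
    "acct_b": {"mails_sent_per_day": 2, "searches_per_day": 15, "posts_per_day": 1},
}
flagged = flag_clients(clients, WEIGHTS, BIAS, threshold=0.5)
```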
The association analysis result for static information and the logistic regression result for dynamic information are each assigned a different weight and combined into an early-warning score. Clients whose early-warning score exceeds the threshold are graded, and clients assigned to a specific grade are judged to be fraudulent visitors and added to the fraud client blacklist database.
The fraud client blacklist database is continuously updated with the results of the detection process.
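The patent does not give the formula used to combine the two results. One plausible reading, shown here purely as an assumption, is a linear weighted sum of two scores that have each been normalized to the range 0 to 1; the weights and the threshold are illustrative values.

```python
def early_warning_score(assoc_score: float,
                        regression_score: float,
                        w_assoc: float = 0.4,
                        w_regression: float = 0.6) -> float:
    """Weighted sum of the two mining results (assumed to lie in [0, 1])."""
    if not (0.0 <= assoc_score <= 1.0 and 0.0 <= regression_score <= 1.0):
        raise ValueError("both input scores are assumed to be normalized to [0, 1]")
    return w_assoc * assoc_score + w_regression * regression_score

WARNING_THRESHOLD = 0.5  # assumed value

# Example: strong association with a blacklisted account, moderate behavior score.
score = early_warning_score(assoc_score=1.0, regression_score=0.5)
needs_grading = score > WARNING_THRESHOLD
```

Keeping the weights as parameters reflects the text's statement that the two results receive different weights, without committing to particular values.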
A data-mining-based system for detecting fraudulent access to a B2B platform comprises a client information processor, a fraud analysis processor and a front-end display processor.
The client information processor collects clients' profile information and website behavior and stores them in the corresponding storage unit for extraction by the fraud analysis processor.
The fraud analysis processor applies association analysis and the logistic regression classification model to the extracted data and performs a comprehensive fraud calculation; the fraud judgment is based on the combined result of the two methods. It grades clients according to the early-warning score, assists the relevant departments in identifying and preventing fraud, and updates the fraud client blacklist database at the same time. The fraud analysis processor further comprises an association analysis unit, a logistic regression unit and a comprehensive calculation unit.
The association analysis unit obtains client information through the data extraction unit and combines it with the information of clients confirmed as fraudulent in the fraud client blacklist database, associating the two sets of client information across multiple dimensions.
The logistic regression unit selects a batch of client website access behavior data as training samples for the logistic regression classification model; the data in these different dimensions represent the client's access behavior indicators. After training, the resulting logistic regression classification model is used to determine the fraud suspicion degree of a client's website access behavior.
The comprehensive calculation unit assigns different weights to the association analysis result for static information and the logistic regression result for dynamic information, and combines them into the early-warning score.
The front-end display processor takes the client information graded by the fraud analysis processor, integrates it by module and grade, and deploys it in the service provider's internal information system, where the relevant departments can consult it.
The beneficial effects of the present invention are as follows:
(1) Starting from the characteristics of B2B e-commerce platforms, the invention builds on multi-dimensional data such as client information and client access behavior and introduces data mining techniques to model and detect fraudulent access to the platform, addressing the industry problem that fraudulent access is hard to detect because transaction behavior cannot be monitored.
(2) The invention uses client association analysis to catch fraudulent visitors that are hard to find by manual review, and uses the logistic regression classification model to give early warning of abnormal behavior that is hard to identify manually, improving the efficiency of screening for fraudulent clients.
(3) Using multi-dimensional data such as client information and client access behavior, the invention can detect fraudulent access not only by suppliers but also by purchasers on a B2B e-commerce platform, making early-warning detection more comprehensive.
(4) The invention uses not only static information such as client profile data but also dynamic information such as client access behavior data, mining and analyzing access behavior across multiple dimensions; detection is considerably more effective than detection based on partial client profile data alone.
(5) The invention continuously updates the fraud client blacklist database with the early-warning detection results, so that the system's detection efficiency and early-warning effectiveness keep improving.
Description of drawings
Fig. 1 is the system structure diagram of the embodiment of the invention.
Fig. 2 is a schematic structural diagram of the fraud analysis processor in the embodiment of the invention.
Fig. 3 is the association analysis flow chart of the embodiment of the invention.
Fig. 4 is the logistic regression flow chart of the embodiment of the invention.
Fig. 5 is the early-warning score processing flow chart of the embodiment of the invention.
Fig. 6 is the structure diagram of the early-warning detection result display in the embodiment of the invention.
Embodiment
In the present invention, based on the application characteristics of information-service-centered B2B e-commerce platforms, user information data and network behavior data from multiple dimensions are fully exploited, and data mining techniques are introduced to model and analyze clients' access behavior in order to detect fraudulent visitors.
Referring to Fig. 1, the system architecture of the embodiment comprises a supplier client 11, a purchaser client 12, a B2B e-commerce server 13, a fraudulent access early-warning detection system server 14, an internal information system server 15 and a B2B e-commerce platform operator client 16.
The supplier client 11 is used by suppliers to access the B2B e-commerce platform and to register, browse, log in, click, search, send and receive mail, and so on.
The purchaser client 12 is used by purchasers to access the B2B e-commerce platform and to register, browse, log in, click, search, send price inquiries, post market information, and so on.
The B2B e-commerce server 13 publishes suppliers' product and company information and purchasers' market information on the internet, increasing their online exposure and the opportunities for suppliers to reach target purchasers, and thereby facilitating transactions.
The fraudulent access early-warning detection system server 14 monitors the access behavior of suppliers and purchasers and flags clients that exhibit fraudulent access behavior; its subjects are registered suppliers and purchasers.
The fraudulent access early-warning detection system 14 comprises a client information processor 141, a fraud analysis processor 142 and a front-end display processor 143.
The client information processor 141 collects clients' profile information and website access behavior data and stores them in the corresponding storage unit for extraction by the fraud analysis processor 142.
The fraud analysis processor 142 extracts the client information and client website access behavior data, applies association analysis and the logistic regression classification model, and performs a comprehensive fraud calculation; the fraud judgment combines the two data mining methods. It grades clients by their early-warning score so that the relevant departments can better identify and give early warning of fraud.
The front-end display processor 143 takes the fraud client information from the fraud analysis processor 142, integrates it by module and grade, and deploys it on the internal information system server 15, where the relevant departments can consult it.
The internal information system server 15 receives the output of the fraudulent access early-warning detection system and presents it in the internal information system in an easily understood form.
The B2B e-commerce platform operator client 16 is used by platform operators to query the output of the early-warning detection system.
Referring to Fig. 2, the fraud analysis processor of the embodiment comprises a data extraction unit 21, an association analysis unit 22, a logistic regression unit 23, a comprehensive calculation unit 24 and a data transmission unit 25.
The data extraction unit 21 extracts the client information and client website access behavior data organized in the storage unit of the client information processor 141, as the data basis for data mining.
The association analysis unit 22 obtains client information through the data extraction unit 21, including the client's ID, name, telephone number, email address, registration IP and login IP. It then combines this with the information of clients confirmed as fraudulent in the fraud client blacklist database and associates the two sets of client information across dimensions such as name, telephone number, email address, registration IP and login IP. The support and confidence thresholds differ between dimensions and between client types. If a client meets the threshold condition, the result is saved for the comprehensive calculation; otherwise the client is passed directly.
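To make the idea of associating a client with blacklist entries across dimensions concrete, here is a deliberately simplified sketch. It uses plain exact matching per dimension rather than the association mining described in the text, and the field names, the `min_shared` cutoff (a stand-in for the support and confidence thresholds) and the sample records are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Dimensions named in the text; the dictionary keys are hypothetical field names.
DIMENSIONS = ("name", "phone", "email", "register_ip", "login_ip")

def shared_dimensions(client: Dict[str, str], blacklisted: Dict[str, str]) -> Set[str]:
    """Dimensions on which the client has the same non-empty value as a blacklist entry."""
    return {d for d in DIMENSIONS
            if client.get(d) and client.get(d) == blacklisted.get(d)}

def associated_with_blacklist(client: Dict[str, str],
                              blacklist: List[Dict[str, str]],
                              min_shared: int = 2) -> List[Tuple[str, Set[str]]]:
    """Blacklist entries that share at least `min_shared` dimensions with the client."""
    hits = []
    for entry in blacklist:
        dims = shared_dimensions(client, entry)
        if len(dims) >= min_shared:
            hits.append((entry["id"], dims))
    return hits

blacklist = [{
    "id": "bad-1", "name": "Acme Trading", "phone": "555-0100",
    "email": "x@example.com", "register_ip": "203.0.113.7", "login_ip": "203.0.113.9",
}]
client = {
    "id": "new-42", "name": "Other Co", "phone": "555-0100",
    "email": "y@example.com", "register_ip": "198.51.100.1", "login_ip": "203.0.113.9",
}
hits = associated_with_blacklist(client, blacklist)
```

A client with no hits would be passed directly, mirroring the last sentence of the paragraph above.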
The logistic regression unit 23 first selects a batch of client website access behavior data as training samples for the logistic regression classification model, covering behavior in dimensions such as logging in, posting information, sending and receiving mail, clicking and searching; the data in these dimensions represent the client's access behavior indicators. After training, the resulting model is used to determine the fraud suspicion degree of a client's website access behavior. If the suspicion degree exceeds the threshold, the result is saved for the comprehensive calculation; otherwise the client is passed directly.
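The training step can be sketched with plain batch gradient descent on the log loss. This is a minimal illustration, not the patent's training procedure, which is not specified; the two scaled behavior indicators, the labels, the learning rate and the epoch count are made-up values.

```python
import math
from typing import List, Tuple

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w: List[float], b: float, x: List[float]) -> float:
    """Fraud suspicion degree for one client's indicator vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def train_logistic(X: List[List[float]], y: List[int],
                   lr: float = 0.5, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    m = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi  # derivative of log loss w.r.t. z
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / m for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b

# Hypothetical training sample: [scaled mail volume, scaled posting volume];
# label 1 marks accounts already confirmed as fraudulent.
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y = [0, 0, 1, 1]
w, b = train_logistic(X, y)
high = predict(w, b, [0.9, 0.9])
low = predict(w, b, [0.1, 0.1])
```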
The comprehensive calculation unit 24 assigns different weights to the association analysis result for static information and the logistic regression result for dynamic information and combines them into the early-warning score. Clients whose early-warning score exceeds the threshold are graded as "very serious", "serious", "suspicious", "concern" or "general". Clients graded "very serious" are judged to be fraudulent visitors and added to the fraud client blacklist database.
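The grading and blacklist update can be sketched as a lookup against descending cutoffs. The five grade names come from the text; the numeric cutoffs, and the choice of the lowest cutoff as the warning threshold, are assumptions made only for this example.

```python
from typing import Optional, Set

# Grade names from the text; cutoff values are assumed for illustration.
GRADE_CUTOFFS = [
    (0.9, "very serious"),
    (0.8, "serious"),
    (0.7, "suspicious"),
    (0.6, "concern"),
    (0.5, "general"),  # lowest cutoff doubles as the warning threshold
]

def grade(score: float) -> Optional[str]:
    """Return the grade for a score, or None if it does not exceed the threshold."""
    for cutoff, label in GRADE_CUTOFFS:
        if score > cutoff:
            return label
    return None

def update_blacklist(blacklist: Set[str], client_id: str, score: float) -> Optional[str]:
    """Grade the client and add it to the blacklist when graded 'very serious'."""
    label = grade(score)
    if label == "very serious":
        blacklist.add(client_id)
    return label

blacklist: Set[str] = set()
labels = {cid: update_blacklist(blacklist, cid, s)
          for cid, s in [("acct_a", 0.95), ("acct_b", 0.65), ("acct_c", 0.30)]}
```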
The data transmission unit 25 sends the graded client data to the front-end display processor 143.
Referring to Fig. 3, the association analysis flow of the embodiment is as follows:
Step 31: The data extraction unit 21 extracts the client information organized in the storage unit of the client information processor 141, mainly each client's login IP, telephone number, email address and similar information, to prepare the base data for association analysis.
Step 32: Clean, preprocess and normalize the client information to facilitate association analysis.
Step 33: Apply association analysis across multiple dimensions to discover abnormal association patterns. By drawing on the fraud client information in the fraud client blacklist database, this step can detect not only visitors that correspond to entries in the blacklist database, but also visitors that are not in the blacklist database yet whose login accounts are associated with one another, a situation that also raises the likelihood of fraudulent access.
In this example, association mining uses the FP_TREE algorithm, as follows:
Scan the transaction database D once to obtain the frequent items (frequent 1-itemsets) in D and their support counts, and sort the frequent items in descending order of support count to obtain the list L.
Generate the FP_TREE. First create the root node T of the tree, labeled "null". Scan the transaction database D a second time and process the items in each transaction in the order of L (that is, in descending order of support count). Let the sorted list of frequent items be [p|P], where p is the first frequent item and P is the remaining frequent items. Call insert_tree([p|P], T).
The root node T has multiple child nodes N. If P is non-empty, insert_tree(P, N) is called recursively, building an FP-tree from the itemsets while preserving the association information among them. Finally, the data is scanned once more and the tree is mined in order from the bottom up; removing child nodes from the FP-tree yields the required frequent patterns.
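The two scans and the recursive insertion described above can be sketched as follows. Only tree construction is shown; the bottom-up mining of frequent patterns is omitted for brevity. `insert_tree` follows the name used in the text, while `FPNode`, `build_fp_tree`, `min_count` and the sample transactions are hypothetical names and data chosen for this example.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

class FPNode:
    """One node of the FP-tree: an item, its count, and its children."""
    def __init__(self, item: Optional[str], parent: Optional["FPNode"] = None):
        self.item = item          # None marks the "null" root
        self.count = 0
        self.parent = parent
        self.children: Dict[str, "FPNode"] = {}

def insert_tree(items: List[str], node: FPNode) -> None:
    """Insert the sorted item list [p|P] under `node`, recursing on P."""
    if not items:
        return
    p, rest = items[0], items[1:]
    child = node.children.get(p)
    if child is None:
        child = FPNode(p, parent=node)
        node.children[p] = child
    child.count += 1
    insert_tree(rest, child)

def build_fp_tree(transactions: List[List[str]],
                  min_count: int) -> Tuple[FPNode, List[str]]:
    # First scan: support count of every single item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    # L: frequent items in descending support order (ties broken by name).
    order = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(order)}
    root = FPNode(None)
    # Second scan: keep frequent items, sort them by L, insert into the tree.
    for t in transactions:
        kept = sorted((i for i in set(t) if i in rank), key=rank.__getitem__)
        insert_tree(kept, root)
    return root, order

# Hypothetical account records expressed as "dimension=value" items.
transactions = [
    ["ip=A", "phone=X", "mail=M"],
    ["ip=A", "phone=X"],
    ["ip=A", "mail=N"],
    ["ip=B", "phone=X"],
]
root, order = build_fp_tree(transactions, min_count=2)
```

Shared prefixes such as `ip=A` followed by `phone=X` collapse into one path whose counts record how many accounts share those values, which is the association information the text refers to.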
Step 34: Using the association analysis method, with reasonable support and confidence thresholds set, mine the customer information data for accounts that are abnormally associated. For results whose support and confidence are greater than the thresholds, proceed to step 35; for results whose support and confidence are less than or equal to the thresholds, proceed to step 36.
An association falls into one of two cases: in the first, the associated accounts include an account already confirmed as fraudulent; in the second, none of the associated accounts has yet been confirmed as fraudulent. The first case is clearly more serious than the second, so the support and confidence thresholds are set differently for the two cases.
Step 35: Save the results whose support and confidence are greater than the thresholds for use in the comprehensive calculation.
Step 36: End the process.
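The threshold test in steps 34 to 36 can be illustrated with the usual definitions of support and confidence. The sketch below is an assumption-laden example: the function names, the THRESHOLDS table, and its numeric values are hypothetical, chosen only to show how the stricter handling of the first case (a confirmed fraudulent account among the associated accounts) might translate into lower thresholds.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    s_a = support(transactions, antecedent)
    if s_a == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / s_a


# Hypothetical (min_support, min_confidence) pairs, keyed by whether the
# associated accounts include one already confirmed as fraudulent.
THRESHOLDS = {True: (0.2, 0.5), False: (0.4, 0.8)}


def is_abnormal_association(sup, conf, has_confirmed_fraud):
    """Step 34: keep a result only if both values exceed their thresholds."""
    min_sup, min_conf = THRESHOLDS[has_confirmed_fraud]
    return sup > min_sup and conf > min_conf
```

With these illustrative values, the same association can be kept when a confirmed fraudulent account is involved and discarded when none is, which matches the distinction drawn between the two cases.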
Referring to Figure 4, the logistic regression flow of the embodiment of the invention is as follows:
Step 41: Through the data extraction unit 21, extract the collated customer website access behavior data from the storage unit of the customer information processor 141, mainly reading information such as customer searches, clicks, mail sending and receiving, and publishing of product and trade information, to prepare the base data for logistic regression mining.
Step 42: Preprocess the customer access behavior data, including cleaning, filling in missing values, and removing outliers.
Step 43: The on-site behavior of a fraudulent customer should differ to some degree from that of a normal customer, so a set of customer behavior indicators is constructed from angles such as customer clicks, searches, and mail sending and receiving. Based on these indicators, the logistic regression classification model from data mining is introduced to classify and predict whether a customer exhibits abnormal behavior and to summarize the abnormal-behavior rules of fraudulent customers.
The logistic regression used in this embodiment is the Logit model, a discrete choice model that belongs to the category of multivariate analysis. The logistic regression model formula in this embodiment is as follows:
where ... in the formula denotes the selected indicator variables, namely website access behavior data such as customer searches, clicks, mail sending and receiving, and publishing of product and trade information.
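For reference, the Logit model is conventionally written in the form below. The symbols are assumptions introduced for illustration and are not taken from the embodiment: x_1, ..., x_n stand for the selected indicator variables, beta_0, ..., beta_n for the fitted coefficients, and P for the probability that a customer's behavior is abnormal.

```latex
\ln\frac{P}{1-P} = \beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n,
\qquad
P = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n)}}
```

The two expressions are equivalent: the first gives the log-odds as a linear function of the indicators, and the second solves it for P.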
Step 44: Obtain the customer's fraud suspicion degree from the logistic regression classification model. If the suspicion degree is greater than the preset threshold, proceed to step 45; if it is less than the preset suspicion degree threshold, proceed to step 46.
Step 45: Save the results whose fraud suspicion degree is greater than the threshold for use in the comprehensive calculation.
Step 46: End the process.
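The scoring in steps 43 to 45 can be sketched with a small logistic regression written in plain Python. Everything specific here is an assumption for illustration: the two scaled features, the training data, the learning rate and epoch count, and the function names are hypothetical, and batch gradient descent is used only because it is the simplest way to fit the model, not because the embodiment specifies it.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def suspicion_degree(features, weights, bias):
    """Fraud suspicion degree: the model's probability of abnormal behavior."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(X[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for xi, yi in zip(X, y):
            err = suspicion_degree(xi, weights, bias) - yi
            for j in range(n):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


def exceeds_threshold(features, weights, bias, threshold=0.5):
    """Step 44: True when the suspicion degree is above the preset threshold."""
    return suspicion_degree(features, weights, bias) > threshold


# Hypothetical scaled indicators per customer: [search rate, mail-send rate].
X_train = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y_train = [0, 0, 1, 1]  # 1 = labeled as abnormal behavior
weights, bias = fit_logistic(X_train, y_train)
```

A customer for whom exceeds_threshold returns True would be saved for the comprehensive calculation (step 45); otherwise the flow ends (step 46).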
As shown in Figure 5, the early-warning score processing flow of the embodiment of the invention is as follows:
Step 51: Obtain the early-warning score produced by the comprehensive calculation.
Step 52: Determine whether the early-warning score is greater than the threshold; if so, proceed to step 53; if not, proceed to step 55.
Step 53: Assign a grade: "very serious", "serious", "suspicious", "attention", or "general".
Step 54: Customers graded "very serious" are judged to be fraudulent visitors and added to the fraud customer blacklist, and the fraud customer blacklist information base is updated.
Step 55: End the process.
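Steps 51 to 55 amount to a weighted combination followed by a threshold test and a grade lookup. The sketch below assumes both mining results are already expressed as scores between 0 and 1; the weights, the threshold, the grade cutoffs, and the function names are all hypothetical values chosen for illustration.

```python
# Grades from least to most severe, as listed in step 53.
GRADES = ("general", "attention", "suspicious", "serious", "very serious")


def combined_score(association_score, behavior_score,
                   w_association=0.5, w_behavior=0.5):
    """Comprehensive calculation: weighted sum of the two mining results."""
    return w_association * association_score + w_behavior * behavior_score


def grade_for(score, threshold=0.5, cutoffs=(0.6, 0.7, 0.8, 0.9)):
    """Steps 52-53: no grade at or below the threshold, else one of GRADES."""
    if score <= threshold:
        return None
    return GRADES[sum(score >= c for c in cutoffs)]


def process_client(client_id, association_score, behavior_score, blacklist):
    """Steps 51-54: score, grade, and blacklist 'very serious' customers."""
    score = combined_score(association_score, behavior_score)
    grade = grade_for(score)
    if grade == "very serious":
        blacklist.add(client_id)
    return score, grade
```

Keeping the blacklist as a set passed in by the caller makes the update in step 54 explicit and lets repeated runs add to the same blacklist.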
Referring to Figure 6, the structure for displaying early-warning detection results in the embodiment of the invention comprises a front-end display processor 61 and an internal information system 62.
The front-end display processor 61 comprises a data receiving unit 611, a data integration unit 612, and a data deployment unit 613.
The data receiving unit 611 receives the result data sent by the data transmission unit 25 of the fraud analysis processor, mainly the data of customers whose final early-warning score exceeds the threshold.
The data integration unit 612 integrates the data received by the data receiving unit 611 according to the rules and the assigned grades, in preparation for deployment to the internal information system server 15.
The data deployment unit 613 passes the integrated data through a data interface to the internal information system 62, which presents the result data to the relevant staff of the B2B e-commerce platform.
The internal information system 62 is the internal management system that the B2B e-commerce service provider uses for management, operations, and statistics. Its subsystems related to the fraud access detection system are the early-warning customer association information display subsystem 621, the early-warning customer complaint information display subsystem 622, and the early-warning customer abnormality information display subsystem 623.
The early-warning customer association information display subsystem 621 records and displays the early-warning customers obtained from the comprehensive calculation and details their association information, including the number of associations, the reasons for association, and the types of associated customers.
The early-warning customer complaint information display subsystem 622 details the complaint information of early-warning customers, including complaint types and the number of complaints.
The early-warning customer abnormality information display subsystem 623 details the abnormal behavior of early-warning customers across multiple dimensions, including logins, information publishing, mail sending and receiving, clicks, and searches.
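The data path through units 611 to 613 and into subsystems 621 to 623 can be pictured with a small record type and two functions. This is only a structural sketch under assumptions: the WarningRecord fields, the function names, and the dictionary-based views are hypothetical and stand in for whatever data interface the internal information system actually exposes.

```python
from dataclasses import dataclass, field


@dataclass
class WarningRecord:
    """One early-warning customer as received by data receiving unit 611."""
    client_id: str
    score: float
    grade: str
    association: dict = field(default_factory=dict)        # for subsystem 621
    complaints: dict = field(default_factory=dict)         # for subsystem 622
    abnormal_behavior: dict = field(default_factory=dict)  # for subsystem 623


def integrate(records, threshold):
    """Data integration unit 612: keep records above the threshold, by grade."""
    by_grade = {}
    for record in records:
        if record.score > threshold:
            by_grade.setdefault(record.grade, []).append(record)
    return by_grade


def deploy(by_grade):
    """Data deployment unit 613: split records into one view per subsystem."""
    views = {"association": [], "complaints": [], "abnormal_behavior": []}
    for grade, records in by_grade.items():
        for r in records:
            views["association"].append((r.client_id, grade, r.association))
            views["complaints"].append((r.client_id, grade, r.complaints))
            views["abnormal_behavior"].append(
                (r.client_id, grade, r.abnormal_behavior))
    return views
```

Each view carries the customer identifier and grade alongside the subsystem-specific details, so staff can cross-reference the same customer across the three displays.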
The purpose of the invention is to provide a data-mining-based method and system for detecting fraudulent access to a B2B platform. Starting from the profile information and access behavior data of suppliers and buyers, and in the absence of transaction data and related financial data, fraud analysis and identification focus on customer registration information and on multi-dimensional website behavior data such as logins, browsing, searches, and mail sending and receiving, and data mining techniques are introduced for modeling. This solves the problem that, on a B2B e-commerce platform whose core is information services, fraudulent access behavior cannot be effectively detected and warned against because trading behavior is difficult to monitor.
Those skilled in the art can make various changes and modifications to the embodiments of the invention without departing from its spirit and scope. Accordingly, if such changes and modifications fall within the scope of the claims of the invention and their equivalents, the invention is intended to cover them as well.

Claims (15)

1. A data-mining-based method for detecting fraudulent access to a B2B platform, characterized by comprising:
performing early-warning detection of fraudulent access behavior on a B2B e-commerce platform whose core is information services, by introducing data mining techniques for modeling on the basis of multi-dimensional data such as customer information data and customer access behavior;
according to the respective characteristics of the customer information data and the customer access behavior data, applying two corresponding data mining methods to discover abnormal associations and abnormal customer access behavior, respectively;
comprehensively calculating the early-warning scores obtained by the two data mining methods, grading the customers whose score exceeds the early-warning score threshold, judging the visiting customers assigned to a specific grade to be fraudulent visitors, and adding them to the fraud customer blacklist information base.
2. The detection method according to claim 1, characterized in that customer information is divided into static information and dynamic information; static information refers to customer profile data in each dimension, including customer account, enterprise name, enterprise location, telephone number, e-mail address, product keywords, and related calling information; dynamic information mainly refers to customer access behavior data, including the customer's click behavior, search behavior, mail sending and receiving behavior, and information publishing behavior.
3. The detection method according to claim 1, characterized in that the static information is examined using the data mining method of association analysis.
4. The detection method according to claim 1, characterized in that the dynamic information is examined using the data mining method of a logistic regression classification model.
5. The detection method according to claim 1, characterized in that the results obtained by the two data mining methods are each assigned a different weight and combined in a comprehensive calculation to obtain the final early-warning score.
6. The detection method according to claim 1, characterized in that customers whose early-warning score exceeds the threshold are graded; visiting customers graded "very serious" are added to the fraud customer blacklist information base, and the fraud access blacklist information base is updated.
7. A data-mining-based method for detecting fraudulent access to a B2B platform, characterized in that: customer information is divided into static information and dynamic information; the static information is examined using the data mining method of association analysis, and the dynamic information is examined using the data mining method of a logistic regression classification model; the early-warning scores obtained by the two data mining methods are comprehensively calculated; customers whose score exceeds the early-warning score threshold are graded; and visiting customers assigned to a specific grade are judged to be fraudulent visitors and added to the fraud customer blacklist information base.
8. The detection method according to claim 7, characterized in that the customer's static information comprises customer account, enterprise name, enterprise location, telephone number, e-mail address, product keywords, and related calling information; the dynamic information comprises the customer's click behavior, search behavior, mail sending and receiving behavior, and information publishing behavior.
9. The detection method according to claim 7, characterized in that obtaining the early-warning score comprises the following step: the results obtained by the two data mining methods are each assigned a different weight and combined in a comprehensive calculation to obtain the final early-warning score.
10. The detection method according to claim 7, characterized in that the fraud customer blacklist is continuously updated with the detection results produced during the detection process.
11. A data-mining-based system for detecting fraudulent access to a B2B platform, characterized by comprising a customer information processor, a fraud analysis processor, and a front-end display processor;
the customer information processor collects the customer's profile information and website behavior, collates them, and stores them in the corresponding storage unit for extraction by the fraud analysis processor;
the fraud analysis processor, after extracting the customer profile data and website behavior data, performs a comprehensive calculation by data mining to obtain the early-warning score and assigns a grade according to the early-warning score;
the front-end display processor deploys the information of customers whose early-warning score exceeds the threshold to the service provider's internal information system, for reference and query by the relevant departments.
12. The detection system according to claim 11, characterized in that the fraud analysis processor further comprises a data extraction unit, an association analysis unit, a logistic regression unit, a comprehensive calculation unit, and a data transmission unit.
13. The detection system according to claim 12, characterized in that the association analysis unit performs association analysis across multiple dimensions after the customer information data has been obtained through the data extraction unit.
14. The detection system according to claim 12, characterized in that the logistic regression unit uses the logistic regression classification model obtained from training to examine the customer's access behavior.
15. The detection system according to claim 12, characterized in that the comprehensive calculation unit assigns different weights to the association analysis mining result for the static information and the logistic regression classification model mining result for the dynamic information, and performs a comprehensive calculation to obtain the final early-warning score.
Application number: CN2012101056128A
Priority date: 2012-04-12
Filing date: 2012-04-12
Title: Detection method and detection system for fraud access to business to business (B2B) platform based on data mining
Publication: CN102622552A (en), published 2012-08-01, status Pending
Family ID: 46562467
Country: CN

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20120801