CN111861493B - Information processing method, information processing device, electronic equipment and storage medium

Info

Publication number
CN111861493B
Authority
CN
China
Prior art keywords
credit card
card application
information
application information
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010764852.3A
Other languages
Chinese (zh)
Other versions
CN111861493A (en)
Inventor
李香元
罗琦山
李兴柯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd (ICBC)
Priority to CN202010764852.3A
Publication of CN111861493A
Application granted
Publication of CN111861493B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4016 Transaction verification involving fraud or risk level assessment in transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Finance (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Strategic Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Technology Law (AREA)
  • Marketing (AREA)
  • Computer Security & Cryptography (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

Embodiments of the disclosure provide an information processing method, an information processing device, electronic equipment and a storage medium, which can be applied to the field of artificial intelligence and the field of big data. The method comprises the following steps: acquiring credit card application information of a user to be evaluated; and processing the credit card application information by using an information evaluation model to obtain an evaluation result for the user, wherein the evaluation result represents the possibility that the user engages in fraudulent activity, the information evaluation model is generated by training on credit card application information of sample users with a first characteristic dimension, and the credit card application information of the sample users with the first characteristic dimension is obtained by performing dimension reduction processing on credit card application information of the sample users with a second characteristic dimension.

Description

Information processing method, information processing device, electronic equipment and storage medium
Technical Field
The embodiment of the disclosure relates to the technical field of computers, and more particularly relates to an information processing method, an information processing device, electronic equipment and a storage medium.
Background
Currently, in the financial field, fraud techniques are upgraded and renewed ever faster and are becoming increasingly precise and technologically sophisticated. As a result, fraud is becoming more serious, especially in the field of credit card applications.
For credit card applications, fraudulent approaches include using false application data and migrating across regions. The acceptance of credit card application services has largely moved from offline to online. Correspondingly, the related art judges fraudulent activity based on a large amount of real-time online data.
In the process of implementing the disclosed concept, the inventors found that at least the following problem exists in the related art: the accuracy with which the related art judges fraudulent activity is not high.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide an information processing method, an apparatus, an electronic device, and a storage medium.
An aspect of an embodiment of the present disclosure provides an information processing method, including:
acquiring credit card application information of a user to be evaluated; and processing the credit card application information by using an information evaluation model to obtain an evaluation result for the user, wherein the evaluation result represents the possibility that the user engages in fraudulent activity, the information evaluation model is generated by training on credit card application information of sample users with a first characteristic dimension, and the credit card application information of the sample users with the first characteristic dimension is obtained by performing dimension reduction processing on credit card application information of the sample users with a second characteristic dimension.
According to an embodiment of the present disclosure, the credit card application information of the sample user with the first feature dimension is obtained by performing a dimension reduction process on the credit card application information of the sample user with the second feature dimension, including: acquiring a history sample set, wherein the history sample set comprises credit card application information of a sample user with the second characteristic dimension; and processing credit card application information of the sample user with the second characteristic dimension by using a dimension reduction algorithm to obtain the credit card application information of the sample user with the first characteristic dimension.
According to an embodiment of the present disclosure, the dimension reduction algorithm includes at least one of: a principal component analysis (PCA) algorithm, a linear discriminant analysis (LDA) algorithm, a multidimensional scaling (MDS) algorithm, an isometric mapping (Isomap) algorithm, a locally linear embedding (LLE) algorithm and a Laplacian eigenmaps algorithm.
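For the first algorithm in the list, principal component analysis, the reduction from the second characteristic dimension D to the first characteristic dimension d (with d < D) can be written as below. This is the standard textbook formulation, not a formula taken from the disclosure; the symbols x_i, C, w_k, W_d and z_i are introduced here only for illustration.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
C = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top} \in \mathbb{R}^{D \times D}

C\,w_k = \lambda_k w_k, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_D \ge 0

z_i = W_d^{\top}(x_i - \bar{x}) \in \mathbb{R}^{d}, \qquad W_d = [\,w_1, \dots, w_d\,]
```

Here x_i is the application information of sample user i in the second characteristic dimension, and z_i is the same sample in the first characteristic dimension.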
According to an embodiment of the present disclosure, the history sample set further includes real tag information of the sample users, where the real tag information includes a fraud-present tag and a fraud-absent tag. Processing the credit card application information of the sample users with the second characteristic dimension by using a dimension reduction algorithm to obtain the credit card application information of the sample users with the first characteristic dimension includes: determining, from the credit card application information of the sample users with the second characteristic dimension, a density center point corresponding to each type of real tag information, wherein the density center point is the piece of credit card application information that, when taken as the center, has the largest number of pieces of credit card application information of sample users with the second characteristic dimension within a preset range of the center; determining the geodesic distance between the density center points of the two types of real tag information; determining the amplified distance between the density center points of the two types of real tag information according to a preset distance amplification coefficient and the geodesic distance between those density center points; determining the geodesic distance between the credit card application information of every two sample users with the second characteristic dimension according to the amplified distance; and processing each geodesic distance by using a multidimensional scaling (MDS) algorithm to obtain the credit card application information of the sample users with the first characteristic dimension.
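The first steps of this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names, the toy data and the radius are hypothetical, the Euclidean distance stands in for the geodesic distance between the two centers, and the amplified distance is assumed to be the product of the coefficient and that distance, since the text above does not give the exact formula.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def density_center(points, radius):
    """Return the point that has the most neighbours within `radius` of it.

    Ties are broken in favour of the earliest point in the list.
    """
    best_point, best_count = None, -1
    for i, p in enumerate(points):
        count = sum(
            1 for j, q in enumerate(points)
            if j != i and euclidean(p, q) <= radius
        )
        if count > best_count:
            best_point, best_count = p, count
    return best_point


def amplified_distance(center_distance, coefficient):
    """ASSUMPTION: amplification is a plain multiplication by the coefficient."""
    return coefficient * center_distance


# Hypothetical toy samples, grouped by real tag information.
fraud_present = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0)]
fraud_absent = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.2), (9.0, 9.0)]

center_present = density_center(fraud_present, radius=0.5)
center_absent = density_center(fraud_absent, radius=0.5)

# Euclidean distance used here only as a stand-in for the geodesic distance.
center_distance = euclidean(center_present, center_absent)
amplified = amplified_distance(center_distance, coefficient=2.0)
```

Amplifying the distance between the two class centers is meant to push the fraud-present and fraud-absent samples further apart before multidimensional scaling is applied.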
According to an embodiment of the present disclosure, the processing the credit card application information of the sample user in the second feature dimension by using a dimension reduction algorithm to obtain the credit card application information of the sample user in the first feature dimension includes: determining a target field corresponding to the credit card fraud service; determining screening information corresponding to the target field from credit card application information of the sample user with the second characteristic dimension; and processing the screening information of the sample users with the second characteristic dimension by using a dimension reduction algorithm to obtain credit card application information of the sample users with the first characteristic dimension.
According to an embodiment of the present disclosure, the processing the screening information of the sample user in the second feature dimension by using a dimension reduction algorithm to obtain credit card application information of the sample user in the first feature dimension includes: preprocessing the screening information of the sample users with the second characteristic dimension to obtain processed screening information; and processing the processed screening information of the sample user with the second characteristic dimension by using a dimension reduction algorithm to obtain credit card application information of the sample user with the first characteristic dimension.
According to an embodiment of the present disclosure, generating the information evaluation model by training on credit card application information of sample users with the first characteristic dimension includes: training a classifier model by using the credit card application information of the sample users with the first characteristic dimension to obtain the information evaluation model.
Another aspect of the disclosed embodiments provides an information processing apparatus including: the acquisition module is used for acquiring credit card application information of the user to be evaluated; and a processing module, configured to process the credit card application information by using an information evaluation model to obtain an evaluation result for the user, where the evaluation result is used to characterize a possibility that the user has fraudulent activity, the information evaluation model is generated based on training of credit card application information of a sample user with a first feature dimension, and the credit card application information of the sample user with the first feature dimension is obtained by performing dimension reduction processing on the credit card application information of the sample user with a second feature dimension.
Another aspect of an embodiment of the present disclosure provides an electronic device including: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described above.
Another aspect of the disclosed embodiments provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to implement a method as described above.
Another aspect of the disclosed embodiments provides a computer program comprising computer executable instructions which, when executed, are adapted to carry out the method as described above.
According to embodiments of the disclosure, credit card application information of a user to be evaluated is obtained and is processed by an information evaluation model, generated by training on credit card application information of sample users with a first characteristic dimension, to obtain an evaluation result representing the possibility that the user engages in fraudulent activity. The credit card application information of the sample users with the first characteristic dimension is obtained by performing dimension reduction processing on credit card application information of the sample users with a second characteristic dimension. Because of this dimension reduction, the credit card application information with the first characteristic dimension reflects more essential characteristics of the data and has less feature redundancy. On this basis, the prediction accuracy of the information evaluation model trained on that information is improved, and the accuracy of fraud discrimination is improved in turn, so that the technical problem of low fraud discrimination accuracy in the related art is at least partially solved. In addition, the dimension reduction processing increases the sample density of the data in the low-dimensional subspace and reduces the computational complexity, which in turn speeds up training of the information evaluation model.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments thereof with reference to the accompanying drawings in which:
FIG. 1 schematically illustrates an exemplary system architecture to which information processing methods may be applied, according to embodiments of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a method of information processing according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a schematic diagram of a local density determination method according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart of another information processing method according to an embodiment of the disclosure;
FIG. 5 schematically illustrates a flow chart of yet another information processing method according to an embodiment of the disclosure;
FIG. 6 schematically illustrates a block diagram of an information processing apparatus according to an embodiment of the present disclosure; and
FIG. 7 schematically illustrates a block diagram of an electronic device adapted to implement an information processing method according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where an expression such as "at least one of A, B and C, etc." is used, it should generally be interpreted in accordance with the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.). Where an expression such as "at least one of A, B or C, etc." is used, it should likewise be interpreted in accordance with the ordinary understanding of one skilled in the art (e.g., "a system having at least one of A, B or C" shall include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
In the related art, machine learning is employed to discriminate credit card fraud. A history sample set is obtained, wherein the history sample set comprises credit card application information of sample users. The credit card application information of the sample users is integrated to obtain original features and derived features, and the original features and the derived features are concatenated to obtain feature information. The feature information is then processed by machine learning to obtain an information evaluation model.
In the process of realizing the disclosed concept, the inventors found that the related art at least suffers from low accuracy in judging fraudulent behavior. This is because the dimension of the feature information involved in model training is usually relatively high. In particular, when the feature information includes a large number of discrete fields, the dimension can reach millions or even tens of millions after one-hot encoding, so the feature information is high-dimensional data. High-dimensional data is relatively sparse in the high-dimensional space and has a high degree of feature redundancy, so the more essential and deeper distinguishing features in the data are difficult to find. Consequently, the prediction accuracy of an information evaluation model trained on high-dimensional data is not high, and neither is the accuracy of distinguishing fraudulent behavior with that model. In addition, because the computational complexity of high-dimensional data is high, processing it consumes substantial computational resources, and model training based on high-dimensional data is slow.
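The dimension growth described above can be seen in a small sketch of one-hot encoding. The field names and records are hypothetical; the point is only that each discrete field contributes one dimension per distinct value, so fields with many distinct values quickly dominate the feature dimension.

```python
def build_vocabulary(records, fields):
    """Map each discrete field to the sorted list of its distinct values."""
    return {f: sorted({r[f] for r in records}) for f in fields}


def one_hot_encode(record, vocabulary):
    """Encode one record as a 0/1 vector, one position per (field, value) pair."""
    vector = []
    for field, values in vocabulary.items():
        vector.extend(1 if record[field] == v else 0 for v in values)
    return vector


# Hypothetical application records with two discrete fields.
records = [
    {"city": "Beijing", "employer": "A Co"},
    {"city": "Shanghai", "employer": "B Co"},
    {"city": "Beijing", "employer": "C Co"},
]

vocabulary = build_vocabulary(records, ["city", "employer"])
dimension = sum(len(values) for values in vocabulary.values())
encoded = one_hot_encode(records[0], vocabulary)
```

With only three records the two fields already expand to five dimensions; a field such as an employer name with millions of distinct values would add millions of mostly zero positions.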
According to the embodiments of the present disclosure, the inventors found that the reason for the low accuracy of the fraud discrimination in the related art is that high-dimensional data is involved in model training. In order to solve the problems in the related art, the inventor finds that the processing can be performed in a dimension reduction manner, namely, the information evaluation model is generated based on credit card application information training of a sample user with a first characteristic dimension, wherein the credit card application information of the sample user with the first characteristic dimension is obtained by performing dimension reduction processing on the credit card application information of the sample user with a second characteristic dimension. The following description will be made with reference to specific embodiments.
Embodiments of the disclosure provide an information processing method, an information processing device, and electronic equipment to which the method can be applied. The information processing method, the information processing device and the electronic equipment can be used in the artificial intelligence field and the big data field. The method comprises an evaluation process and a training process. In the evaluation process, credit card application information of a user to be evaluated is acquired and is processed by an information evaluation model, generated by training on credit card application information of sample users with a first characteristic dimension, to obtain an evaluation result representing the possibility that the user engages in fraudulent activity, wherein the credit card application information of the sample users with the first characteristic dimension is obtained by performing dimension reduction processing on credit card application information of the sample users with a second characteristic dimension. In the training process, a history sample set is obtained, wherein the history sample set comprises credit card application information of sample users with the second characteristic dimension; this information is processed by a dimension reduction algorithm to obtain the credit card application information of the sample users with the first characteristic dimension; and a classifier model is trained on the credit card application information of the sample users with the first characteristic dimension to obtain the information evaluation model.
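The training and evaluation processes can be sketched end to end as follows. This is an illustrative sketch only: it assumes principal component analysis (computed by power iteration) as the dimension reduction algorithm, reduces three features to a single one, and uses a nearest-centroid rule as a stand-in classifier. All function names and the toy data are hypothetical and are not taken from the disclosure.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of the samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def covariance(rows, mean):
    """Sample covariance matrix of the centered data."""
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    d, n = len(mean), len(rows)
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(matrix, iterations=200, seed=0):
    """Leading eigenvector of a symmetric matrix, found by power iteration."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in matrix]
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(row, mean, component):
    """Map a sample from the second (high) to the first (low) dimension."""
    return sum((x - m) * c for x, m, c in zip(row, mean, component))


def train(samples, labels):
    """Training process: reduce dimension, then fit one centroid per label."""
    mean = mean_vector(samples)
    component = top_component(covariance(samples, mean))
    reduced = [project(s, mean, component) for s in samples]
    centroids = {}
    for label in set(labels):
        values = [r for r, l in zip(reduced, labels) if l == label]
        centroids[label] = sum(values) / len(values)
    return {"mean": mean, "component": component, "centroids": centroids}


def evaluate(model, row):
    """Evaluation process: project the applicant, pick the nearest centroid."""
    z = project(row, model["mean"], model["component"])
    return min(model["centroids"],
               key=lambda label: abs(z - model["centroids"][label]))


# Hypothetical history sample set: three features per sample user.
samples = [[0.0, 0.1, 1.0], [0.2, 0.0, 1.1], [0.1, 0.2, 0.9],
           [5.0, 5.1, 1.0], [5.2, 4.9, 1.1], [4.9, 5.0, 0.9]]
labels = ["fraud_absent"] * 3 + ["fraud_present"] * 3

model = train(samples, labels)
result_a = evaluate(model, [4.8, 5.2, 1.0])
result_b = evaluate(model, [0.1, 0.1, 1.0])
```

The same mean and component learned during training are reused at evaluation time, so the applicant's information is mapped into the same low-dimensional space as the training samples.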
Fig. 1 schematically illustrates an exemplary system architecture 100 to which information processing methods may be applied according to embodiments of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which embodiments of the present disclosure may be applied to assist those skilled in the art in understanding the technical content of the present disclosure, but does not mean that embodiments of the present disclosure may not be used in other devices, systems, environments, or scenarios.
As shown in fig. 1, a system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired and/or wireless communication links, and the like.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as banking class applications, shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients and/or social platform software, to name a few.
The terminal devices 101, 102, 103 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that, the information processing method provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the information processing apparatus provided by the embodiments of the present disclosure may be generally provided in the server 105. The information processing method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the information processing apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 schematically illustrates a flowchart of an information processing method according to an embodiment of the present disclosure.
As shown in fig. 2, the method includes operations S210 to S220.
In operation S210, credit card application information of a user to be evaluated is acquired.
In embodiments of the present disclosure, the credit card application information may include at least one of a personal basic information field, a residence information field, an occupation information field, an asset information field, a vehicle information field, a card handling information field, and a contact information field. The personal basic information field may include at least one of a user name field, a gender field, an education field, a marital status field, an age field, a mobile phone number field, an email address field, an identification card number field, an identification card address field, an identification card validity period field, and an issuing authority field. The residence information field may include at least one of a residence address field, a residence mode field, and a residence time field. The residence mode field may include a self-built field, a tamper-evident field, a relative field, or a lease field. The occupation information field may include at least one of a company name field, a company address field, a company telephone field, a job department field, a job position field, and an occupation year field. The asset information field may include a year revenue field and a near month deposit amount field. The vehicle information field may include a self-purchase brand field, a vehicle price field, and a year of vehicle field. The card handling information field may include a credit card opening unit field and a credit card credit limit field. The contact information field may include a contact name field and a contact phone field.
The fields included in the credit card application information may be divided into target fields and irrelevant fields according to whether they can be used to evaluate the possibility that the user engages in fraudulent activity. A target field is a field related to that evaluation, i.e., a field that can be used to evaluate the possibility of fraudulent activity. An irrelevant field is a field that is irrelevant to that evaluation. The possibility that the user engages in fraudulent activity may cover two outcomes: fraud is present for the user, or fraud is absent for the user. The fraudulent activity described above may refer to credit card fraud.
According to an embodiment of the present disclosure, the target fields of the credit card application information may include the personal basic information field, the residence address field, the company telephone field, the near month deposit amount field, and the like.
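Splitting an application into target fields and irrelevant fields can be sketched as a simple filter. The key names and the sample record below are hypothetical illustrations of the fields named above, not identifiers from the disclosure.

```python
# Hypothetical keys standing in for the target fields named in the text.
TARGET_FIELDS = (
    "age",                        # part of the personal basic information
    "residence_address",
    "company_phone",
    "near_month_deposit_amount",
)


def screen(application, target_fields=TARGET_FIELDS):
    """Keep only the target fields that are present in the application."""
    return {f: application[f] for f in target_fields if f in application}


# Hypothetical application record that also carries irrelevant fields.
application = {
    "age": 35,
    "residence_address": "Example Road 1",
    "company_phone": "000-0000",
    "near_month_deposit_amount": 12000,
    "vehicle_brand": "ExampleBrand",
    "contact_name": "Example Contact",
}

screening_info = screen(application)
```

Only the screened information would then be passed on to preprocessing and dimension reduction, which keeps fields unrelated to fraud evaluation out of the feature space.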
In operation S220, credit card application information is processed by using an information evaluation model to obtain an evaluation result for the user, wherein the evaluation result is used for representing the possibility of fraudulent activity of the user, the information evaluation model is generated based on credit card application information training of the sample user with the first characteristic dimension, and the credit card application information of the sample user with the first characteristic dimension is obtained after the credit card application information of the sample user with the second characteristic dimension is subjected to dimension reduction processing.
In an embodiment of the present disclosure, in order to improve accuracy of fraud discrimination, credit card application information may be processed using an information evaluation model generated based on credit card application information training of a sample user of a first feature dimension. The credit card application information of the sample user with the first characteristic dimension is obtained by performing dimension reduction processing on the credit card application information of the sample user with the second characteristic dimension, namely the first characteristic dimension is smaller than the second characteristic dimension.
By performing dimension reduction processing on the high-dimensional credit card application information of the sample user with the second characteristic dimension, on the one hand, the internal structure of the data in a low-dimensional space can be found, so that the credit card application information of the sample user with the first characteristic dimension reflects the more essential characteristics of the data and the feature redundancy is reduced. On this basis, the prediction accuracy of the information evaluation model generated by training on the credit card application information of the sample user with the first characteristic dimension is improved, which in turn improves the accuracy of judging fraudulent activity. On the other hand, the sample density of the data in the low-dimensional subspace is increased and the computational complexity is reduced. The reduced computational complexity also improves the training speed of the information evaluation model.
The information evaluation model may be a classifier model, that is, the classifier model may be trained by using credit card application information of the sample user with the first feature dimension to obtain the information evaluation model. The classifier model can comprise a Bayes decision model, a maximum likelihood classifier model, a Bayes classifier model, a cluster analysis model, a neural network model, a support vector machine model, a chaos and fractal model, a hidden Markov model and the like. The classifier model may be specifically set according to actual situations, and is not specifically limited herein.
The credit card application information is input into the information evaluation model, which outputs an evaluation result representing the possibility of fraudulent activity by the user. The evaluation result may indicate that the user engages in fraudulent activity or that the user does not. The evaluation result may take the form of the user identifier of the user to be evaluated together with a prediction identifier corresponding to that user identifier. The prediction identifier may be a first identifier, which indicates that fraudulent activity is present, or a second identifier, which indicates that fraudulent activity is absent.
According to the technical scheme of the embodiment of the disclosure, credit card application information of a user to be evaluated is obtained, and the credit card application information is processed by using an information evaluation model generated by training the credit card application information of a sample user based on a first characteristic dimension, so that an evaluation result for representing the possibility of fraudulent behaviors of the user is obtained, wherein the credit card application information of the sample user with the first characteristic dimension is obtained after the credit card application information of the sample user with a second characteristic dimension is subjected to dimension reduction processing. Because the credit card application information of the sample user with the first characteristic dimension is obtained by performing dimension reduction processing on the credit card application information of the sample user with the second characteristic dimension, the credit card application information of the sample user with the first characteristic dimension can reflect the more essential characteristics of the data, and the redundancy of the characteristics is reduced. On the basis, the prediction accuracy of the information evaluation model generated by training the credit card application information of the sample user based on the first characteristic dimension is improved, and the accuracy of the fraudulent judgment is further improved, so that the technical problem that the accuracy of the fraudulent judgment is not high in the related technology is at least partially solved. In addition, the sample density of the data in the low-dimensional subspace is increased through the dimension reduction processing, and the calculation complexity is reduced. The training speed of the information evaluation model is also improved due to the reduction of the computational complexity.
Optionally, on the basis of the above technical solution, credit card application information of the sample user in the first feature dimension is obtained by performing dimension reduction processing on credit card application information of the sample user in the second feature dimension, and the method may include: a history sample set is obtained, wherein the history sample set comprises credit card application information of a sample user with a second feature dimension. And processing credit card application information of the sample user with the second characteristic dimension by using a dimension reduction algorithm to obtain the credit card application information of the sample user with the first characteristic dimension.
In an embodiment of the present disclosure, in order to reduce the dimension of credit card application information of the sample user of the second feature dimension to obtain credit card application information of the sample user of the first feature dimension, a dimension reduction algorithm may be employed.
A history sample set is obtained, wherein the history sample set may include credit card application information of sample users of a second feature dimension, and the credit card application information of sample users of the second feature dimension may include a plurality of credit card application information.
The credit card application information of the sample user with the second characteristic dimension is processed by a dimension reduction algorithm to obtain the credit card application information of the sample user with the first characteristic dimension, wherein the dimension reduction algorithm may include a linear dimension reduction algorithm and a nonlinear dimension reduction algorithm. A linear dimension reduction algorithm is one in which a linear mapping is used during dimension reduction to transform the high-dimensional data into low-dimensional data. The linear dimension reduction algorithm may include at least one of a principal component analysis (Principal Component Analysis, PCA) algorithm, a linear discriminant analysis (Linear Discriminant Analysis, LDA) algorithm, and a multidimensional scale analysis (MultiDimensional Scaling, MDS) algorithm. The nonlinear dimension reduction algorithm may include at least one of an equidistant mapping (Isometric Feature Mapping, ISOMAP) algorithm, a local linear embedding (Locally Linear Embedding, LLE) algorithm, and a Laplace feature mapping (Laplacian Eigenmaps, LE) algorithm. The equidistant mapping algorithm may include a modified equidistant mapping algorithm.
Illustratively, a principal component analysis algorithm is described. The principal component analysis algorithm is a statistical method for recombining original independent variables into a group of new independent comprehensive variables, extracting fewer comprehensive variables from the new independent variables according to actual needs, and reflecting original independent variable information as much as possible, wherein the fewer comprehensive variables are principal components. The principal component analysis algorithm process is a coordinate rotation process, each principal component expression is a conversion relation between a new coordinate and an original coordinate system, and the direction of each coordinate axis in the new coordinate system is the direction with the maximum variance of the original independent variable, and the specific implementation mode is as follows:
Historical sample information is acquired, wherein the credit card application information of the sample users with the second characteristic dimension can be expressed as $\{X_i \mid X_i \in X,\ i = 1, 2, \ldots, N-1, N\}$, with $X_i = [X_{i1}, X_{i2}, \ldots, X_{im}, \ldots, X_{i(M-1)}, X_{iM}]$ and $m = 1, 2, \ldots, M-1, M$, where $X_i$ represents the credit card application information of the $i$-th sample user with the second characteristic dimension, $X_{im}$ represents the $m$-th dimension information in the credit card application information of the $i$-th sample user with the second characteristic dimension, $N$ represents the number of pieces of credit card application information of sample users with the second characteristic dimension included in the historical sample information, and $M$ represents the dimension of the credit card application information of the sample users with the second characteristic dimension.
And processing the credit card application information of the sample user with the second characteristic dimension according to the standardized formula to obtain the standardized credit card application information of the sample user with the second characteristic dimension. Wherein the standardized formula is
The standardized credit card application information of the sample user with the second characteristic dimension is processed according to the correlation coefficient formula to obtain a correlation coefficient matrix, which may be represented by $R$. Wherein, the correlation coefficient formula is
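The standardization and correlation coefficient formulas are not reproduced in the text above. As a minimal sketch only, assuming the usual column-wise z-score and the Pearson correlation with $N-1$ denominators (both are assumptions, not confirmed by the source), the two steps could look as follows. The function names and the toy data are hypothetical.

```python
import math

def standardize(rows):
    """Column-wise z-score: (x - column mean) / sample standard deviation.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    n = len(rows)
    m = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(m)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]

def correlation_matrix(z):
    """R[j][k] = sum_i z[i][j] * z[i][k] / (N - 1) for already standardized data."""
    n = len(z)
    m = len(z[0])
    return [
        [sum(row[j] * row[k] for row in z) / (n - 1) for k in range(m)]
        for j in range(m)
    ]

# Hypothetical toy data: N = 4 samples, M = 2 dimensions.
X = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
Z = standardize(X)
R = correlation_matrix(Z)
```

With these conventions the diagonal of `R` is 1 and the matrix is symmetric, which is what the eigenvalue step that follows relies on.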
The eigenvectors corresponding to the correlation coefficient matrix are determined. The eigenvalues corresponding to the correlation coefficient matrix can first be determined according to the eigenvalue formula, wherein the eigenvalue formula is $|\lambda E - R| = 0$, $E$ is the identity matrix, and the eigenvalues can be represented by $\lambda_i$.
The eigenvectors corresponding to the correlation coefficient matrix are then determined according to the eigenvector formula and the eigenvalues corresponding to the correlation coefficient matrix, wherein the eigenvector formula is $(\lambda_i E - R)\, e_i = 0$ and $e_i$ is the eigenvector corresponding to $\lambda_i$.
The contribution rate of each principal component is determined according to the eigenvalues corresponding to the correlation coefficient matrix. The contribution rate of the $i$-th principal component can be represented by $Q_i$, where $Q_i = \lambda_i / \sum_{k=1}^{M} \lambda_k$ and the principal components are ordered by decreasing eigenvalue.
The target principal components are determined according to the contribution rate of each principal component. That is, the target principal components may be determined from the principal components according to the cumulative contribution rate of the first $i$ principal components and a preset cumulative rate threshold, where the number of target principal components is at least one. The cumulative contribution rate of the first $i$ principal components can be represented by $\sum_{k=1}^{i} Q_k$. The preset cumulative rate threshold may range from 0.85 to 0.95. If the cumulative contribution rate of the first $i$ principal components is greater than or equal to the preset cumulative rate threshold, the first $i$ principal components may be determined to be the target principal components.
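The selection rule can be illustrated with a short sketch. It assumes that the contribution rate of a principal component is its eigenvalue divided by the sum of all eigenvalues, which is the standard definition but is not spelled out in the surviving text. The function name and the example eigenvalues are hypothetical.

```python
def select_target_components(eigenvalues, threshold=0.85):
    """Return (count, contribution_rates) for the smallest prefix of the
    descending-sorted eigenvalues whose cumulative contribution rate
    reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    rates = [lam / total for lam in ordered]
    cumulative = 0.0
    for count, rate in enumerate(rates, start=1):
        cumulative += rate
        if cumulative >= threshold:
            return count, rates
    # Reached only through floating-point rounding with a threshold near 1.
    return len(rates), rates

# Hypothetical eigenvalues of a 4 x 4 correlation coefficient matrix.
count, rates = select_target_components([2.4, 1.1, 0.3, 0.2], threshold=0.85)
```

In this example the first two components have a cumulative contribution rate of about 0.875, so two target principal components are kept at a threshold of 0.85.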
The credit card application information of the sample user with the first characteristic dimension is obtained according to the credit card application information of the sample user with the second characteristic dimension and the target principal components, i.e., $f(X) = X_{N \times M} A_{M \times i}$, where $A_{M \times i}$ is the matrix composed of the target principal components.
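As a small worked example of the mapping $f(X) = X_{N \times M} A_{M \times i}$, the sketch below multiplies an $N \times M$ data matrix by an $M \times i$ matrix whose columns stand in for the target principal components. The helper name `project` and all numbers are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def project(X, A):
    """Multiply X (N x M) by A (M x i), giving an N x i matrix."""
    return [
        [sum(x_row[m] * A[m][c] for m in range(len(A))) for c in range(len(A[0]))]
        for x_row in X
    ]

# Hypothetical example: N = 2 samples, M = 3 dimensions, i = 2 target components.
X = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
A = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
Y = project(X, A)
```

Each row of `Y` is one sample expressed in the lower, first characteristic dimension.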
The sample density of the data in the low-dimensional subspace is increased through the dimension reduction processing, and the calculation complexity is reduced. The training speed of the information evaluation model is also improved due to the reduction of the computational complexity.
Optionally, on the basis of the above technical solution, the dimension reduction algorithm may include at least one of the following: principal component analysis algorithm, linear discriminant analysis algorithm, multidimensional scale analysis algorithm, equidistant mapping algorithm, local linear embedding algorithm and Laplace feature mapping algorithm.
In embodiments of the present disclosure, the equidistant mapping algorithm may comprise a modified equidistant mapping algorithm.
Optionally, on the basis of the above technical solution, the history sample set further includes real tag information of the sample users, wherein the real tag information includes a fraud present tag and a fraud absent tag. Processing credit card application information of the sample user with the second characteristic dimension by using a dimension reduction algorithm to obtain credit card application information of the sample user with the first characteristic dimension may include: determining, from the credit card application information of the sample users with the second characteristic dimension, a density center point corresponding to each type of real tag information, wherein the density center point is the credit card application information of a sample user with the second characteristic dimension for which, when it is taken as the center, the number of pieces of credit card application information of sample users with the second characteristic dimension included within a preset range of the center is the largest; determining the geodesic distance between the density center points of the two types of real tag information; determining the amplified distance between the density center points of the two types of real tag information according to a preset distance amplification coefficient and the geodesic distance between the density center points of the two types of real tag information; determining the geodesic distance between the credit card application information of every two sample users with the second characteristic dimension according to the amplified distance; and processing the geodesic distances by using a multidimensional scale analysis algorithm to obtain the credit card application information of the sample user with the first characteristic dimension.
In an embodiment of the present disclosure, in order to improve the accuracy of the model in predicting fraudulent activity, a manner of increasing the distance between credit card application information of sample users with second feature dimensions of different real tag information may be adopted.
The history sample set may include credit card application information of the sample users with the second feature dimension, as well as real tag information of the sample users. The real tag information may include a fraud present tag and a fraud absent tag. There are a plurality of sample users. The credit card application information of each sample user with the second feature dimension may be referred to as a sample point, and the description below uses this term.
According to the real tag information of the sample users, the sample points can be divided into two classes, that is, sample points with the same real tag information are placed in the same class. The two classes are the sample points whose tag is the fraud present tag and the sample points whose tag is the fraud absent tag.
For each sample point of each type of real tag information, the sample point is taken as a center and the number of sample points with the same real tag information that fall within a preset range of the center is determined; this number is called the local density. According to the embodiment of the disclosure, for each sample point, a preset area centered on the sample point is formed, the number of sample points in the preset area is determined, and that number is taken as the local density corresponding to the sample point. In this way, the local density corresponding to each sample point can be obtained.
Fig. 3 schematically illustrates a schematic diagram of a local density determination method according to an embodiment of the present disclosure. As shown in fig. 3, 17 sample points are included in fig. 3. The sample points 1 to 9 are sample points with the same real label information. Sample points 10 to 17 are sample points having the same real tag information. For the sample point 9, a preset area corresponding to the sample point 9 is formed with a preset radius d by taking the sample point 9 as a circle center. Sample points 1, 3, 4, 5, 6, 8 and 9 having the same real label information as sample point 9 are located in the preset area. Since the number of sample points located in the preset area is 7, the local density corresponding to the sample point 9 is 7.
After the local density corresponding to each sample point is obtained, for each type of real tag information, a density center point may be determined from the sample points having that real tag information according to their local densities, where the density center point is the sample point with the greatest local density. That is, the density center point is the credit card application information of a sample user with the second characteristic dimension that satisfies the following condition: when it is taken as the center, the number of pieces of credit card application information of sample users with the second characteristic dimension included within the preset range of the center is the largest.
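The local density and density center point described above can be sketched as follows. This is an illustration under stated assumptions: the preset range is taken to be a Euclidean ball of a given radius, the count includes the center point itself (as in the Fig. 3 example, where sample point 9 is counted), and ties are broken by taking the first point. The function names and the sample coordinates are hypothetical.

```python
import math

def local_density(points, index, radius):
    """Count the points within `radius` of points[index], including itself."""
    center = points[index]
    return sum(1 for p in points if math.dist(center, p) <= radius)

def density_center(points, radius):
    """Index of the point with the greatest local density (first one on ties)."""
    densities = [local_density(points, i, radius) for i in range(len(points))]
    return max(range(len(points)), key=lambda i: densities[i])

# Hypothetical 2-D sample points that all share the same real tag.
same_tag_points = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0),
]
center_index = density_center(same_tag_points, radius=1.0)
```

The procedure would be run once per class, passing only the sample points that carry that class's real tag information.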
The density center points of the two types of real tag information are connected to obtain the geodesic distance between the two density center points, where the geodesic distance is the shortest distance between two points on the high-dimensional manifold. The amplified distance between the density center points of the two types of real tag information is then determined according to the preset distance amplification coefficient and this geodesic distance, that is, the preset distance amplification coefficient is multiplied by the geodesic distance between the two density center points.
For each type of real tag information, the geodesic distance between any two sample points having that real tag information is determined. The geodesic distance between every two sample points in the history sample set is then determined according to the amplified distance between the two density center points and the geodesic distances between the sample points within each type of real tag information. Here, the two sample points may be any two sample points in the history sample set, and they may have the same real tag information or different real tag information.
After the geodesic distance between every two sample points is obtained, the geodesic distances can be processed by adopting a multidimensional scale analysis algorithm so as to obtain credit card application information of the sample user with the first characteristic dimension.
According to an embodiment of the present disclosure, processing the credit card application information of the sample user with the second feature dimension by using the dimension reduction algorithm to obtain the credit card application information of the sample user with the first feature dimension is based on the following idea. Firstly, the real tag information of the history sample set is used to determine the density center point corresponding to each type of real tag information.
Secondly, the line connecting the two density center points, and with it the distance between sample points of different classes, is amplified by the preset distance amplification coefficient, so that the distances between sample points of different classes become larger and the distances within a class become relatively smaller. Different classes here means having different real tag information, and the same class means having the same real tag information.
Finally, the geodesic distance between every two sample points within each class is determined, and the geodesic distance between every two sample points overall is determined by taking the two density center points as connecting bridges between all the sample points.
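One possible reading of this idea can be sketched as follows. It is an interpretation rather than the disclosed implementation: it assumes that the distance between two sample points of different classes is the distance from the first point to its own density center point, plus the amplified center-to-center distance, plus the distance from the other density center point to the second point. Euclidean distance is used as a stand-in for the geodesic distance, and all names and data are hypothetical.

```python
import math

def bridged_distances(points, labels, centers, alpha):
    """Pairwise distance matrix in which cross-class pairs are routed through
    the two density center points, with the center-to-center leg amplified.

    points  : list of coordinate tuples
    labels  : class label of each point
    centers : dict mapping each of the two labels to the index of its density center point
    alpha   : preset distance amplification coefficient
    """
    def d(a, b):
        # Stand-in for the geodesic distance on the manifold (assumption).
        return math.dist(points[a], points[b])

    center_a, center_b = centers.values()
    amplified = alpha * d(center_a, center_b)
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                value = d(i, j)  # same class: distance is left unchanged
            else:
                ci, cj = centers[labels[i]], centers[labels[j]]
                value = d(i, ci) + amplified + d(cj, j)
            dist[i][j] = dist[j][i] = value
    return dist

# Hypothetical points on a line, two per class.
pts = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
tags = ["fraud", "fraud", "non_fraud", "non_fraud"]
D = bridged_distances(pts, tags, {"fraud": 0, "non_fraud": 3}, alpha=2.0)
```

The resulting matrix `D` is the kind of input a multidimensional scale analysis step would take to produce the lower-dimensional coordinates.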
The real tag information is used to spatially scale the distances between sample points of different classes, so that the distances between sample points with different real tag information are increased. This improves the degree of differentiation between sample points of different classes and further improves the data density. On this basis, the feature redundancy is reduced and the prediction accuracy of the model for fraudulent activity is further improved.
Optionally, on the basis of the above technical solution, processing credit card application information of the sample user with the second feature dimension by using a dimension reduction algorithm to obtain credit card application information of the sample user with the first feature dimension may include: A target field corresponding to the credit card fraud service is determined. Screening information corresponding to the target field is determined from the credit card application information of the sample user with the second feature dimension. The screening information of the sample user with the second feature dimension is processed by using a dimension reduction algorithm to obtain the credit card application information of the sample user with the first feature dimension.
In the embodiment of the present disclosure, the information evaluation model is used to evaluate the possibility that the user engages in fraudulent activity and is generated by training on the credit card application information of the sample user with the first feature dimension. Because credit card application information includes both fields related to fraud evaluation and fields unrelated to it, the information of the fields related to evaluating the possibility of fraudulent activity may be extracted from the credit card application information in order to improve the prediction accuracy of the information evaluation model. A field related to evaluating the possibility of fraudulent activity may be referred to as a target field, i.e., the target field is a field that can be used to evaluate the possibility of fraudulent activity. The information corresponding to the target field in the credit card application information of the sample user with the second feature dimension is called screening information.
According to an embodiment of the present disclosure, the target fields of the credit card application information may include a personal basic information field, a present address field, a company telephone field, and a near month deposit amount field, etc.
After the screening information of the sample users with the second characteristic dimension is obtained, it can be processed by a dimension reduction algorithm to obtain the credit card application information of the sample users with the first characteristic dimension. According to the embodiment of the disclosure, the fields included in the credit card application information of the sample user with the first feature dimension, obtained by performing the dimension reduction processing on the screening information with the second feature dimension, are target fields. That is, in this case, the credit card application information of the sample user of the second feature dimension includes information of the target fields and not information of the irrelevant fields.
Optionally, on the basis of the above technical solution, processing the screening information of the sample user with the second feature dimension by using a dimension reduction algorithm to obtain credit card application information of the sample user with the first feature dimension may include: and preprocessing the screening information of the sample user with the second characteristic dimension to obtain the processed screening information. And processing the processed screening information of the sample user with the second characteristic dimension by using a dimension reduction algorithm to obtain credit card application information of the sample user with the first characteristic dimension.
In an embodiment of the present disclosure, in order to improve the prediction accuracy of the information evaluation model, the screening information of the sample user with the second feature dimension may be preprocessed. The preprocessing may include at least one of data cleaning, data integration, data reduction, and data transformation.
The data transformation may include discretization. The discretization process is to segment continuous data into discretized sections. The principle of segmentation may be based on equidistant, equal frequency or optimized methods. According to embodiments of the present disclosure, the preprocessing may be discretization processing.
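Equidistant and equal-frequency segmentation can be illustrated with a short sketch. The function names, the number of bins, and the sample deposit amounts are hypothetical, and the optimized segmentation method mentioned above is not shown.

```python
def equal_width_bins(values, n_bins):
    """Assign each value a bin index 0..n_bins-1 using intervals of equal width."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value can compute to index n_bins, so clamp it into the last bin.
    return [min(int((v - low) / width), n_bins - 1) for v in values]

def equal_frequency_bins(values, n_bins):
    """Assign bin indices so that each bin holds roughly the same number of values.

    Ranks are used directly, so tied values may fall into adjacent bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins

# Hypothetical recent-month deposit amounts.
deposits = [100, 200, 300, 400, 10000, 20000]
width_bins = equal_width_bins(deposits, 3)
freq_bins = equal_frequency_bins(deposits, 3)
```

On skewed data such as this, equal-width segmentation puts most values into the first interval, while equal-frequency segmentation spreads them evenly, which is one reason the choice of principle matters.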
Through discretization processing, the time and space overhead of model training can be reduced, and the clustering capacity and noise resistance of the model with respect to the historical samples are improved. In addition, discretized features are easier to understand than continuous features, and defects hidden in the data can be effectively overcome, so that the prediction result obtained based on the model is more stable.
It should be noted that the credit card application information of the user to be evaluated may be credit card application information that has undergone the preprocessing. The credit card application information of the user to be evaluated may be preprocessed in the same manner as the screening information of the sample users with the second feature dimension.
Optionally, on the basis of the above technical solution, the information evaluation model is generated based on credit card application information training of the sample user with the first feature dimension, and may include: and training the classifier model by using credit card application information of the sample user with the first characteristic dimension to obtain an information evaluation model.
In an embodiment of the present disclosure, the information evaluation model may be generated by training the classifier model based on credit card application information of the sample user of the first feature dimension. The classifier model may include a Bayes decision model, a maximum likelihood classifier model, a Bayes classifier model, a cluster analysis model, a neural network model, a support vector machine model, a chaos and fractal model, a hidden Markov model and the like. The classifier model may be specifically set according to actual situations, and is not specifically limited herein. The following describes a cluster analysis model as an example.
Cluster analysis is an unsupervised machine learning algorithm and belongs to the exploratory data analysis methods. Cluster analysis divides similar objects into one target classification according to the distance or similarity between the objects, so as to form a plurality of target classifications. A target classification is a collection of similar objects. Clustering requires that objects in the same target classification have high similarity and that objects in different target classifications have low similarity. The result of clustering is the target cluster center of each target classification. The cluster analysis may include K-means cluster analysis, K-medoids cluster analysis, CLARA (Clustering LARge Applications) analysis, or fuzzy C-means analysis. According to embodiments of the present disclosure, an object may refer to the credit card application information of a sample user of the first feature dimension. The target classifications may include the presence of fraud and the absence of fraud.
The information evaluation model may be generated by training the cluster analysis model based on credit card application information of the sample user with the first feature dimension, and may include: an initial cluster center for each target class is determined. An initial distance between credit card application information of the sample user of each first feature dimension and each initial cluster center is determined. And determining the target classification to which the credit card application information of the sample user of each first feature dimension belongs according to the initial distances, determining the distance average value of each initial distance in each target classification, and taking the distance average value as a new initial clustering center of the target classification. Repeating the operations of determining the initial distance and determining the new initial cluster center of the target classification until the preset condition is met, and taking the new initial cluster center of each target classification obtained when the preset condition is met as the target cluster center of the corresponding target classification. And generating an information evaluation model according to each target cluster center.
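The iteration can be sketched with plain K-means. Note two assumptions: the sketch recomputes each center as the mean of the points assigned to it, which is the standard K-means update, and it takes the preset condition to be that the centers stop moving or a maximum number of iterations is reached. How each cluster is mapped to the presence or absence of fraud is not shown. All names and sample values are hypothetical.

```python
import math

def kmeans(points, initial_centers, max_iter=100, tol=1e-9):
    """Plain K-means: assign each point to its nearest center, then move each
    center to the mean of its assigned points, until the centers stop moving."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            clusters[nearest].append(p)
        new_centers = []
        for k, members in enumerate(clusters):
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep the center of an empty cluster
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:  # assumed preset condition: the centers have converged
            break
    return centers

def predict(point, centers):
    """Index of the target cluster center nearest to the point."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))

# Hypothetical reduced-dimension samples forming two well-separated groups.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centers = kmeans(samples, initial_centers=[(0.0, 0.0), (10.0, 10.0)])
```

Once the target cluster centers are fixed, `predict` assigns a new applicant's reduced-dimension information to the nearest center, which corresponds to using the centers as the information evaluation model.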
Fig. 4 schematically shows a flowchart of another information processing method according to an embodiment of the present disclosure.
As shown in fig. 4, the method includes operations S410 to S480.
In operation S410, a history sample set is acquired, wherein the history sample set includes credit card application information of a sample user of a second feature dimension.
In operation S420, a target field corresponding to the credit card fraud service is determined.
In operation S430, screening information corresponding to the target field is determined from credit card application information of the sample user of the second feature dimension.
In operation S440, the screening information of the sample user with the second feature dimension is preprocessed, so as to obtain the processed screening information.
In operation S450, the processed screening information of the sample user with the second feature dimension is processed by using the dimension reduction algorithm, so as to obtain credit card application information of the sample user with the first feature dimension.
In operation S460, the classifier model is trained using credit card application information of the sample user of the first feature dimension to obtain an information evaluation model.
In operation S470, credit card application information of the user to be evaluated is acquired.
In operation S480, credit card application information is processed using the information evaluation model to obtain an evaluation result for the user, where the evaluation result is used to characterize the possibility of fraud by the user.
According to the technical scheme of the embodiment of the disclosure, credit card application information of a user to be evaluated is obtained, and the credit card application information is processed by using an information evaluation model generated by training the credit card application information of a sample user based on a first characteristic dimension, so that an evaluation result for representing the possibility of fraudulent behaviors of the user is obtained, wherein the credit card application information of the sample user with the first characteristic dimension is obtained after the credit card application information of the sample user with a second characteristic dimension is subjected to dimension reduction processing. Because the credit card application information of the sample user with the first characteristic dimension is obtained by performing dimension reduction processing on the credit card application information of the sample user with the second characteristic dimension, the credit card application information of the sample user with the first characteristic dimension can reflect the more essential characteristics of the data, and the redundancy of the characteristics is reduced. On the basis, the prediction accuracy of the information evaluation model generated by training the credit card application information of the sample user based on the first characteristic dimension is improved, and the accuracy of the fraudulent judgment is further improved, so that the technical problem that the accuracy of the fraudulent judgment is not high in the related technology is at least partially solved. In addition, the sample density of the data in the low-dimensional subspace is increased through the dimension reduction processing, and the calculation complexity is reduced. The training speed of the information evaluation model is also improved due to the reduction of the computational complexity.
Fig. 5 schematically shows a flowchart of still another information processing method according to an embodiment of the present disclosure.
As shown in fig. 5, the method includes operations S501 to S511.
In operation S501, a history sample set is acquired, wherein the history sample set includes credit card application information of a sample user of a second feature dimension and real tag information of the sample user, wherein the real tag information includes a fraud present tag and a fraud absent tag.
In operation S502, a target field corresponding to a credit card fraud service is determined.
In operation S503, the screening information of the sample users of the second feature dimension is preprocessed to obtain processed screening information.
In operation S504, a density center point corresponding to each type of real tag information is determined from the processed screening information of the sample users of the second feature dimension, wherein the density center point is the item of processed screening information that, when taken as a center, has the largest number of items of processed screening information of other sample users of the second feature dimension within a preset range of that center.
In operation S505, a geodesic distance between density center points of two types of real tag information is determined.
In operation S506, an amplified distance between the density center points of the two types of real tag information is determined according to a preset distance amplification coefficient and the geodesic distance between those density center points.
In operation S507, a geodesic distance between the processed screening information of every two sample users of the second feature dimension is determined according to the amplified distance.
In operation S508, the geodesic distances are processed using a multidimensional scaling (MDS) algorithm to obtain credit card application information of the sample users of the first feature dimension.
In operation S509, the classifier model is trained using credit card application information of the sample user of the first feature dimension to obtain an information evaluation model.
In operation S510, credit card application information of a user to be evaluated is acquired.
In operation S511, the credit card application information is processed using the information evaluation model to obtain an evaluation result for the user, wherein the evaluation result is used to characterize the possibility of fraudulent activity of the user.
According to the technical solution of this embodiment of the present disclosure, the real tag information is used to spatially scale the distances between sample points of different categories, so that sample points carrying different real tag information are moved further apart. This improves the separability of sample points of different categories and further increases the data density. On this basis, feature redundancy is reduced and the model's prediction accuracy for fraudulent behavior is further improved, which improves the accuracy of fraud determination and at least partially solves the technical problem in the related art that fraud determination is not sufficiently accurate. In addition, the dimension reduction processing increases the sample density of the data in the low-dimensional subspace and reduces computational complexity, which also speeds up training of the information evaluation model.
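By way of a non-limiting illustration, the density center points, the geodesic distances, and the amplification of distances between differently tagged samples might be sketched in Python with NumPy as follows. The sketch assumes a Euclidean base metric, a k-nearest-neighbor graph with shortest paths as the geodesic distance, and a rule that multiplies every distance between differently tagged samples by the amplification coefficient. This passage does not state how the amplified center distance is propagated to the other sample pairs, so that rule, the parameter values, and all function names are assumptions made for illustration only.

```python
import numpy as np


def pairwise_euclidean(points: np.ndarray) -> np.ndarray:
    """Matrix of Euclidean distances between all rows of `points`."""
    diffs = points[:, None, :] - points[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))


def density_center(points: np.ndarray, radius: float) -> int:
    """Index of the sample with the most other samples within `radius` of it."""
    counts = (pairwise_euclidean(points) <= radius).sum(axis=1) - 1  # drop self
    return int(np.argmax(counts))


def geodesic_distances(points: np.ndarray, k: int) -> np.ndarray:
    """Shortest-path distances over a symmetric k-nearest-neighbor graph."""
    euclid = pairwise_euclidean(points)
    n = len(points)
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    for i in range(n):
        neighbours = np.argsort(euclid[i])[1:k + 1]  # position 0 is the sample itself
        graph[i, neighbours] = euclid[i, neighbours]
        graph[neighbours, i] = euclid[i, neighbours]
    for m in range(n):  # Floyd-Warshall, one intermediate node per pass
        graph = np.minimum(graph, graph[:, [m]] + graph[[m], :])
    return graph


def amplify_cross_label(geo: np.ndarray, labels: np.ndarray, factor: float) -> np.ndarray:
    """Assumed rule: scale distances between differently tagged samples by `factor`."""
    cross = labels[:, None] != labels[None, :]
    return np.where(cross, geo * factor, geo)


# Toy data: two tight groups of samples, one per tag.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels = np.array([0, 0, 0, 1, 1, 1])  # 0 = fraud absent, 1 = fraud present

# Density center point of each tag, expressed as an index into `points`.
centers = {
    int(c): int(np.flatnonzero(labels == c)[density_center(points[labels == c], radius=0.11)])
    for c in np.unique(labels)
}

geo = geodesic_distances(points, k=3)
center_geo = geo[centers[0], centers[1]]   # geodesic distance between the two centers
amplified_center = 2.0 * center_geo        # preset amplification coefficient of 2 (assumed)
scaled = amplify_cross_label(geo, labels, factor=amplified_center / center_geo)
```

The resulting `scaled` matrix is the kind of pairwise distance input that a multidimensional scaling step would consume; distances within one tag are left unchanged in this sketch.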
Fig. 6 schematically shows a block diagram of an information processing apparatus according to an embodiment of the present disclosure.
As shown in fig. 6, the information processing apparatus 600 may include an acquisition module 610 and a processing module 620.
The acquisition module 610 is communicatively coupled to the processing module 620.
An acquisition module 610, configured to acquire credit card application information of a user to be evaluated; and
the processing module 620 is configured to process credit card application information by using an information evaluation model to obtain an evaluation result for the user, where the evaluation result is used to characterize the possibility of fraudulent activity of the user, the information evaluation model is generated based on credit card application information training of a sample user with a first feature dimension, and the credit card application information of the sample user with the first feature dimension is obtained by performing dimension reduction processing on the credit card application information of the sample user with a second feature dimension.
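Purely as an illustrative sketch, the division of the apparatus into an acquisition module and a processing module could be expressed in software as below. The class names, the callable-based data source, and the toy scoring rule are hypothetical and are not part of the disclosure; in practice the processing module would wrap the trained information evaluation model.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

Application = Mapping[str, float]


@dataclass
class AcquisitionModule:
    """Acquires the credit card application information of a user to be evaluated."""
    source: Callable[[str], Application]

    def acquire(self, user_id: str) -> Application:
        return self.source(user_id)


@dataclass
class ProcessingModule:
    """Applies an information evaluation model to the acquired information."""
    model: Callable[[Application], float]

    def evaluate(self, application: Application) -> float:
        return self.model(application)


@dataclass
class InformationProcessingApparatus:
    """The acquisition module feeds the processing module."""
    acquisition: AcquisitionModule
    processing: ProcessingModule

    def assess(self, user_id: str) -> float:
        return self.processing.evaluate(self.acquisition.acquire(user_id))


# Hypothetical stand-ins for a real data store and a trained model.
records = {"u1": {"recent_applications": 5.0}, "u2": {"recent_applications": 1.0}}
toy_model = lambda app: 0.7 if app["recent_applications"] > 3 else 0.1

apparatus = InformationProcessingApparatus(
    AcquisitionModule(source=records.__getitem__),
    ProcessingModule(model=toy_model),
)
```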
According to the technical solution of this embodiment of the present disclosure, credit card application information of a user to be evaluated is acquired and processed using an information evaluation model trained on credit card application information of sample users of a first feature dimension, yielding an evaluation result that characterizes the possibility of fraudulent behavior by the user. The credit card application information of the first feature dimension is obtained by performing dimension reduction processing on credit card application information of the sample users of a second feature dimension. Because of this dimension reduction, the information of the first feature dimension reflects more essential characteristics of the data and has less feature redundancy. As a result, the prediction accuracy of the information evaluation model trained on it is improved, which in turn improves the accuracy of fraud determination and at least partially solves the technical problem in the related art that fraud determination is not sufficiently accurate. In addition, the dimension reduction processing increases the sample density of the data in the low-dimensional subspace and reduces computational complexity, which also speeds up training of the information evaluation model.
Optionally, on the basis of the above technical solution, the processing module 620 may include an acquisition sub-module and a processing sub-module.
The acquisition sub-module is configured to acquire a history sample set, wherein the history sample set includes credit card application information of sample users of the second feature dimension.
The processing sub-module is configured to process the credit card application information of the sample users of the second feature dimension using a dimension reduction algorithm to obtain the credit card application information of the sample users of the first feature dimension.
Optionally, on the basis of the above technical solution, the dimension reduction algorithm includes at least one of the following: a principal component analysis (PCA) algorithm, a linear discriminant analysis (LDA) algorithm, a multidimensional scaling (MDS) algorithm, an isometric mapping (Isomap) algorithm, a locally linear embedding (LLE) algorithm, and a Laplacian eigenmaps algorithm.
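As one hedged example of the listed options, principal component analysis can be sketched with a singular value decomposition in NumPy. The function name, the toy feature matrix, and the choice of two retained components are assumptions made for illustration; the disclosure does not prescribe this particular implementation.

```python
import numpy as np


def pca_reduce(features: np.ndarray, n_components: int) -> np.ndarray:
    """Project samples onto the top principal components of the centered data."""
    centered = features - features.mean(axis=0)
    # Rows of `vt` are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Four samples in a second (higher) feature dimension of 3; the third column is constant.
features = np.array([[2.0, 0.0, 1.0],
                     [4.0, 1.0, 1.0],
                     [6.0, 0.0, 1.0],
                     [8.0, 1.0, 1.0]])

# Same samples in a first (lower) feature dimension of 2.
reduced = pca_reduce(features, n_components=2)
```

Because the third column carries no variance, the two retained components preserve the total variance of this toy data set, which is the sense in which redundant features are removed.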
Optionally, on the basis of the above technical solution, the history sample set further includes real tag information of the sample users, wherein the real tag information includes a fraud present tag and a fraud absent tag.
The processing sub-module may include a first determining unit, a second determining unit, a third determining unit, a fourth determining unit, and a first processing unit.
The first determining unit is configured to determine a density center point corresponding to each type of real tag information from the credit card application information of the sample users of the second feature dimension, wherein the density center point is the credit card application information of a sample user of the second feature dimension that, when taken as a center, has the largest number of credit card application information of other sample users of the second feature dimension within a preset range of that center.
The second determining unit is configured to determine the geodesic distance between the density center points of the two types of real tag information.
The third determining unit is configured to determine the amplified distance between the density center points of the two types of real tag information according to a preset distance amplification coefficient and the geodesic distance between those density center points.
The fourth determining unit is configured to determine the geodesic distance between the credit card application information of every two sample users of the second feature dimension according to the amplified distance.
The first processing unit is configured to process the geodesic distances using a multidimensional scaling (MDS) algorithm to obtain the credit card application information of the sample users of the first feature dimension.
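The final step performed by the first processing unit, turning a matrix of pairwise geodesic distances into low-dimensional coordinates, could be sketched with classical (Torgerson) multidimensional scaling as below. The disclosure names multidimensional scaling without fixing a variant, so the classical variant, the function name, and the toy distance matrix are assumptions.

```python
import numpy as np


def classical_mds(dist: np.ndarray, n_components: int) -> np.ndarray:
    """Embed samples so that Euclidean distances approximate the entries of `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]   # indices of the largest ones
    top_vals = np.clip(eigvals[order], 0.0, None)      # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy check: distances measured on a 3-by-4 rectangle are reproduced by the embedding.
corners = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
dist = np.sqrt(((corners[:, None, :] - corners[None, :, :]) ** 2).sum(axis=-1))

embedding = classical_mds(dist, n_components=2)
recovered = np.sqrt(((embedding[:, None, :] - embedding[None, :, :]) ** 2).sum(axis=-1))
```

In the scheme described above, `dist` would be the matrix of geodesic distances after amplification rather than plain Euclidean distances, and `n_components` would be the first feature dimension.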
Optionally, on the basis of the above technical solution, the processing sub-module may include a fifth determining unit, a sixth determining unit, and a second processing unit.
The fifth determining unit is configured to determine a target field corresponding to the credit card fraud service.
The sixth determining unit is configured to determine screening information corresponding to the target field from the credit card application information of the sample users of the second feature dimension.
The second processing unit is configured to process the screening information of the sample users of the second feature dimension using a dimension reduction algorithm to obtain the credit card application information of the sample users of the first feature dimension.
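A minimal sketch of the screening step, assuming each application is held as a mapping from field names to values, is given below. The field names are hypothetical examples and are not taken from the disclosure.

```python
from typing import Any, Mapping, Sequence


def screen_fields(application: Mapping[str, Any], target_fields: Sequence[str]) -> dict:
    """Keep only the target fields; fields absent from the application become None."""
    return {name: application.get(name) for name in target_fields}


# Hypothetical target fields for the credit card fraud service.
TARGET_FIELDS = ["monthly_income", "years_employed", "recent_applications"]

application = {
    "monthly_income": 8000.0,
    "years_employed": 3,
    "recent_applications": 5,
    "favourite_colour": "blue",  # not relevant to the fraud service, so screened out
}

screened = screen_fields(application, TARGET_FIELDS)
```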
Optionally, on the basis of the above technical solution, the processing sub-module may include a third processing unit and a fourth processing unit.
The third processing unit is configured to preprocess the screening information of the sample users of the second feature dimension to obtain processed screening information.
The fourth processing unit is configured to process the processed screening information of the sample users of the second feature dimension using a dimension reduction algorithm to obtain the credit card application information of the sample users of the first feature dimension.
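This passage does not say what the preprocessing consists of. One common possibility, shown here only as an assumption, is to fill missing values with the column mean and then standardize each column to zero mean and unit variance so that no single field dominates the distance computations.

```python
import numpy as np


def preprocess(screened: np.ndarray) -> np.ndarray:
    """Mean-impute missing values (NaN), then z-score each column."""
    raw = screened.astype(float)
    col_means = np.nanmean(raw, axis=0)            # column means ignoring NaN
    filled = np.where(np.isnan(raw), col_means, raw)
    std = filled.std(axis=0)
    std[std == 0.0] = 1.0                          # constant columns map to zero
    return (filled - filled.mean(axis=0)) / std


# Three samples, two screened fields, one missing value.
screened_matrix = np.array([[1.0, 10.0],
                            [np.nan, 20.0],
                            [3.0, 30.0]])

processed = preprocess(screened_matrix)
```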
Optionally, on the basis of the above technical solution, the processing module 620 may include a training sub-module.
The training sub-module is configured to train a classifier model using the credit card application information of the sample users of the first feature dimension to obtain the information evaluation model.
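The type of classifier is not specified in this passage. As a stand-in, the sketch below trains a logistic regression by gradient descent on reduced-dimension features and returns a fraud probability as the evaluation result. The model choice, learning rate, epoch count, and toy data are all assumptions.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def train_classifier(features: np.ndarray, labels: np.ndarray,
                     lr: float = 0.5, epochs: int = 500):
    """Fit logistic-regression weights and bias by batch gradient descent."""
    n, d = features.shape
    weights = np.zeros(d)
    bias = 0.0
    for _ in range(epochs):
        error = sigmoid(features @ weights + bias) - labels  # gradient of log loss
        weights -= lr * (features.T @ error) / n
        bias -= lr * error.mean()
    return weights, bias


def fraud_probability(features: np.ndarray, weights: np.ndarray, bias: float) -> np.ndarray:
    """Evaluation result: estimated probability that each applicant is fraudulent."""
    return sigmoid(features @ weights + bias)


# Toy samples of the first feature dimension (here 1) with real tag information.
train_x = np.array([[-2.0], [-1.0], [1.0], [2.0]])
train_y = np.array([0.0, 0.0, 1.0, 1.0])  # 0 = fraud absent, 1 = fraud present

weights, bias = train_classifier(train_x, train_y)
scores = fraud_probability(train_x, weights, bias)
```

A new applicant would be scored by applying the same screening, preprocessing, and dimension reduction before calling `fraud_probability`.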
According to embodiments of the present disclosure, any number of the modules, sub-modules, units, and sub-units, or at least part of the functionality of any number of them, may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units may be split into multiple modules for implementation. Any one or more of the modules, sub-modules, units, and sub-units may be implemented at least in part as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on a chip, a system on a substrate, a system on a package, or an application specific integrated circuit (ASIC), or in hardware or firmware in any other reasonable manner of integrating or packaging circuits, or in any one of software, hardware, and firmware or any suitable combination of them. Alternatively, one or more of the modules, sub-modules, units, and sub-units may be implemented at least in part as computer program modules that, when executed, perform the corresponding functions.
For example, the acquisition module 610 and the processing module 620 may be combined and implemented in one module/unit/sub-unit, or either of them may be split into a plurality of modules/units/sub-units. Alternatively, at least part of the functionality of one or more of these modules/units/sub-units may be combined with at least part of the functionality of other modules/units/sub-units and implemented in one module/unit/sub-unit. According to embodiments of the present disclosure, at least one of the acquisition module 610 and the processing module 620 may be implemented at least in part as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on a chip, a system on a substrate, a system on a package, or an application specific integrated circuit (ASIC), or in hardware or firmware in any other reasonable manner of integrating or packaging circuits, or in any one of software, hardware, and firmware or any suitable combination of them. Alternatively, at least one of the acquisition module 610 and the processing module 620 may be implemented at least in part as a computer program module that, when executed, performs the corresponding functions.
It should be noted that the information processing apparatus portion of the embodiments of the present disclosure corresponds to the information processing method portion; for details of the apparatus, refer to the description of the method, which is not repeated here.
Fig. 7 schematically illustrates a block diagram of an electronic device adapted to implement the above-described method according to an embodiment of the present disclosure. The electronic device shown in fig. 7 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 7, an electronic device 700 according to an embodiment of the present disclosure includes a processor 701 that can perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 702 or a program loaded from a storage section 708 into a random access Memory (Random Access Memory, RAM) 703. The processor 701 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 701 may also include on-board memory for caching purposes. The processor 701 may comprise a single processing unit or a plurality of processing units for performing different actions of the method flows according to embodiments of the disclosure.
The RAM 703 stores various programs and data necessary for the operation of the electronic device 700. The processor 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. The processor 701 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 702 and/or the RAM 703. Note that the programs may also be stored in one or more memories other than the ROM 702 and the RAM 703. The processor 701 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 700 may further include an input/output (I/O) interface 705, which is also connected to the bus 704. The electronic device 700 may also include one or more of the following components connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker and the like; a storage section 708 including a hard disk or the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 710 as needed, so that a computer program read from it can be installed into the storage section 708 as needed.
According to embodiments of the present disclosure, the method flow according to embodiments of the present disclosure may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 709, and/or installed from the removable medium 711. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 701. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium. Examples may include, but are not limited to: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 702 and/or RAM 703 and/or one or more memories other than ROM 702 and RAM 703 described above.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions. Those skilled in the art will appreciate that the features recited in the various embodiments of the present disclosure and/or in the claims may be combined and/or incorporated in various ways, even if such combinations or incorporations are not explicitly recited in the present disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or in the claims may be combined and/or incorporated in various ways without departing from the spirit and teachings of the present disclosure. All such combinations and/or incorporations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (7)

1. An information processing method, comprising:
acquiring credit card application information of a user to be evaluated; and
processing the credit card application information by using an information evaluation model to obtain an evaluation result aiming at the user, wherein the evaluation result is used for representing the possibility of fraudulent behaviors of the user, the information evaluation model is generated based on credit card application information training of a sample user with a first characteristic dimension, and the credit card application information of the sample user with the first characteristic dimension is obtained by performing dimension reduction processing on the credit card application information of the sample user with a second characteristic dimension;
wherein obtaining the credit card application information of the sample user with the first characteristic dimension by performing dimension reduction processing on the credit card application information of the sample user with the second characteristic dimension comprises:
acquiring a history sample set, wherein the history sample set comprises credit card application information of a sample user with the second characteristic dimension; and
processing credit card application information of the sample user with the second characteristic dimension by using a dimension reduction algorithm to obtain the credit card application information of the sample user with the first characteristic dimension;
the history sample set further comprises real tag information of the sample user, wherein the real tag information comprises a fraud present tag and a fraud absent tag;
the processing the credit card application information of the sample user with the second characteristic dimension by using the dimension reduction algorithm to obtain the credit card application information of the sample user with the first characteristic dimension comprises the following steps:
determining a density center point corresponding to each type of real tag information from the credit card application information of the sample users with the second characteristic dimension, wherein the density center point is the credit card application information of a sample user with the second characteristic dimension that, when taken as a center, has the largest number of credit card application information of other sample users with the second characteristic dimension within a preset range of the center;
determining the geodesic distance between the density center points of the two types of real tag information;
determining the amplification distance between the density center points of the two types of real tag information according to a preset distance amplification coefficient and the geodesic distance between the density center points of the two types of real tag information;
determining the geodesic distance between credit card application information of each two sample users with the second characteristic dimension according to the amplified distance; and
processing each geodesic distance by using a multidimensional scaling algorithm to obtain the credit card application information of the sample user with the first characteristic dimension.
2. The method of claim 1, wherein the processing the credit card application information of the sample user in the second feature dimension using the dimension reduction algorithm to obtain the credit card application information of the sample user in the first feature dimension comprises:
determining a target field corresponding to the credit card fraud service;
determining screening information corresponding to the target field from credit card application information of the sample user with the second characteristic dimension; and
processing the screening information of the sample users with the second characteristic dimension by using a dimension reduction algorithm to obtain the credit card application information of the sample users with the first characteristic dimension.
3. The method according to claim 2, wherein the processing the screening information of the sample user in the second feature dimension by using the dimension reduction algorithm to obtain credit card application information of the sample user in the first feature dimension includes:
preprocessing the screening information of the sample users with the second characteristic dimension to obtain processed screening information; and
processing the processed screening information of the sample user with the second characteristic dimension by using a dimension reduction algorithm to obtain the credit card application information of the sample user with the first characteristic dimension.
4. The method of claim 1, wherein the information assessment model is generated based on credit card application information training of a sample user of a first feature dimension, comprising:
training a classifier model by using the credit card application information of the sample user with the first characteristic dimension to obtain the information evaluation model.
5. An information processing apparatus comprising:
the acquisition module is used for acquiring credit card application information of the user to be evaluated; and
the processing module is used for processing the credit card application information by utilizing an information evaluation model to obtain an evaluation result aiming at the user, wherein the evaluation result is used for representing the possibility of fraudulent activity of the user, the information evaluation model is generated based on credit card application information training of a sample user with a first characteristic dimension, and the credit card application information of the sample user with the first characteristic dimension is obtained by performing dimension reduction processing on the credit card application information of the sample user with a second characteristic dimension;
wherein obtaining the credit card application information of the sample user with the first characteristic dimension by performing dimension reduction processing on the credit card application information of the sample user with the second characteristic dimension comprises:
acquiring a history sample set, wherein the history sample set comprises credit card application information of a sample user with the second characteristic dimension; and
processing credit card application information of the sample user with the second characteristic dimension by using a dimension reduction algorithm to obtain the credit card application information of the sample user with the first characteristic dimension;
the history sample set further comprises real tag information of the sample user, wherein the real tag information comprises a fraud present tag and a fraud absent tag;
the processing the credit card application information of the sample user with the second characteristic dimension by using the dimension reduction algorithm to obtain the credit card application information of the sample user with the first characteristic dimension comprises the following steps:
determining a density center point corresponding to each type of real tag information from the credit card application information of the sample users with the second characteristic dimension, wherein the density center point is the credit card application information of a sample user with the second characteristic dimension that, when taken as a center, has the largest number of credit card application information of other sample users with the second characteristic dimension within a preset range of the center;
determining the geodesic distance between the density center points of the two types of real tag information;
determining the amplification distance between the density center points of the two types of real tag information according to a preset distance amplification coefficient and the geodesic distance between the density center points of the two types of real tag information;
determining the geodesic distance between credit card application information of each two sample users with the second characteristic dimension according to the amplified distance; and
processing each geodesic distance by using a multidimensional scaling algorithm to obtain the credit card application information of the sample user with the first characteristic dimension.
6. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-4.
7. A computer readable storage medium having stored thereon executable instructions which when executed by a processor cause the processor to implement the method of any of claims 1 to 4.
CN202010764852.3A 2020-07-31 2020-07-31 Information processing method, information processing device, electronic equipment and storage medium Active CN111861493B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010764852.3A CN111861493B (en) 2020-07-31 2020-07-31 Information processing method, information processing device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111861493A CN111861493A (en) 2020-10-30
CN111861493B true CN111861493B (en) 2023-09-05

Family

ID=72952723

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010764852.3A Active CN111861493B (en) 2020-07-31 2020-07-31 Information processing method, information processing device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111861493B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113839843B (en) * 2021-11-25 2022-03-29 深圳中科德能科技有限公司 Intelligent device discovery method, device, medium and block chain system

Citations (2)

Publication number Priority date Publication date Assignee Title
CN109300029A (en) * 2018-10-25 2019-02-01 北京芯盾时代科技有限公司 Borrow or lend money fraud detection model training method, debt-credit fraud detection method and device
CN109598331A (en) * 2018-12-04 2019-04-09 北京芯盾时代科技有限公司 A kind of fraud identification model training method, fraud recognition methods and device

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
WO2010062986A2 (en) * 2008-11-26 2010-06-03 Ringcentral, Inc. Fraud prevention techniques
US10332203B2 (en) * 2012-12-20 2019-06-25 Ncr Corporation Systems and methods for facilitating credit card application transactions

Also Published As

Publication number Publication date
CN111861493A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
Hoang Image Processing‐Based Pitting Corrosion Detection Using Metaheuristic Optimized Multilevel Image Thresholding and Machine‐Learning Approaches
US11531987B2 (en) User profiling based on transaction data associated with a user
US11769078B2 (en) Systems and methods for transfer learning of neural networks
CN108596630B (en) Fraud transaction identification method, system and storage medium based on deep learning
CN111783039B (en) Risk determination method, risk determination device, computer system and storage medium
CN111814910B (en) Abnormality detection method, abnormality detection device, electronic device, and storage medium
CN111833175A (en) Internet financial platform application fraud behavior detection method based on KNN algorithm
CN113986674A (en) Method and device for detecting abnormity of time sequence data and electronic equipment
CN112487284A (en) Bank customer portrait generation method, equipment, storage medium and device
CN115545886A (en) Overdue risk identification method, overdue risk identification device, overdue risk identification equipment and storage medium
CN114202336A (en) Risk behavior monitoring method and system in financial scene
Che et al. Bank telemarketing forecasting model based on t-SNE-SVM
CN112102049A (en) Model training method, business processing method, device and equipment
CN111861493B (en) Information processing method, information processing device, electronic equipment and storage medium
Tiwari et al. Machine learning in financial market surveillance: A survey
CN116664306A (en) Intelligent recommendation method and device for wind control rules, electronic equipment and medium
CN113255824B (en) Method and apparatus for training classification model and data classification
CN113052512A (en) Risk prediction method and device and electronic equipment
CN112434083A (en) Event processing method and device based on big data
CN112818235A (en) Violation user identification method and device based on associated features and computer equipment
CN111598334A (en) Cycle identification method, device, system, terminal and storage medium for local production industry
CN117009883B (en) Object classification model construction method, object classification method, device and equipment
Domingos et al. Experimental Analysis of Hyperparameters for Deep Learning-Based Churn Prediction in the Banking Sector. Computation 2021, 9, 34
US20230099904A1 (en) Machine learning model prediction of interest in an object
CN118195776A (en) Transaction data prediction method, training method of transaction data prediction model apparatus, device, storage medium, and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant