CN116308824A - Knowledge graph-based group fraud risk identification method and related equipment - Google Patents

Publication number
CN116308824A
Authority
CN
China
Prior art keywords
data
user
named entity
risk
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310295749.2A
Other languages
Chinese (zh)
Inventor
刘齐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Health Insurance Company of China Ltd
Original Assignee
Ping An Health Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Health Insurance Company of China Ltd
Priority to CN202310295749.2A
Publication of CN116308824A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Accounting & Taxation (AREA)
  • Fuzzy Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Animal Behavior & Ethology (AREA)
  • Development Economics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The application discloses a knowledge-graph-based group fraud risk identification method, which is applied to the field of risk control. The method provided by the application comprises the following steps: acquiring user information data and service association data of an insurance user as data to be processed; sequentially performing entity extraction, attribute extraction and relation extraction on the data to be processed to respectively obtain a user named entity, a user named entity attribute and a user named entity association relation; importing the user named entity, the user named entity attribute and the user named entity association relationship into a first graph database to obtain a user information knowledge graph; acquiring a group fraud risk association rule according to historical fraud blacklist user data and the user information knowledge graph by a knowledge graph analysis tool; and sending the group fraud risk association rules to an early warning platform so that the early warning platform scans full-platform data in real time according to the group fraud risk association rules to generate early warning risk anomaly information.

Description

Knowledge graph-based group fraud risk identification method and related equipment
Technical Field
The application relates to the field of risk control, and in particular to a knowledge-graph-based group fraud risk identification method and related equipment.
Background
With the rapid development of the internet, insurance business has moved increasingly online. Online services such as online insurance application and online claim settlement bring users convenience and a better service experience, but they also pose greater challenges to insurers' risk control capability. Group fraud risk in particular is highly damaging and well concealed.
However, in the prior art, because the available data is limited and the analysis methods are simple, both the accuracy and the coverage of group fraud risk identification are low, which makes it difficult to meet the development needs of the insurance business.
Disclosure of Invention
The embodiments of the application provide a knowledge-graph-based group fraud risk identification method, an apparatus, a computer device and a storage medium, which are used to solve the prior-art problems of low accuracy and low coverage in identifying insurance group fraud risk.
In a first aspect of the present application, a knowledge-graph-based group fraud risk identification method is provided, including:
acquiring user information data and service association data of all insurance users from a preset data source, and taking the user information data and the service association data as data to be processed, wherein the data source comprises historical data of each system in an enterprise, data collected from the public network and data provided by a third-party institution;
sequentially performing entity extraction, attribute extraction and relation extraction on the data to be processed to respectively obtain user named entities, user named entity attributes associated with the user named entities and user named entity association relations among the user named entities;
importing the user named entity, the user named entity attribute and the user named entity association relationship into a deployed first graph database to obtain a user information knowledge graph;
acquiring historical fraud blacklist user data, and obtaining a target group fraud risk association rule according to the historical fraud blacklist user data and the user information knowledge graph through a knowledge graph analysis tool, wherein the target group fraud risk association rule comprises a target user named entity, a target user named entity attribute and a target user named entity association relation;
and sending the target group fraud risk association rule to a corresponding early warning platform so that the early warning platform scans full-platform data in real time according to the target group fraud risk association rule, and if early warning risk abnormality information is generated by real-time scanning, sending the early warning risk abnormality information to related business personnel, wherein the early warning risk abnormality information comprises corresponding user information, user attribute information and user association relation.
In a second aspect of the present application, there is provided a knowledge-graph-based group fraud risk identification apparatus, including:
the first data acquisition module is used for acquiring user information data and service association data of all insurance users from a preset data source, and taking the user information data and the service association data as data to be processed, wherein the data source comprises historical data of each system in an enterprise, data collected from the public network and data provided by a third-party institution;
the first data extraction module is used for sequentially carrying out entity extraction, attribute extraction and relation extraction on the data to be processed to respectively obtain user named entities, user named entity attributes associated with the user named entities and user named entity association relations among the user named entities;
the first graph database module is used for importing the user named entity, the user named entity attribute and the user named entity association relationship into a deployed first graph database to obtain a user information knowledge graph;
the first knowledge graph analysis module is used for acquiring historical fraud blacklist user data, and obtaining a target group fraud risk association rule according to the historical fraud blacklist user data and the user information knowledge graph through a knowledge graph analysis tool, wherein the target group fraud risk association rule comprises a target user named entity, a target user named entity attribute and a target user named entity association relation;
and the group fraud risk early warning module is used for sending the target group fraud risk association rule to a corresponding early warning platform so that the early warning platform scans full-platform data in real time according to the target group fraud risk association rule, and if early warning risk abnormality information is generated by the real-time scanning, the early warning risk abnormality information is sent to related business personnel, wherein the early warning risk abnormality information comprises corresponding user information, user attribute information and user association relation.
In a third aspect of the present application, there is provided a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above-described knowledge-graph-based group fraud risk identification method when executing the computer program.
In a fourth aspect of the present application, there is provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above-described knowledge-graph-based group fraud risk identification method.
The knowledge-graph-based group fraud risk identification method, apparatus, computer device and storage medium provided by the application acquire the user information data and the service association data of insurance users as data to be processed; sequentially perform entity extraction, attribute extraction and relation extraction on the data to be processed to respectively obtain a user named entity, a user named entity attribute and a user named entity association relation; import the user named entity, the user named entity attribute and the user named entity association relation into a first graph database to obtain a user information knowledge graph; obtain a group fraud risk association rule according to historical fraud blacklist user data and the user information knowledge graph by means of a knowledge graph analysis tool; and send the group fraud risk association rule to a corresponding early warning platform so that the early warning platform scans full-platform data in real time according to the group fraud risk association rule to generate early warning risk abnormality information. In this way, the application makes full use of associated data inside and outside the enterprise, fully mines the association information among multiple entities in the insurance industry, and improves the insurance industry's ability to prevent and manage group fraud risk.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments of the present application will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an application environment of a knowledge-graph-based method for identifying risk of group fraud in an embodiment of the present application;
FIG. 2 is a flow chart of a method for knowledge-graph-based group fraud risk identification in an embodiment of the present application;
FIG. 3 is a schematic diagram of a knowledge-graph-based group fraud risk identification apparatus according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a computer device in an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The knowledge-graph-based group fraud risk identification method provided by the application can be applied in the application environment shown in FIG. 1. The computer device may be, but is not limited to, a personal computer or a notebook computer. The computer device may also be a server, and the server may be an independent server or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms. It will be appreciated that the number of computer devices in FIG. 1 is merely illustrative and can be extended to any number according to actual needs.
In one embodiment, as shown in FIG. 2, a knowledge-graph-based group fraud risk identification method is provided. Taking its application to the computer device in FIG. 1 as an example, the method includes the following steps S101 to S105:
S101, acquiring user information data and service association data of all insurance users from a preset data source, and taking the user information data and the service association data as data to be processed, wherein the data source comprises historical data of each system in an enterprise, data collected from the public network and data provided by a third-party institution.
Further, the user information data and the service association data comprise not only the initial data obtained from the preset data source, but also associated statistical analysis data obtained by statistically analyzing that initial data with a data statistical analysis tool according to business data requirements. For example, an insurance service subsystem of a fintech platform uses the social media account information submitted by an insurance user to collect all personal updates published under that account on the public network, and then analyzes those updates with a social data statistical analysis tool to obtain the user's social attribute data in the social media domain, such as the user's active time period and occupation.
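As an illustration of the kind of associated statistical analysis data described above, the following minimal Python sketch derives a user's active time period from post timestamps. The function name `active_hours`, the ISO timestamp format and the sample data are assumptions made for this example and are not taken from the application.

```python
from collections import Counter
from datetime import datetime


def active_hours(timestamps, top_n=2):
    """Return the top_n hours of day (0-23) in which the user posts most often."""
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    # Sort by descending count, then ascending hour, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [hour for hour, _ in ranked[:top_n]]


# Hypothetical public posts collected for one insurance user.
posts = [
    "2023-03-01T23:10:00",
    "2023-03-02T23:45:00",
    "2023-03-03T08:05:00",
    "2023-03-04T23:30:00",
    "2023-03-05T08:15:00",
    "2023-03-06T12:00:00",
]
print(active_hours(posts))
```

The resulting hours could then be stored as a social attribute of the user alongside the initial data.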
S102, sequentially performing entity extraction, attribute extraction and relation extraction on the data to be processed to respectively obtain user named entities, user named entity attributes associated with the user named entities, and user named entity association relations among the user named entities.
Further, if a target service has a data extraction requirement, only the data content specified by that requirement is extracted from the data to be processed. For example, if the data extraction requirement specifies a particular attribute variable, attribute extraction extracts only the data content corresponding to that attribute variable. Because the data extraction requirement narrows the scope of processing instead of processing the data to be processed in full, it can further improve the operating efficiency of this embodiment. The specific implementations of entity extraction, attribute extraction and relation extraction are not core parts of this embodiment and are not described here.
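The effect of a data extraction requirement on attribute extraction can be sketched as follows. This is only an illustration under assumed inputs: the record layout, the `user_id` key and the function `extract_attributes` are hypothetical, and the application does not specify how extraction is implemented.

```python
def extract_attributes(records, required_attributes=None):
    """Extract (entity_id, attribute, value) triples from raw records.

    If required_attributes is given, only those attributes are extracted,
    mirroring a data extraction requirement that narrows the workload.
    """
    triples = []
    for record in records:
        entity_id = record["user_id"]
        for name, value in record.items():
            if name == "user_id":
                continue
            if required_attributes is not None and name not in required_attributes:
                continue
            triples.append((entity_id, name, value))
    return triples


# Hypothetical data to be processed.
records = [
    {"user_id": "u1", "age": 34, "phone": "555-0101", "city": "Shenzhen"},
    {"user_id": "u2", "age": 41, "phone": "555-0102", "city": "Shanghai"},
]
full = extract_attributes(records)
narrowed = extract_attributes(records, required_attributes={"phone"})
print(len(full), narrowed)
```

With the requirement in place, only the `phone` attribute is touched, so the amount of downstream processing shrinks accordingly.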
Further, after entity extraction, attribute extraction and relation extraction are sequentially performed on the data to be processed to obtain the user named entities, the user named entity attributes associated with the user named entities, and the user named entity association relations among the user named entities, the method further includes the following. First, knowledge graph entity alignment is performed on the target user named entity, the target user named entity attribute and the target user named entity association relation. In this embodiment, entity alignment (EA) means finding the same or equivalent entities across the different data in the data to be processed. Second, knowledge graph conflict resolution is performed on the target user named entity, the target user named entity attribute and the target user named entity association relation. Conflict resolution can be divided into voting-based approaches and quality-estimation-based approaches. A voting-based approach typically counts how often each candidate value occurs and takes the value with the most votes as the final result. A quality-estimation-based approach typically selects the highest-quality result according to the trustworthiness of the different data sources. Finally, knowledge graph knowledge fusion is performed on the target user named entity, the target user named entity attribute and the target user named entity association relation. Knowledge fusion means integrating heterogeneous data from the different sources in the data to be processed under a unified specification. Because the knowledge in a knowledge graph comes from different sources, problems such as poor knowledge quality, duplicated knowledge from different data sources and insufficient association among knowledge arise, so knowledge fusion is needed.
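The two conflict resolution approaches can be contrasted with a short Python sketch. The source names, trust scores and function names below are assumptions chosen for illustration; the application does not prescribe a concrete scoring scheme.

```python
from collections import Counter


def resolve_by_voting(candidates):
    """Pick the value asserted most often; ties go to the value seen first."""
    counts = Counter(value for _, value in candidates)
    best = max(counts.values())
    for _, value in candidates:
        if counts[value] == best:
            return value


def resolve_by_quality(candidates, source_trust):
    """Pick the value asserted by the most trusted data source."""
    _, value = max(candidates, key=lambda c: source_trust.get(c[0], 0.0))
    return value


# Conflicting values for one user's "city" attribute, as (source, value) pairs.
candidates = [
    ("crm", "Shenzhen"),
    ("web_crawl", "Shenzhen"),
    ("credit_bureau", "Guangzhou"),
]
# Hypothetical trustworthiness of each data source.
source_trust = {"crm": 0.7, "web_crawl": 0.3, "credit_bureau": 0.9}

print(resolve_by_voting(candidates))
print(resolve_by_quality(candidates, source_trust))
```

In this example the two approaches disagree: voting follows the majority of sources, while quality estimation follows the single most trusted source, which is why the choice between them depends on how reliable the individual sources are.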
S103, importing the user named entity, the user named entity attribute and the user named entity association relationship into a deployed first graph database to obtain a user information knowledge graph.
A graph database (GDB) is a non-relational database that uses graph structures for semantic queries and uses nodes, edges and attributes to represent and store data. The graph database directly associates stored data items with a set of data nodes and a set of edges representing the relationships between the nodes. Further, the first graph database may be selected from, but is not limited to: Neo4j, OrientDB, ArangoDB, JanusGraph, Dgraph, HugeGraph, etc.
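To make the node, edge and attribute model concrete, the sketch below imports entities, attributes and relations into a tiny in-memory property graph. It is a stand-in written for illustration only: the class `PropertyGraph`, its methods and the relation name `USES_PHONE` are hypothetical and do not correspond to the API of any of the graph databases listed above.

```python
from collections import defaultdict


class PropertyGraph:
    """Tiny in-memory stand-in for a graph database (nodes, edges, attributes)."""

    def __init__(self):
        self.nodes = {}                # node id -> attribute dict
        self.edges = defaultdict(set)  # node id -> {(relation, neighbour id)}

    def add_node(self, node_id, **attributes):
        self.nodes.setdefault(node_id, {}).update(attributes)

    def add_edge(self, source, relation, target):
        # Relations are stored in both directions so traversal is symmetric.
        self.add_node(source)
        self.add_node(target)
        self.edges[source].add((relation, target))
        self.edges[target].add((relation, source))

    def neighbours(self, node_id, relation=None):
        return sorted(n for r, n in self.edges[node_id] if relation in (None, r))


graph = PropertyGraph()
graph.add_node("u1", age=34)                           # entity plus attribute
graph.add_node("u2", age=41)
graph.add_edge("u1", "USES_PHONE", "phone:555-0101")   # association relation
graph.add_edge("u2", "USES_PHONE", "phone:555-0101")
print(graph.neighbours("phone:555-0101", "USES_PHONE"))
```

Two users who share a phone number become neighbours of the same node, which is the kind of indirect link that later analysis steps can exploit.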
S104, acquiring historical fraud blacklist user data, and obtaining a target group fraud risk association rule according to the historical fraud blacklist user data and the user information knowledge graph through a knowledge graph analysis tool, wherein the target group fraud risk association rule comprises a target user named entity, a target user named entity attribute and a target user named entity association relation.
The historical fraud blacklist user data comes, on the one hand, from a system security platform that aggregates data from the various systems in the enterprise and, on the other hand, from third-party data acquisition channels outside the enterprise. Each system in the enterprise reports the blacklisted users found by its own security module to the system security platform. In addition, the system security platform runs different scheduled tasks to collect system security domain data from public network channels, and this data includes blacklist data from various industries. In a more specific embodiment, users with a history of credit fraud are obtained from a credit assessment organization outside the enterprise as historical fraud blacklist users.
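The application does not detail how the knowledge graph analysis tool derives an association rule. One simple possibility, shown here purely as an assumption, is to flag any shared entity (for example a device or a clinic) that links at least a minimum number of blacklisted users. The function `mine_association_rules`, the relation names and the threshold are hypothetical.

```python
from collections import defaultdict


def mine_association_rules(edges, blacklist, min_blacklisted=2):
    """Return (relation, shared_entity) pairs linked to enough blacklisted users.

    edges is an iterable of (user_id, relation, entity_id) triples.
    """
    linked = defaultdict(set)
    for user, relation, entity in edges:
        if user in blacklist:
            linked[(relation, entity)].add(user)
    return sorted(key for key, users in linked.items() if len(users) >= min_blacklisted)


# Hypothetical edges from the user information knowledge graph.
edges = [
    ("u1", "USES_DEVICE", "dev-9"),
    ("u2", "USES_DEVICE", "dev-9"),
    ("u3", "USES_DEVICE", "dev-9"),
    ("u3", "VISITED_CLINIC", "clinic-4"),
    ("u4", "VISITED_CLINIC", "clinic-4"),
]
blacklist = {"u1", "u2"}  # historical fraud blacklist users
rules = mine_association_rules(edges, blacklist)
print(rules)
```

Here the shared device becomes a candidate rule because two blacklisted users are attached to it, and user `u3`, who is not blacklisted but uses the same device, would be surfaced when the rule is later applied.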
Further, after acquiring the historical fraud blacklist user data and obtaining the target group fraud risk association rule according to the historical fraud blacklist user data and the user information knowledge graph through the knowledge graph analysis tool, the method further includes the following. First, the user information knowledge graph is traversed using a community discovery algorithm to obtain a target hierarchical community structure corresponding to the user information knowledge graph. In a more specific embodiment, the community discovery algorithm is the Louvain algorithm. Then, insurance claim index data corresponding to the target hierarchical community structure is calculated, and corresponding insurance risk control rules are generated according to the target hierarchical community structure and the insurance claim index data. The insurance claim index data includes, but is not limited to: the loss ratio, the annual-basis loss ratio, the reported loss ratio, the comprehensive loss ratio, the accident-year-basis earned loss ratio, and the like. Finally, the insurance risk control rules are sent to the insurance business systems that need to perform insurance risk control.
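The step from a community structure to insurance claim index data and rules can be sketched as follows. The sketch assumes the communities have already been produced by a community discovery algorithm, and it assumes one simple claim index, namely paid claims divided by earned premium per community. The function names, the threshold and the `manual_review` action are hypothetical.

```python
def community_claim_index(communities, premiums, claims):
    """Map community id -> paid claims / earned premium over its members."""
    index = {}
    for community_id, members in communities.items():
        premium = sum(premiums.get(m, 0.0) for m in members)
        paid = sum(claims.get(m, 0.0) for m in members)
        index[community_id] = paid / premium if premium else 0.0
    return index


def generate_rules(index, threshold):
    """Flag every community whose claim index exceeds the threshold."""
    return [
        {"community": c, "claim_index": v, "action": "manual_review"}
        for c, v in sorted(index.items())
        if v > threshold
    ]


# Hypothetical communities (assumed to be the output of community discovery).
communities = {"c1": ["u1", "u2", "u3"], "c2": ["u4", "u5"]}
premiums = {"u1": 1000.0, "u2": 1000.0, "u3": 1000.0, "u4": 1000.0, "u5": 1000.0}
claims = {"u1": 2000.0, "u2": 1500.0, "u3": 1000.0, "u4": 200.0}

index = community_claim_index(communities, premiums, claims)
rules = generate_rules(index, threshold=1.0)
print(index, rules)
```

A community that collectively claims more than it pays in premium stands out immediately, whereas the same members examined one at a time might each look unremarkable.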
Further, after calculating the insurance claim index data corresponding to the target hierarchical community structure and generating the corresponding insurance risk control rules according to the target hierarchical community structure and the insurance claim index data, the method further includes the following. First, the insurance risk control rules are verified in the user information knowledge graph to obtain insurance risk control rule verification results. Adding this further verification step increases the robustness and accuracy of the insurance risk control rules. Then, the verification results are compared with the historical fraud blacklist user data, and any insurance risk control rule whose verification result contains no historical fraud blacklist user data is removed. Finally, any insurance risk control rule whose verification result does not meet the preset risk control precision and the preset risk control coverage is removed. The risk control precision and the risk control coverage are set flexibly according to the target insurance business. For example, the risk control precision may be set fine enough to examine every maintenance record of the automobile associated with a user's automobile insurance, and the risk control coverage may be set so that the number of suspected insurance fraud groups covered is at least 5.
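The two removal steps can be sketched in Python as below. The application leaves the precision and coverage criteria to the target insurance business, so this sketch assumes one common reading: precision is the share of a rule's hits that are blacklisted, and coverage is the share of the blacklist that the rule hits. The function name, rule names and thresholds are hypothetical.

```python
def verify_rules(rule_hits, blacklist, min_precision, min_coverage):
    """Keep rules that hit blacklisted users with enough precision and coverage.

    rule_hits maps a rule name to the set of users it matches in the graph.
    """
    kept = {}
    for rule, hits in rule_hits.items():
        confirmed = hits & blacklist
        if not confirmed:
            continue  # verification result contains no blacklisted user
        precision = len(confirmed) / len(hits)
        coverage = len(confirmed) / len(blacklist)
        if precision >= min_precision and coverage >= min_coverage:
            kept[rule] = (precision, coverage)
    return kept


blacklist = {"u1", "u2", "u3", "u4"}
# Hypothetical verification results: users matched by each rule.
rule_hits = {
    "shared_device": {"u1", "u2", "u3", "u9"},
    "shared_clinic": {"u7", "u8"},
    "shared_address": {"u4", "u5", "u6", "u7", "u8"},
}
kept = verify_rules(rule_hits, blacklist, min_precision=0.5, min_coverage=0.5)
print(kept)
```

In this example `shared_clinic` is dropped because it matches no blacklisted user, and `shared_address` is dropped because it matches one blacklisted user among five hits, which falls below both thresholds.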
Further, after traversing the user information knowledge graph using the community discovery algorithm to obtain the target hierarchical community structure corresponding to the user information knowledge graph, the method further includes the following. First, taking the user named entity attribute as the statistical dimension, first user attribute risk statistical data corresponding to the target user named entity in the target hierarchical community structure is counted. Then, first enabling feature risk data corresponding to the target user named entity in the target hierarchical community structure is extracted. Finally, the first user attribute risk statistical data and the first pulsing feature risk data are sent to an insurance service system that includes single-user-granularity risk control. The details are not described here. For example, if a fintech platform finds from statistical data that target automobile insurance users in a certain age group and family income range have a high probability of abnormal automobile maintenance records, the basic user information and user attribute information of those users are analyzed to generate risk control information at single-user granularity.
Further, after sending the target group fraud risk association rule to the corresponding early warning platform so that the early warning platform scans full-platform data in real time according to the target group fraud risk association rule and sends early warning risk abnormality information to related business personnel if the real-time scanning generates such information, the method further includes the following. First, a user information knowledge graph update thread is created and used to execute the above steps at a preset frequency to update the user information knowledge graph. At the same time, a user information knowledge graph monitoring thread is created and used to monitor whether the data volume and the performance parameters of the first graph database corresponding to the user information knowledge graph simultaneously reach a data volume threshold range and a performance parameter threshold range. If so, the user information knowledge graph update thread and the user information knowledge graph monitoring thread are suspended, and all data of the first graph database is migrated to a preset second graph database so that the second graph database replaces the first graph database, wherein the maximum storage capacity of the second graph database is larger than that of the first graph database, and the performance parameters of the second graph database are superior to those of the first graph database. Finally, the user information knowledge graph update thread and the user information knowledge graph monitoring thread are restarted.
If the data volumes of the first graph database and the second graph database reach a certain order of magnitude, database scaling schemes such as table sharding and database sharding can be adopted. The database scaling scheme is not a core part of this embodiment and is not described here.
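The update, monitoring and migration logic described above can be sketched with Python's standard `threading` module. Everything here is a simplified assumption: `GraphStore` stands in for a graph database, query latency stands in for the performance parameters, the thresholds are arbitrary, and a pause flag replaces true thread suspension. A production implementation would also need locking around the data copy.

```python
import threading


class GraphStore:
    """Hypothetical stand-in for a graph database with a capacity limit."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []


def needs_migration(store, latency_ms, volume_ratio=0.8, latency_limit_ms=200.0):
    """True only when BOTH the data volume and the performance threshold are reached."""
    volume_reached = len(store.records) >= volume_ratio * store.capacity
    return volume_reached and latency_ms >= latency_limit_ms


def migrate(first, second, paused):
    """Pause the worker threads, copy all data, then let the workers resume."""
    paused.set()  # update and monitoring loops skip work while this is set
    try:
        second.records = list(first.records)
    finally:
        paused.clear()
    return second


def update_loop(get_store, fetch_batch, paused, stop, interval_s):
    """Update thread body: append a fresh batch at a fixed interval."""
    while not stop.wait(interval_s):
        if not paused.is_set():
            get_store().records.extend(fetch_batch())


first, second = GraphStore(capacity=10), GraphStore(capacity=1000)
first.records = list(range(9))
paused, stop = threading.Event(), threading.Event()

active = first
if needs_migration(first, latency_ms=250.0):
    active = migrate(first, second, paused)  # second store replaces the first

worker = threading.Thread(
    target=update_loop,
    args=(lambda: active, lambda: ["new"], paused, stop, 0.01),
    daemon=True,
)
worker.start()  # updates now go to whichever store is active
stop.set()
worker.join()
```

Requiring both conditions before migrating avoids moving data when only one signal is elevated, for example a brief latency spike on a nearly empty database.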
S105, sending the target group fraud risk association rule to a corresponding early warning platform, so that the early warning platform scans full-platform data in real time according to the target group fraud risk association rule, and if early warning risk abnormality information is generated by the real-time scanning, sending the early warning risk abnormality information to related business personnel, wherein the early warning risk abnormality information comprises corresponding user information, user attribute information and user association relations.
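A minimal sketch of how an early warning platform might apply association rules to incoming platform data is shown below. It assumes that a rule is a (relation, entity) pair and that platform data arrives as a sequence of event dictionaries; the function `scan` and all field names are hypothetical.

```python
def scan(rules, events):
    """Yield an alert for each event whose (relation, entity) matches a rule."""
    rule_set = set(rules)
    for event in events:
        key = (event["relation"], event["entity"])
        if key in rule_set:
            yield {
                "user": event["user"],                       # user information
                "attributes": event.get("attributes", {}),   # user attribute information
                "relation": key,                             # user association relation
            }


# Hypothetical association rule and incoming platform events.
rules = [("USES_DEVICE", "dev-9")]
events = [
    {"user": "u7", "relation": "USES_DEVICE", "entity": "dev-9",
     "attributes": {"policy": "auto"}},
    {"user": "u8", "relation": "USES_DEVICE", "entity": "dev-2"},
]
alerts = list(scan(rules, events))
print(alerts)
```

Each alert carries the three pieces of information that the early warning risk abnormality information is said to contain, so it can be forwarded to business personnel without a further lookup.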
According to the knowledge-graph-based group fraud risk identification method, user information data and service association data of insurance users are acquired as data to be processed; entity extraction, attribute extraction and relation extraction are sequentially performed on the data to be processed to respectively obtain a user named entity, a user named entity attribute and a user named entity association relation; the user named entity, the user named entity attribute and the user named entity association relation are imported into a first graph database to obtain a user information knowledge graph; a group fraud risk association rule is obtained according to historical fraud blacklist user data and the user information knowledge graph by means of a knowledge graph analysis tool; and the group fraud risk association rule is sent to a corresponding early warning platform so that the early warning platform scans full-platform data in real time according to the group fraud risk association rule to generate early warning risk abnormality information. The method not only makes full use of associated data inside and outside the enterprise and fully mines the association information among multiple entities in the insurance industry, but also applies the group fraud risk information in multiple ways across multiple links of the insurance business, improving the insurance industry's ability to prevent and manage group fraud risk.
It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and the sequence numbers should not limit the implementation process of the embodiments of the present application in any way.
In an embodiment, a knowledge-graph-based group fraud risk identification apparatus 100 is provided, where the knowledge-graph-based group fraud risk identification apparatus 100 corresponds to the knowledge-graph-based group fraud risk identification method in the above embodiment one by one. As shown in fig. 3, the partner fraud risk identification apparatus 100 based on a knowledge graph includes a first data acquisition module 11, a first data extraction module 12, a first graph database module 13, a first knowledge graph analysis module 14, and a partner fraud risk early warning module 15. The functional modules are described in detail as follows:
the first data acquisition module 11 is configured to acquire user information data and service association data of all insurance users from a preset data source as data to be processed, where the data source includes historical data of each system in the enterprise, data collected from the public network, and data provided by third-party institutions;
the first data extraction module 12 is configured to sequentially perform entity extraction, attribute extraction, and relationship extraction on the data to be processed, so as to correspondingly obtain a target user named entity, a target user named entity attribute associated with the target user named entity, and a target user named entity association relationship between the target user named entities;
the first graph database module 13 is configured to import the target user named entity, the target user named entity attribute, and the target user named entity association relationship into a deployed first graph database, to obtain a user information knowledge graph;
a first knowledge graph analysis module 14, configured to obtain historical fraud blacklist user data, and obtain, by using a knowledge graph analysis tool, a target group fraud risk association rule according to the historical fraud blacklist user data and the user information knowledge graph, where the target group fraud risk association rule includes the target user named entity, the target user named entity attribute, and the target user named entity association relationship;
and the group fraud risk early warning module 15 is configured to send the target group fraud risk association rule to a corresponding early warning platform, where the early warning platform scans full-platform data in real time according to the target group fraud risk association rule, and if early warning risk abnormality information is generated by the real-time scanning, sends the early warning risk abnormality information to related business personnel, where the early warning risk abnormality information includes corresponding user information, user attribute information and a user association relationship.
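The flow through modules 11 to 15 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the application: `KnowledgeGraph` is an in-memory stand-in for the first graph database, the `uses_phone` and `insured_by` relations are invented sample data, and the analysis step is a deliberately simple assumption (a relation type becomes a rule when at least two blacklisted users share the same target through it). The application does not specify how the knowledge graph analysis tool derives its rules.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """In-memory stand-in for the first graph database (hypothetical)."""
    attributes: dict = field(default_factory=dict)  # entity -> {attribute: value}
    edges: set = field(default_factory=set)         # (source, relation, target)

    def import_data(self, entities, attributes, relations):
        """Load extracted entities, attributes and association relationships."""
        for name in entities:
            self.attributes.setdefault(name, {})
        for name, attrs in attributes.items():
            self.attributes.setdefault(name, {}).update(attrs)
        self.edges.update(relations)


def shared_targets(graph, relations=None, users=None):
    """Group users by (relation, target) and keep groups of two or more users."""
    groups = defaultdict(set)
    for source, relation, target in graph.edges:
        if relations is not None and relation not in relations:
            continue
        if users is not None and source not in users:
            continue
        groups[(relation, target)].add(source)
    return {key: sorted(members) for key, members in groups.items() if len(members) >= 2}


def derive_rules(graph, blacklist):
    """Assumed analysis step: relation types through which blacklisted users cluster."""
    return {relation for relation, _ in shared_targets(graph, users=blacklist)}


def scan(graph, rules):
    """Assumed early-warning scan: report every group that matches a rule."""
    return shared_targets(graph, relations=rules)


graph = KnowledgeGraph()
graph.import_data(
    entities=["u1", "u2", "u3", "u4", "u5"],
    attributes={"u1": {"city": "Shenzhen"}},
    relations={
        ("u1", "uses_phone", "p1"), ("u2", "uses_phone", "p1"),
        ("u3", "uses_phone", "p2"), ("u4", "uses_phone", "p2"),
        ("u5", "uses_phone", "p3"), ("u1", "insured_by", "agent_1"),
    },
)
rules = derive_rules(graph, blacklist={"u1", "u2"})
alerts = scan(graph, rules)
```

In this sample the blacklisted pair sharing `p1` yields the single rule `uses_phone`, and the scan then also surfaces the non-blacklisted pair sharing `p2`, which is the kind of result that would be forwarded to business personnel for review.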
Further, the first data extraction module 12 further includes:
the entity alignment processing sub-module is used for performing knowledge graph entity alignment processing on the target user named entity, the target user named entity attribute and the target user named entity association relationship;
the conflict resolution processing sub-module is used for performing knowledge graph conflict resolution processing on the target user named entity, the target user named entity attribute and the target user named entity association relationship;
and the knowledge fusion processing sub-module is used for carrying out knowledge graph knowledge fusion processing on the target user named entity, the target user named entity attribute and the target user named entity association relationship.
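The three sub-modules above can be sketched together in a few lines of Python. The sketch rests on assumptions that the application does not state: records are aligned on a hypothetical `id_number` key, conflicting attribute values are resolved by keeping the most recently updated one, and fusion is a plain union of relations. Field names and sample values are illustrative only.

```python
def fuse_records(records):
    """Align, resolve conflicts in, and fuse records from several data sources.

    Assumptions (not from the application):
      * alignment: records with the same `id_number` describe the same entity;
      * conflict resolution: the value from the newest record wins;
      * fusion: relations from all sources are merged into one set.
    """
    fused = {}
    # ISO dates sort lexicographically, so older records are applied first
    for record in sorted(records, key=lambda r: r["updated"]):
        entity = fused.setdefault(
            record["id_number"], {"attributes": {}, "relations": set()}
        )
        entity["attributes"].update(record["attributes"])  # newer overwrites older
        entity["relations"].update(record["relations"])
    return fused


records = [
    {
        "source": "crm",
        "id_number": "ID-001",
        "updated": "2023-02-01",
        "attributes": {"name": "Zhang San", "phone": "111"},
        "relations": {("policy_holder_of", "policy_9")},
    },
    {
        "source": "claims",
        "id_number": "ID-001",
        "updated": "2023-03-01",
        "attributes": {"phone": "222"},
        "relations": {("treated_at", "hospital_3")},
    },
]
fused = fuse_records(records)
```

After fusion the two source records collapse into one entity whose `phone` is the newer value and whose relations cover both sources. A production pipeline would typically need fuzzier alignment than an exact key match.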
Further, the first knowledge-graph analysis module 14 further includes:
the community structure generation sub-module is used for traversing and processing the user information knowledge graph by using a community discovery algorithm to obtain a target hierarchical community structure corresponding to the user information knowledge graph;
the insurance risk control rule sub-module is used for calculating insurance claim index data corresponding to the target hierarchical community structure and generating corresponding insurance risk control rules according to the target hierarchical community structure and the insurance claim index data;
and the risk control rule pushing sub-module is used for sending the insurance risk control rules to an insurance business system requiring insurance risk control.
Further, the insurance risk control rule sub-module further includes:
the risk control rule checking subunit is used for checking the insurance risk control rule in the user information knowledge graph to obtain an insurance risk control rule checking result;
the risk control rule first screening subunit is used for comparing the insurance risk control rule checking result with the historical fraud blacklist user data and removing any insurance risk control rule whose checking result does not contain the historical fraud blacklist user data;
and the risk control rule second screening subunit is used for removing any insurance risk control rule whose checking result does not meet the preset risk control precision and the preset risk control coverage rate.
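The two screening passes can be sketched as follows. The application does not define "precision" or "coverage rate", so this example assumes the usual reading: precision is the share of users hit by a rule that are on the blacklist, and coverage is the share of the blacklist that the rule hits. Rules are modelled as hypothetical predicates over user attributes, and the thresholds are illustrative.

```python
def screen_rules(rules, users, blacklist, min_precision, min_coverage):
    """Check each rule against the graph's users and apply both screening passes."""
    kept = {}
    for name, predicate in rules.items():
        # Checking step: which users does the rule hit?
        hits = {uid for uid, attrs in users.items() if predicate(attrs)}
        caught = hits & blacklist
        # First screening: drop rules whose result contains no blacklisted user
        if not caught:
            continue
        # Second screening: drop rules below the preset precision or coverage
        precision = len(caught) / len(hits)
        coverage = len(caught) / len(blacklist)
        if precision >= min_precision and coverage >= min_coverage:
            kept[name] = (precision, coverage)
    return kept


users = {
    "u1": {"claims": 5, "shared_phone": True},
    "u2": {"claims": 4, "shared_phone": True},
    "u3": {"claims": 1, "shared_phone": True},
    "u4": {"claims": 0, "shared_phone": False},
}
blacklist = {"u1", "u2"}
candidate_rules = {
    "many_claims": lambda attrs: attrs["claims"] >= 4,
    "shared_phone": lambda attrs: attrs["shared_phone"],
    "no_claims": lambda attrs: attrs["claims"] == 0,
}
kept = screen_rules(candidate_rules, users, blacklist, min_precision=0.8, min_coverage=0.5)
```

Here `no_claims` is removed by the first screening because it hits no blacklisted user, and `shared_phone` is removed by the second because only two of its three hits are blacklisted.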
Further, the insurance risk control rule sub-module further includes:
the user attribute risk subunit is used for taking the user named entity attribute as a statistical dimension and counting first user attribute risk statistical data corresponding to the target user named entity in the target hierarchical community structure;
the embedding feature risk subunit is used for extracting first embedding feature risk data corresponding to the target user named entity in the target hierarchical community structure;
and the risk data pushing subunit is used for sending the first user attribute risk statistical data and the first embedding feature risk data to an insurance business system containing single-user granularity risk control.
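The attribute statistics step can be sketched as below, under the assumption that "user attribute risk statistical data" means per-attribute-value counts and blacklist rates within one community; the application does not give a formula. The `city` dimension and the sample users are hypothetical. Embedding feature extraction is not covered here, since the application does not say which embedding method is used.

```python
from collections import Counter


def attribute_risk_stats(community, attributes, blacklist, dimension):
    """Count users and blacklisted users per value of one attribute dimension."""
    totals, risky = Counter(), Counter()
    for user in community:
        value = attributes.get(user, {}).get(dimension, "unknown")
        totals[value] += 1
        if user in blacklist:
            risky[value] += 1
    return {
        value: {
            "users": totals[value],
            "blacklisted": risky[value],
            "blacklist_rate": risky[value] / totals[value],
        }
        for value in totals
    }


community = {"u1", "u2", "u3"}
attributes = {
    "u1": {"city": "Shenzhen"},
    "u2": {"city": "Shenzhen"},
    "u3": {"city": "Shanghai"},
}
stats = attribute_risk_stats(community, attributes, blacklist={"u1"}, dimension="city")
```

The resulting per-value rates could then be attached to each user as features for a single-user risk control system.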
Further, the group fraud risk pre-warning module 15 further includes:
the knowledge graph updating sub-module is used for creating a user information knowledge graph updating thread, and using the user information knowledge graph updating thread to execute the above steps at a preset time frequency so as to update the user information knowledge graph;
the knowledge graph monitoring sub-module is used for creating a user information knowledge graph monitoring thread, and monitoring whether the data volume and the performance parameters of a first graph database corresponding to the user information knowledge graph reach a data volume threshold range and a performance parameter threshold range at the same time by using the user information knowledge graph monitoring thread;
the graph database migration submodule is used for suspending the user information knowledge graph updating thread and the user information knowledge graph monitoring thread if yes, migrating all data of the first graph database to a preset second graph database so that the second graph database replaces the first graph database, wherein the maximum storage data amount of the second graph database is larger than that of the first graph database, and the performance parameters of the second graph database are superior to those of the first graph database;
And the thread restarting sub-module is used for restarting the user information knowledge graph updating thread and the user information knowledge graph monitoring thread.
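The suspend, migrate and restart sequence can be sketched with a `threading.Event` acting as a shared run flag. This is a simplified illustration, not the applicant's implementation: `GraphStore` is a hypothetical in-memory stand-in for a graph database, query latency is assumed to be the monitored performance parameter, and the thresholds (`volume_ratio`, `max_latency_ms`) are invented. In a full version the update and monitoring threads would call `workers_running.wait()` before each cycle so that clearing the flag pauses them.

```python
import threading


class GraphStore:
    """Hypothetical stand-in for a graph database with a fixed capacity."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.data = {}


def needs_migration(store, latency_ms, volume_ratio=0.8, max_latency_ms=200.0):
    """True only when BOTH the data-volume and the performance thresholds are reached."""
    volume_reached = len(store.data) >= volume_ratio * store.capacity
    latency_reached = latency_ms >= max_latency_ms
    return volume_reached and latency_reached


def migrate(first, second, workers_running):
    """Pause the worker threads, copy all data, then let the workers resume."""
    workers_running.clear()          # update/monitor threads block on wait()
    try:
        second.data.update(first.data)
    finally:
        workers_running.set()        # restart both threads
    return second                    # the second store replaces the first


workers_running = threading.Event()
workers_running.set()

first = GraphStore("first", capacity=5)
first.data.update({f"u{i}": {"claims": i} for i in range(4)})
second = GraphStore("second", capacity=50)

active = first
if needs_migration(first, latency_ms=250.0):
    active = migrate(first, second, workers_running)
```

Requiring both thresholds at once mirrors the "at the same time" condition in the text; the `try`/`finally` ensures the threads are restarted even if the copy fails.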
The terms "first" and "second" in the above modules/units serve merely to distinguish different modules/units, and are not used to indicate that any module/unit has higher priority or to impose any other limitation. Furthermore, the terms "comprises," "comprising," and any variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules expressly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article, or apparatus. The division of modules in the present application is only a logical division, and other divisions may be adopted when implemented in a practical application.
For specific limitations on the knowledge-graph-based group fraud risk identification apparatus, reference may be made to the above limitations on the knowledge-graph-based group fraud risk identification method, which are not repeated here. The individual modules in the above knowledge-graph-based group fraud risk identification apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 4. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing data involved in a knowledge-graph-based method for identifying the risk of group fraud. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program when executed by a processor implements a knowledge-graph-based method of identifying risk of group fraud.
In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the knowledge-graph-based method for identifying a group fraud risk in the above embodiment, such as steps S101 to S105 shown in fig. 2, and other extensions of the method and extensions of related steps, when the computer program is executed by the processor. Alternatively, the processor, when executing the computer program, implements the functions of each module/unit of the knowledge-graph-based group fraud risk identification apparatus in the above embodiment, for example, the functions of the modules 11 to 15 shown in fig. 3. In order to avoid repetition, a description thereof is omitted.
The processor may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the computer device and connects the various parts of the computer device through various interfaces and lines.
The memory may be used to store the computer program and/or modules, and the processor implements various functions of the computer device by running or executing the computer program and/or modules stored in the memory and invoking data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, video data, etc.) created according to the use of the computer device.
The memory may be integrated in the processor or may be provided separately from the processor.
In one embodiment, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor, implements the steps of the knowledge-graph-based method for identifying a group fraud risk in the above embodiment, such as steps S101 to S105 shown in fig. 2 and other extensions of the method and extensions of related steps. Alternatively, the computer program when executed by the processor implements the functions of the modules/units of the knowledge-graph-based group fraud risk identification apparatus in the above embodiment, for example, the functions of the modules 11 to 15 shown in fig. 3. In order to avoid repetition, a description thereof is omitted.
Those skilled in the art will appreciate that all or part of the above described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, which, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include Read Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), and Direct Rambus Dynamic RAM (DRDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (10)

1. A method for identifying group fraud risk based on a knowledge graph, characterized by comprising the following steps:
acquiring user information data and service association data of all insurance users from a preset data source, and taking the user information data and the service association data as data to be processed, wherein the data source comprises historical data of each system in an enterprise, public network acquisition data and data provided by third-party institutions;
Sequentially performing entity extraction, attribute extraction and relation extraction on the data to be processed to respectively obtain user named entities, user named entity attributes associated with the user named entities and user named entity association relations among the user named entities;
importing the user named entity, the user named entity attribute and the user named entity association relationship into a deployed first graph database to obtain a user information knowledge graph;
acquiring historical fraud blacklist user data, and obtaining a target group fraud risk association rule according to the historical fraud blacklist user data and the user information knowledge graph through a knowledge graph analysis tool, wherein the target group fraud risk association rule comprises a target user named entity, a target user named entity attribute and a target user named entity association relation;
and sending the target group fraud risk association rule to a corresponding early warning platform so that the early warning platform scans full-platform data in real time according to the target group fraud risk association rule, and if early warning risk abnormality information is generated by real-time scanning, sending the early warning risk abnormality information to related business personnel, wherein the early warning risk abnormality information comprises corresponding user information, user attribute information and user association relation.
2. The method for identifying the risk of group fraud based on a knowledge graph according to claim 1, wherein after sequentially performing entity extraction, attribute extraction and relationship extraction on the data to be processed to obtain a user named entity, a user named entity attribute associated with the user named entity, and a user named entity association relationship between the user named entities, the method further comprises:
carrying out knowledge graph entity alignment processing on the target user named entity, the target user named entity attribute and the target user named entity association relation;
carrying out knowledge graph conflict resolution processing on the target user named entity, the target user named entity attribute and the target user named entity association relation;
and carrying out knowledge graph knowledge fusion processing on the target user named entity, the target user named entity attribute and the target user named entity association relation.
3. The method for identifying risk of group fraud based on knowledge graph according to claim 1, wherein after obtaining historical fraud blacklist user data and obtaining target group fraud risk association rule according to the historical fraud blacklist user data and the user information knowledge graph by a knowledge graph analysis tool, further comprising:
Traversing and processing the user information knowledge graph by using a community discovery algorithm to obtain a target hierarchical community structure corresponding to the user information knowledge graph;
calculating insurance claim index data corresponding to the target hierarchical community structure, and generating corresponding insurance risk control rules according to the target hierarchical community structure and the insurance claim index data;
and sending the insurance risk control rule to an insurance business system needing insurance risk control.
4. The method for identifying a risk of group fraud based on a knowledge graph according to claim 3, wherein after calculating the insurance claim index data corresponding to the target hierarchical community structure and generating the corresponding insurance risk control rule according to the target hierarchical community structure and the insurance claim index data, further comprising:
checking the insurance risk control rule in the user information knowledge graph to obtain an insurance risk control rule checking result;
comparing the insurance risk control rule checking result with the historical fraud blacklist user data, and removing the insurance risk control rule corresponding to any insurance risk control rule checking result which does not contain the historical fraud blacklist user data;
and removing the insurance risk control rule corresponding to any insurance risk control rule checking result which does not meet the preset risk control precision and the preset risk control coverage rate.
5. The method for identifying a risk of group fraud based on a knowledge graph according to claim 3, wherein after traversing the user information knowledge graph by using a community discovery algorithm to obtain a target hierarchical community structure corresponding to the user information knowledge graph, the method further comprises:
taking the user named entity attribute as a statistical dimension, and counting first user attribute risk statistical data corresponding to the target user named entity in the target hierarchical community structure;
extracting first embedding feature risk data corresponding to the target user named entity in the target hierarchical community structure;
and sending the first user attribute risk statistical data and the first embedding feature risk data to an insurance business system containing single-user granularity risk control.
6. The method for identifying the risk of group fraud based on a knowledge graph according to claim 1, wherein after sending the target group fraud risk association rule to the corresponding early warning platform, so that the early warning platform scans full-platform data in real time according to the target group fraud risk association rule, and, if the real-time scanning generates early warning risk abnormality information, sending the early warning risk abnormality information to related business personnel, the method further comprises:
creating a user information knowledge graph updating thread, and using the user information knowledge graph updating thread to execute the above steps at a preset time frequency so as to update the user information knowledge graph;
creating a user information knowledge graph monitoring thread, and monitoring whether the data quantity and the performance parameters of a first graph database corresponding to the user information knowledge graph reach a data quantity threshold range and a performance parameter threshold range simultaneously by using the user information knowledge graph monitoring thread;
if yes, suspending the user information knowledge graph updating thread and the user information knowledge graph monitoring thread, and transferring all data of the first graph database to a preset second graph database so that the second graph database replaces the first graph database, wherein the maximum storage data amount of the second graph database is larger than that of the first graph database, and the performance parameters of the second graph database are superior to those of the first graph database;
restarting the user information knowledge graph updating thread and the user information knowledge graph monitoring thread.
7. A knowledge-graph-based group fraud risk identification device, comprising:
the first data acquisition module is used for acquiring user information data and service association data of all insurance users from a preset data source, and taking the user information data and the service association data as data to be processed, wherein the data source comprises historical data of each system in an enterprise, public network acquisition data and data provided by third-party institutions;
the first data extraction module is used for sequentially carrying out entity extraction, attribute extraction and relation extraction on the data to be processed to respectively obtain user named entities, user named entity attributes associated with the user named entities and user named entity association relations among the user named entities;
the first graph database module is used for importing the user named entity, the user named entity attribute and the user named entity association relationship into a deployed first graph database to obtain a user information knowledge graph;
the first knowledge graph analysis module is used for acquiring historical fraud blacklist user data, and obtaining a target group fraud risk association rule according to the historical fraud blacklist user data and the user information knowledge graph through a knowledge graph analysis tool, wherein the target group fraud risk association rule comprises a target user named entity, a target user named entity attribute and a target user named entity association relation;
and the group fraud risk early warning module is used for sending the target group fraud risk association rule to a corresponding early warning platform so that the early warning platform scans full-platform data in real time according to the target group fraud risk association rule, and if early warning risk abnormality information is generated by the real-time scanning, the early warning risk abnormality information is sent to related business personnel, wherein the early warning risk abnormality information comprises corresponding user information, user attribute information and user association relation.
8. A knowledge-graph-based group fraud risk identification apparatus as in claim 7, wherein said first data extraction module further comprises:
the entity alignment processing sub-module is used for performing knowledge graph entity alignment processing on the target user named entity, the target user named entity attribute and the target user named entity association relation;
the conflict resolution processing sub-module is used for performing knowledge graph conflict resolution processing on the target user named entity, the target user named entity attribute and the target user named entity association relation;
and the knowledge fusion processing sub-module is used for carrying out knowledge graph knowledge fusion processing on the target user named entity, the target user named entity attribute and the target user named entity association relationship.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the knowledge-graph-based group fraud risk identification method as claimed in any one of claims 1 to 6.
10. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of a knowledge-graph based method of identifying a risk of group fraud as claimed in any one of claims 1 to 6.
CN202310295749.2A 2023-03-23 2023-03-23 Knowledge graph-based group fraud risk identification method and related equipment Pending CN116308824A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310295749.2A CN116308824A (en) 2023-03-23 2023-03-23 Knowledge graph-based group fraud risk identification method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310295749.2A CN116308824A (en) 2023-03-23 2023-03-23 Knowledge graph-based group fraud risk identification method and related equipment

Publications (1)

Publication Number Publication Date
CN116308824A true CN116308824A (en) 2023-06-23

Family

ID=86784893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310295749.2A Pending CN116308824A (en) 2023-03-23 2023-03-23 Knowledge graph-based group fraud risk identification method and related equipment

Country Status (1)

Country Link
CN (1) CN116308824A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117575782A (en) * 2024-01-15 2024-02-20 杭银消费金融股份有限公司 Leiden community discovery algorithm-based group fraud identification method
CN117688055A (en) * 2023-11-08 2024-03-12 亿保创元(北京)信息科技有限公司 Insurance black product identification and response system based on correlation network analysis technology
CN117710113A (en) * 2023-11-17 2024-03-15 中国人寿保险股份有限公司山东省分公司 Abnormal insurance application behavior identification method and system based on legal person business knowledge graph
CN117726452A (en) * 2023-12-18 2024-03-19 琥珀投资基金管理(武汉)有限公司 Financial intelligent big data analysis and risk management system

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117688055A (en) * 2023-11-08 2024-03-12 亿保创元(北京)信息科技有限公司 Insurance black product identification and response system based on correlation network analysis technology
CN117710113A (en) * 2023-11-17 2024-03-15 中国人寿保险股份有限公司山东省分公司 Abnormal insurance application behavior identification method and system based on legal person business knowledge graph
CN117726452A (en) * 2023-12-18 2024-03-19 琥珀投资基金管理(武汉)有限公司 Financial intelligent big data analysis and risk management system
CN117575782A (en) * 2024-01-15 2024-02-20 杭银消费金融股份有限公司 Leiden community discovery algorithm-based group fraud identification method
CN117575782B (en) * 2024-01-15 2024-05-07 杭银消费金融股份有限公司 Leiden community discovery algorithm-based group fraud identification method

Similar Documents

Publication Publication Date Title
CN116308824A (en) Knowledge graph-based group fraud risk identification method and related equipment
US20200389495A1 (en) Secure policy-controlled processing and auditing on regulated data sets
CN109558748B (en) Data processing method and device, electronic equipment and storage medium
US9111235B2 (en) Method and system to evaluate risk of configuration changes in an information system
Jeong et al. Anomaly teletraffic intrusion detection systems on hadoop-based platforms: A survey of some problems and solutions
CN111815454B (en) Data uplink method and device, electronic equipment and storage medium
CN110674247A (en) Barrage information intercepting method and device, storage medium and equipment
CN110956269A (en) Data model generation method, device, equipment and computer storage medium
CN112364059B (en) Correlation matching method, device, equipment and storage medium under multi-rule scene
US11645386B2 (en) Systems and methods for automated labeling of subscriber digital event data in a machine learning-based digital threat mitigation platform
CN111488594A (en) Authority checking method and device based on cloud server, storage medium and terminal
US8396877B2 (en) Method and apparatus for generating a fused view of one or more people
CN113157734B (en) Data processing method, device and equipment based on search framework and storage medium
CN112286930A (en) Method, device, storage medium and electronic equipment for resource sharing of redis business side
CN110502549B (en) User data processing method and device, computer equipment and storage medium
US10970341B2 (en) Predictive modeling in event processing systems for big data processing in cloud
CN114969187A (en) Data analysis system and method
CN115051859A (en) Information analysis method, information analysis device, electronic apparatus, and medium
CN114860732A (en) Key report processing method and device, computer equipment and storage medium
CN112632371A (en) Anti-fraud method and system for banking business
CN111163088B (en) Message processing method, system and device and electronic equipment
CN112667730B (en) External data verification method, system, equipment and storage medium
CN116703184B (en) Data processing method, data processing device, electronic equipment and readable storage medium
CN116112264B (en) Method and device for controlling access to strategy hidden big data based on blockchain
US20220038892A1 (en) Mathematical Summaries of Telecommunications Data for Data Analytics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination