CN115438979B - Expert model decision-fused data risk identification method and server - Google Patents

Expert model decision-fused data risk identification method and server

Info

Publication number
CN115438979B
Authority
CN
China
Prior art keywords
event
risk
behavior
suspected
suspected risk
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211117623.8A
Other languages
Chinese (zh)
Other versions
CN115438979A (en
Inventor
代洪立
武传华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Spread Technology Co ltd
Original Assignee
Shenzhen Spread Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Spread Technology Co ltd
Priority to CN202211117623.8A
Publication of CN115438979A
Application granted
Publication of CN115438979B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q30/0185 Product, service or business identity fraud
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Finance (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a data risk identification method and a server for fusing expert model decisions. Risk behavior decision knowledge, behavior event distribution characteristics and behavior event distinguishing labels of suspected risk behavior events are mined from abnormal user behavior records; a plurality of suspected risk behavior events corresponding to the same selected suspected risk behavior event are determined; and a risk state updating relation network of that selected suspected risk behavior event is determined. In other words, the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing labels are analyzed jointly, and the risk state updating relation network of the selected suspected risk behavior event is determined from these three knowledge attributes. This improves both the accuracy of determining whether a suspected risk behavior event poses a potential risk hazard and the richness of the risk state updating relation network of the selected suspected risk behavior event, thereby ensuring the accuracy and reliability of data risk identification.

Description

Expert model decision-fused data risk identification method and server
Technical Field
The invention relates to the technical field of data processing, in particular to a data risk identification method and a server for fusing expert model decisions.
Background
An expert system applies artificial intelligence and computer technology to reason and make judgments from the knowledge and experience stored in the system, simulating the decision process of a human expert. It is a computer program system that simulates human experts in solving problems in a given field.
An expert system is typically composed of six parts: a human-machine interaction interface, a knowledge base, an inference engine, an interpreter, a comprehensive database, and a knowledge acquisition component. The architecture of an expert system varies with its type, function, and size.
At present, expert systems are widely applied in fields such as data pushing and interest analysis, but their application in the field of data security processing is immature. For some data risk identification tasks, for example, the conventional technology has difficulty ensuring identification accuracy and reliability.
Disclosure of Invention
The invention provides a data risk identification method and a server for fusing expert model decisions; to achieve this technical purpose, the following technical scheme is adopted.
The first aspect is a data risk identification method integrating expert model decisions, applied to an artificial intelligence server, comprising: acquiring abnormal business activity big data, and screening at least two abnormal user behavior records from the abnormal business activity big data according to a set screening instruction; mining risk behavior decision knowledge of each suspected risk behavior event in each abnormal user behavior record, and respectively determining behavior event distribution characteristics and behavior event distinguishing labels of each suspected risk behavior event in each abnormal user behavior record; determining to-be-processed suspected risk behavior events corresponding to the same selected suspected risk behavior event in the at least two abnormal user behavior records according to risk behavior decision knowledge, behavior event distribution characteristics and behavior event distinguishing labels of each abnormal user behavior record; and determining a risk state updating relation network of the selected suspected risk behavior event according to behavior event distribution characteristics of the suspected risk behavior event to be processed, which correspond to the selected suspected risk behavior event in the at least two abnormal user behavior records.
In the embodiment of the invention, the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing labels of the suspected risk behavior events are mined from the abnormal user behavior records, a plurality of suspected risk behavior events corresponding to the same selected suspected risk behavior event are determined, and the risk state updating relation network of that selected suspected risk behavior event is determined. In other words, the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing labels are analyzed jointly, and the risk state updating relation network of the selected suspected risk behavior event is determined from these three knowledge attributes. This improves both the accuracy of determining whether a suspected risk behavior event poses a potential risk hazard and the richness of the risk state updating relation network of the selected suspected risk behavior event, thereby ensuring the accuracy and reliability of data risk identification.
Under some independent design ideas, the at least two abnormal user behavior records comprise a first abnormal user behavior record and a second abnormal user behavior record, wherein the first abnormal user behavior record and the second abnormal user behavior record are abnormal user behavior records with time sequence association; determining the to-be-processed suspected risk behavior event corresponding to the same selected suspected risk behavior event in the at least two abnormal user behavior records according to the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing label of each abnormal user behavior record, including: determining at least one first suspected risk behavior event in the first abnormal user behavior record and at least one second suspected risk behavior event in the second abnormal user behavior record; for each first suspected risk behavior event, determining a behavior feature commonality score between the first suspected risk behavior event and each second suspected risk behavior event according to the behavior event distribution features, the behavior event distinguishing labels and the risk behavior decision knowledge of the first suspected risk behavior event in the first abnormal user behavior record and the behavior event distribution features, the behavior event distinguishing labels and the risk behavior decision knowledge of each second suspected risk behavior event in the second abnormal user behavior record; and determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event by combining the behavior feature commonality scores, and determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event as the to-be-processed suspected risk behavior event.
In the embodiment of the invention, since each abnormal user behavior record may contain a plurality of suspected risk behavior events, the behavior feature commonality score between each first suspected risk behavior event and each second suspected risk behavior event can be determined according to the behavior event distribution feature, the behavior event distinguishing label and the risk behavior decision knowledge. Every first suspected risk behavior event is thus analyzed exhaustively, which avoids omitting events and improves the accuracy of determining the to-be-processed suspected risk behavior events.
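The exhaustive pairwise scoring described above can be illustrated with a minimal sketch. The patent does not specify how the three knowledge attributes are compared or combined, so everything here is an assumption: cosine similarity for the decision knowledge vectors, closeness of window centres for the distribution feature, exact match for the distinguishing label, and a weighted sum with arbitrary weights. The names `cosine`, `commonality` and `score_all_pairs` are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def commonality(first, second, weights=(0.5, 0.3, 0.2)):
    """Weighted fusion of three per-attribute similarities (weights are assumed)."""
    w_know, w_dist, w_tag = weights
    know = cosine(first["knowledge"], second["knowledge"])
    # Assumed distribution similarity: closeness of window centres, in (0, 1].
    dx = first["center"][0] - second["center"][0]
    dy = first["center"][1] - second["center"][1]
    dist = 1.0 / (1.0 + sqrt(dx * dx + dy * dy))
    # Assumed label similarity: 1.0 on exact match, otherwise 0.0.
    tag = 1.0 if first["tag"] == second["tag"] else 0.0
    return w_know * know + w_dist * dist + w_tag * tag


def score_all_pairs(first_events, second_events):
    """Score every first event against every second event; no pair is skipped."""
    return {
        (a["name"], b["name"]): commonality(a, b)
        for a in first_events
        for b in second_events
    }


record1 = [
    {"name": "case_a1", "knowledge": (1.0, 0.0), "center": (2.0, 2.0), "tag": "DDoS attack"},
    {"name": "case_a2", "knowledge": (0.0, 1.0), "center": (8.0, 8.0), "tag": "Trojan horse attack"},
]
record2 = [
    {"name": "case_b1", "knowledge": (1.0, 0.0), "center": (2.0, 2.0), "tag": "DDoS attack"},
    {"name": "case_b2", "knowledge": (0.0, 1.0), "center": (8.0, 8.0), "tag": "Trojan horse attack"},
]
scores = score_all_pairs(record1, record2)
```

In this toy data, matching events share all three attributes and therefore receive the highest scores, while mismatched pairs score close to zero.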
Under some independent design ideas, the determining, for each first suspected risk behavior event, a behavior feature commonality score between the first suspected risk behavior event and each second suspected risk behavior event according to the behavior event distribution feature, the behavior event distinguishing tag and the risk behavior decision knowledge of the first suspected risk behavior event in the first abnormal user behavior record, and the behavior event distribution feature, the behavior event distinguishing tag and the risk behavior decision knowledge of each second suspected risk behavior event in the second abnormal user behavior record, includes: for each first suspected risk behavior event, determining at least one second selected suspected risk behavior event from the second abnormal user behavior record according to at least one type of target knowledge attribute among the risk behavior decision knowledge, the behavior event distribution feature and the behavior event distinguishing tag; determining a behavior feature commonality score between the first suspected risk behavior event and the second selected suspected risk behavior event according to the risk behavior decision knowledge, the behavior event distribution feature and the behavior event distinguishing tag; the step of determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event by combining the behavior feature commonality scores, and determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event as the to-be-processed suspected risk behavior event, includes: determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event according to the behavior feature commonality scores between the first suspected risk behavior event and the second selected suspected risk behavior event, and determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event as the to-be-processed suspected risk behavior event.
In the embodiment of the invention, for each first suspected risk behavior event, at least one second selected suspected risk behavior event in the second abnormal user behavior record is determined according to at least one type of target knowledge attribute among the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing label. In other words, the range of second suspected risk behavior events to be considered in the second abnormal user behavior record can be narrowed according to the at least one type of target knowledge attribute, so that unnecessary feature commonality scoring operations are avoided and the behavior feature commonality score between the first suspected risk behavior event and each second suspected risk behavior event can be calculated more quickly.
Under some independent design ideas, for each first suspected risk behavior event, determining, according to the risk behavior decision knowledge, the behavior event distribution feature and at least one type of target knowledge attribute in the behavior event distinguishing tag, at least one second selected suspected risk behavior event from the second abnormal user behavior record, including: on the basis that the difference value of the acquisition moments of the first abnormal user behavior record and the second abnormal user behavior record is smaller than a set time difference judgment value, determining a distribution feature commonality score of the first suspected risk behavior event and each second suspected risk behavior event according to the behavior event distribution feature of the first suspected risk behavior event in the first abnormal user behavior record and the behavior event distribution feature of each second suspected risk behavior event in the second abnormal user behavior record for each first suspected risk behavior event; determining whether the distribution feature commonality scores of the first suspected risk behavior event and each second suspected risk behavior event meet a first set requirement; and determining a second suspected risk behavior event corresponding to the distribution feature commonality score meeting the first setting requirement as the second selected suspected risk behavior event.
In the embodiment of the invention, whether the distribution feature commonality scores of the first suspected risk behavior event and each second suspected risk behavior event meet the first setting requirement is determined, and the second suspected risk behavior event corresponding to a distribution feature commonality score meeting the first setting requirement is determined as the second selected suspected risk behavior event. In other words, the second suspected risk behavior events whose distribution feature commonality scores do not meet the first setting requirement are filtered out, so that the second selected suspected risk behavior event can be determined more quickly.
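A minimal sketch of this distribution-based filtering follows. The patent does not define the distribution feature commonality score or the first setting requirement; this sketch assumes the score is the intersection-over-union (IoU) of two recognition windows and the requirement is a minimum IoU. The function names, the `(x_min, y_min, x_max, y_max)` window layout, the default thresholds, and the choice to return `None` when the time gate is not satisfied are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two windows given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_by_distribution(first_window, second_windows, t_first, t_second,
                           max_time_gap=60.0, min_score=0.5):
    """Return names of second events whose window overlaps the first one enough.

    Returns None when the acquisition times of the two records differ by
    max_time_gap or more, i.e. when this filter is assumed not to apply.
    """
    if abs(t_first - t_second) >= max_time_gap:
        return None
    return [
        name
        for name, window in second_windows.items()
        if iou(first_window, window) >= min_score
    ]


first_window = (0.0, 0.0, 4.0, 4.0)
second_windows = {
    "case_b1": (1.0, 0.0, 5.0, 4.0),      # large overlap with the first window
    "case_b2": (10.0, 10.0, 14.0, 14.0),  # no overlap
    "case_b3": (3.0, 3.0, 7.0, 7.0),      # small overlap
}
selected = select_by_distribution(first_window, second_windows, t_first=100.0, t_second=130.0)
```

Only `case_b1` passes the assumed threshold here, so only one pair needs the full behavior feature commonality score afterwards.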
Under some independent design ideas, for each first suspected risk behavior event, determining, according to at least one type of target knowledge attribute among the risk behavior decision knowledge, the behavior event distribution feature and the behavior event distinguishing tag, at least one second selected suspected risk behavior event from the second abnormal user behavior record, including: for each first suspected risk behavior event, determining a distinguishing tag commonality score between the first suspected risk behavior event and each second suspected risk behavior event according to the behavior event distinguishing tag of the first suspected risk behavior event and the behavior event distinguishing tag of each second suspected risk behavior event; determining whether the distinguishing tag commonality score between the first suspected risk behavior event and each second suspected risk behavior event meets a second setting requirement; and determining a second suspected risk behavior event corresponding to a distinguishing tag commonality score meeting the second setting requirement as the second selected suspected risk behavior event.
In the embodiment of the invention, besides narrowing the range of second suspected risk behavior events through the first setting requirement, the second suspected risk behavior event corresponding to a distinguishing tag commonality score meeting the second setting requirement can be determined as the second selected suspected risk behavior event. In other words, the second suspected risk behavior events whose distinguishing tag commonality scores do not meet the second setting requirement are filtered out, which makes the determination of the second selected suspected risk behavior event more flexible and faster.
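The tag-based filtering can be sketched in the same spirit. The patent does not say how the distinguishing tag commonality score is computed; this sketch assumes a score of 1.0 for identical tags, 0.5 for tags that fall into the same coarse family, and 0.0 otherwise. The `TAG_FAMILY` grouping, the function names and the default threshold are purely illustrative assumptions.

```python
# Hypothetical grouping of event categories into coarse families (an assumption).
TAG_FAMILY = {
    "privacy data theft": "data theft",
    "network telecom fraud": "fraud",
    "Trojan horse attack": "network attack",
    "DDoS attack": "network attack",
}


def tag_commonality(tag_a, tag_b):
    """1.0 for identical tags, 0.5 for tags in the same assumed family, else 0.0."""
    if tag_a == tag_b:
        return 1.0
    family_a = TAG_FAMILY.get(tag_a)
    if family_a is not None and family_a == TAG_FAMILY.get(tag_b):
        return 0.5
    return 0.0


def select_by_tag(first_tag, second_tags, min_score=0.5):
    """Keep the second events whose tag commonality reaches the assumed threshold."""
    return [
        name
        for name, tag in second_tags.items()
        if tag_commonality(first_tag, tag) >= min_score
    ]


second_tags = {
    "case_b1": "DDoS attack",
    "case_b2": "privacy data theft",
    "case_b3": "Trojan horse attack",
}
selected = select_by_tag("DDoS attack", second_tags)
strict = select_by_tag("DDoS attack", second_tags, min_score=1.0)
```

Raising `min_score` to 1.0 reduces the filter to an exact category match, which is the strictest reading of the second setting requirement.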
Under some independent design ideas, the determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event by combining the behavior feature commonality scores includes: determining whether association is completed between the first suspected risk behavior event and each second suspected risk behavior event by combining the behavior feature commonality score and a first event association indication; and determining the first suspected risk behavior event and the second suspected risk behavior event for which association is completed as the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event.
In the embodiment of the invention, event association is performed between the first suspected risk behavior event and each second suspected risk behavior event by combining the behavior feature commonality score and the first event association indication, so that the accuracy and credibility of the event association can be improved, which in turn improves the accuracy of determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event.
Under some independent design ideas, the determining the first suspected risk behavior event and the second suspected risk behavior event for which association is completed as the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event includes: determining the first suspected risk behavior event and the second suspected risk behavior event for which association is completed as a suspected risk behavior event doublet; on the basis that the number of the suspected risk behavior event doublets is two or more, determining a selected suspected risk behavior event doublet from the plurality of suspected risk behavior event doublets according to a second event association indication, and determining the first suspected risk behavior event and the second suspected risk behavior event in the selected suspected risk behavior event doublet as the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event.
In the embodiment of the invention, on the basis that the number of the suspected risk behavior event doublets is two or more, the selected suspected risk behavior event doublet is determined according to the second event association indication. In other words, the doublet with the highest degree of association can be selected from the plurality of suspected risk behavior event doublets, so that the accuracy of determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event can be improved.
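The two association steps can be sketched as follows. The patent leaves both event association indications undefined; this sketch assumes the first indication is a minimum behavior feature commonality score and the second indication is "pick the doublet with the highest score". The function names, the threshold value and the score values are hypothetical.

```python
def associate(scores, min_score=0.7):
    """Form doublets (first, second, score) for pairs reaching the assumed threshold."""
    return [(a, b, s) for (a, b), s in scores.items() if s >= min_score]


def select_doublet(doublets):
    """Pick the doublet with the highest score; None if there are no doublets."""
    if not doublets:
        return None
    return max(doublets, key=lambda doublet: doublet[2])


# Illustrative scores for one first event against three second events.
scores = {
    ("case_a1", "case_b1"): 0.92,
    ("case_a1", "case_b2"): 0.75,
    ("case_a1", "case_b3"): 0.30,
}
doublets = associate(scores)
chosen = select_doublet(doublets)
```

Two doublets survive the threshold in this example, so the selection step is what resolves `case_a1` to a single counterpart.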
Under some independent design ideas, after determining the risk state updating relation network of the selected suspected risk behavior event according to the behavior event distribution characteristics of the to-be-processed suspected risk behavior events corresponding to the selected suspected risk behavior event in the plurality of abnormal user behavior records, the method further includes: determining a risk hidden danger prediction report of the selected suspected risk behavior event according to the risk state updating relation network of the selected suspected risk behavior event.
In the embodiment of the invention, the risk hidden danger prediction report is determined according to the complete risk state updating relation network of the selected suspected risk behavior event, so that the accuracy of determining the risk hidden danger prediction report can be improved, the targeted risk protection processing is conveniently carried out by combining the risk hidden danger prediction report, and the safety of service information is ensured.
A second aspect is an artificial intelligence server comprising a memory and a processor; the memory is coupled to the processor; the memory is used for storing computer program codes, and the computer program codes comprise computer instructions; wherein the computer instructions, when executed by the processor, cause the artificial intelligence server to perform the method of the first aspect.
A third aspect is a computer readable storage medium having stored thereon a computer program which, when run, performs the method of the first aspect.
Drawings
Fig. 1 is a schematic flow chart of a data risk identification method for fusing expert model decisions according to an embodiment of the present invention.
Fig. 2 is a block diagram of a data risk recognition device for fusing expert model decisions according to an embodiment of the present invention.
Detailed Description
Hereinafter, the terms "first," "second," and "third," etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", or "a third", etc., may explicitly or implicitly include one or more such feature.
Fig. 1 shows a flow chart of a data risk identification method for fusing expert model decisions, which is provided by an embodiment of the present invention, and the data risk identification method for fusing expert model decisions may be implemented by an artificial intelligence server, where the artificial intelligence server may include a memory and a processor; the memory is coupled to the processor; the memory is used for storing computer program codes, and the computer program codes comprise computer instructions; wherein the computer instructions, when executed by the processor, cause the artificial intelligence server to perform the technical scheme described in the following steps.
Step 101, acquiring abnormal business activity big data, and screening at least two abnormal user behavior records from the abnormal business activity big data according to a set screening instruction.
The set screening instruction may be a flexibly set behavior record screening step (screening interval period), a flexibly set number of behavior records to screen, a flexibly set screening start time, or the like.
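As a minimal sketch of such a screening instruction, the following assumes that the start time, the screening step and the record count are combined in one instruction and that records are timestamped and sorted. The patent lists these parameters only as alternatives, so the `ScreeningInstruction` class, the `screen_records` function and the record layout are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningInstruction:
    start_time: float  # screening start time
    step: float        # screening interval period between screened records
    count: int         # number of records to screen (at least two in the method)


def screen_records(records, instruction):
    """Pick up to `count` records spaced at least `step` apart, from `start_time` on.

    `records` is a list of (timestamp, payload) tuples sorted by timestamp.
    """
    picked = []
    next_time = instruction.start_time
    for timestamp, payload in records:
        if len(picked) >= instruction.count:
            break
        if timestamp >= next_time:
            picked.append((timestamp, payload))
            next_time = timestamp + instruction.step
    return picked


records = [(0, "r0"), (10, "r1"), (20, "r2"), (30, "r3"), (40, "r4"), (50, "r5")]
instruction = ScreeningInstruction(start_time=10, step=20, count=2)
screened = screen_records(records, instruction)
```

With these values the two screened records are 20 time units apart, which gives the time sequence association between records that the later matching steps rely on.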
In some examples, when screening at least two abnormal user behavior records, the abnormal business activity big data may be split by an artificial intelligence technology, and the abnormal user behavior records are screened according to a set screening instruction, so that at least two abnormal user behavior records may be determined from the abnormal business activity big data.
The abnormal business activity big data can be acquired and detected by at least one data safety monitoring system/server/platform. In the embodiment of the invention, the number of the data security monitoring systems/servers/platforms can be two.
In other examples, the abnormal business activity big data may be activity big data of various digital businesses. Such big data carries data risk and may pose a threat to data information, so targeted analysis and processing are required. For example, the abnormal business activity big data may be abnormal or risky e-commerce big data, cloud service big data, intelligent interconnection business big data, etc., and those skilled in the art may match the actual application scenario of the abnormal business activity big data according to the actual situation, which is not described herein.
Step 102, mining risk behavior decision knowledge of each suspected risk behavior event in each abnormal user behavior record, and respectively determining behavior event distribution characteristics and behavior event distinguishing labels of each suspected risk behavior event in each abnormal user behavior record.
For the embodiment of the invention, risk behavior decision knowledge can be understood as description vectors/description fields mined by a flexibly configured knowledge mining technique.
In some examples, knowledge mining is one of the important links in AI technology, and can be implemented in conjunction with expert decision systems and neural network models. For example, risk behavior decision knowledge may be mined by a flexibly set knowledge mining model, which may be CNN, RNN, DNN, LSTM, or the like.
Further, the behavior event distribution feature may be obtained by marking out an event recognition window. In this embodiment of the invention, the behavior event distribution feature may include recognition window information of a suspected risk behavior event, which includes a first window constraint value (such as a window height value) and a second window constraint value (such as a window width value) of the recognition window, a window label of the recognition window, and the distribution of the four window corners of the event recognition window. In other words, the distribution variables of the event recognition window serve as the behavior event distribution feature.
Further, the behavior event distinguishing tag may be understood as the actual category corresponding to the suspected risk behavior event, for example privacy data theft, network telecom fraud, Trojan horse attack or DDoS attack.
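One possible in-memory representation of this recognition window information is sketched here. The patent names the fields (height, width, window label, four corners) but not their layout, so the `RecognitionWindow` class, the top-left anchoring and the coordinate convention are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionWindow:
    label: str     # window label of the recognition window
    left: float    # x coordinate of the top-left corner (assumed convention)
    top: float     # y coordinate of the top-left corner (assumed convention)
    width: float   # second window constraint value
    height: float  # first window constraint value

    def corners(self):
        """Return the four window corners: top-left, top-right, bottom-left, bottom-right."""
        right = self.left + self.width
        bottom = self.top + self.height
        return (
            (self.left, self.top),
            (right, self.top),
            (self.left, bottom),
            (right, bottom),
        )


window = RecognitionWindow(label="case_a1", left=1.0, top=2.0, width=3.0, height=4.0)
```

Storing one corner plus width and height keeps the four corners derivable rather than redundant; storing all four corners explicitly would be an equally valid choice.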
On the basis of the above, after obtaining at least two abnormal user behavior records, the distribution analysis of the suspected risk behavior events and the label classification of the suspected risk behavior events can be respectively carried out on each abnormal user behavior record, so that the behavior event distribution characteristics and the behavior event distinguishing labels of each suspected risk behavior event are obtained from the abnormal user behavior records.
In some examples, the distribution analysis and the label classification of the suspected risk behavior events in an abnormal user behavior record may be performed by manual annotation, which ensures the accuracy of the determined behavior event distribution characteristics and behavior event distinguishing labels. Alternatively, fully configured machine learning models, one for the distribution analysis of suspected risk behavior events and one for the label classification of suspected risk behavior events, may be used, which can improve the accuracy of both the distribution analysis and the label classification.
It is understood that, when the distribution analysis of the suspected risk behavior events and the tag classification of the suspected risk behavior events are performed on the abnormal user behavior record, noise may exist in the abnormal user behavior record, and noise information may be ignored.
Step 103, determining to-be-processed suspected risk behavior events corresponding to the same selected suspected risk behavior event in the at least two abnormal user behavior records according to the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing labels of each abnormal user behavior record.
It can be understood that, after the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing labels of each abnormal user behavior record are obtained, the suspected risk behavior events of different abnormal user behavior records can be associated and matched through these three items, so as to determine the to-be-processed suspected risk behavior events that belong to the same selected suspected risk behavior event.
Take the abnormal user behavior record1 and the abnormal user behavior record2 as an example. The abnormal user behavior record1 includes the suspected risk behavior events case_a1, case_a2 and case_a3, and the abnormal user behavior record2 includes the suspected risk behavior events case_b1, case_b2 and case_b3. Suppose that event association through the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing labels of each suspected risk behavior event shows that case_a1 and case_b1 correspond to the same selected suspected risk behavior event, that case_a2 and case_b2 correspond to the same selected suspected risk behavior event, and that case_a3 and case_b3 correspond to the same selected suspected risk behavior event. Then case_a1 and case_b1 are the to-be-processed suspected risk behavior events of the selected suspected risk behavior event target1 in record1 and record2 respectively, case_a2 and case_b2 are those of the selected suspected risk behavior event target2, and case_a3 and case_b3 are those of the selected suspected risk behavior event target3.
Step 104, determining a risk state updating relationship network of the selected suspected risk behavior event according to the behavior event distribution characteristics of the suspected risk behavior event to be processed, which corresponds to the selected suspected risk behavior event in the at least two abnormal user behavior records, respectively.
Further, the risk state update relationship network of the selected suspected risk behavior event can be understood as the event state changes of the selected suspected risk behavior event and may be represented by an ordered knowledge graph. In this way, the risk state features of the selected suspected risk behavior event can be integrated in series, so that the change of the event (such as the change of the operation behavior features or interaction activity features of a network fraud event in each link) is determined at the overall level; this change can reflect risk hidden dangers that may occur later.
It may be understood that, after the to-be-processed suspected risk behavior events corresponding to the selected suspected risk behavior event have been associated in each abnormal user behavior record, risk state integration and similar processing may be performed on the behavior event distribution features of each to-be-processed suspected risk behavior event in the abnormal business activity big data, so as to determine the risk state update relationship network of the selected suspected risk behavior event.
According to the data risk identification method fusing expert model decisions described above, the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing labels of the suspected risk behavior events are mined from the abnormal user behavior records, and the plurality of suspected risk behavior events corresponding to the same selected suspected risk behavior event are determined through linkage analysis of these three knowledge attributes. The risk state update relationship network of the selected suspected risk behavior event is likewise determined on the basis of the three knowledge attributes. This helps improve both the accuracy of determining whether a suspected risk behavior event carries a risk hidden danger and the richness of the risk state update relationship network of the selected suspected risk behavior event, thereby ensuring the accuracy and reliability of data risk identification.
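The serial integration of risk states can be pictured with a minimal sketch. It assumes, purely for illustration, that each to-be-processed suspected risk behavior event contributes one (acquisition time, state feature) observation and that the ordered graph is a simple chain of consecutive states; the function name `build_state_chain` and the state strings are hypothetical and do not come from the source.

```python
from typing import List, Tuple


def build_state_chain(observations: List[Tuple[float, str]]) -> List[Tuple[str, str]]:
    """Order the observed states by acquisition time and link consecutive states.

    Each returned pair (earlier_state, later_state) is one directed edge of
    the ordered chain that stands in for the relationship network.
    """
    ordered = [state for _, state in sorted(observations, key=lambda obs: obs[0])]
    return list(zip(ordered, ordered[1:]))


# Hypothetical states of one selected event observed in three records.
edges = build_state_chain([(2.0, "state_in_record2"),
                          (1.0, "state_in_record1"),
                          (3.0, "state_in_record3")])
print(edges)
```

A real relationship network may carry richer edge attributes; the chain only shows the time ordering.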
Under some independent design ideas, for step 103, the at least two abnormal user behavior records include a first abnormal user behavior record and a second abnormal user behavior record, where the first abnormal user behavior record and the second abnormal user behavior record are abnormal user behavior records with time sequence association; when determining the to-be-processed suspected risk behavior event corresponding to the same selected suspected risk behavior event in the at least two abnormal user behavior records according to the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing label of each abnormal user behavior record, the method further may include the following steps 1031-1033.
Step 1031, determining at least one first suspected risk behavior event in the first abnormal user behavior record and at least one second suspected risk behavior event in the second abnormal user behavior record.
Further, to improve the efficiency of event analysis, the plurality of abnormal user behavior records may be grouped: every two consecutive abnormal user behavior records form one group of abnormal user behavior records with time sequence association. In other words, the abnormal user behavior records are split into groups, each of which contains two consecutive abnormal user behavior records. Within a group, either abnormal user behavior record may be used as the first abnormal user behavior record and the other as the second abnormal user behavior record. Then, according to the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing labels, at least one first suspected risk behavior event is determined in the first abnormal user behavior record and at least one second suspected risk behavior event is determined in the second abnormal user behavior record.
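The grouping of consecutive records can be sketched as follows. The sketch assumes that the records are already sorted by acquisition time and that the groups overlap (record 2 is the second record of one group and the first record of the next); the source does not state whether groups overlap, and the name `pair_consecutive` is hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def pair_consecutive(records: Sequence[T]) -> List[Tuple[T, T]]:
    """Return (first_record, second_record) groups of consecutive records."""
    return [(records[i], records[i + 1]) for i in range(len(records) - 1)]


groups = pair_consecutive(["record1", "record2", "record3"])
print(groups)
```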
Step 1032, for each first suspected risk behavior event, determining a behavior feature commonality score between the first suspected risk behavior event and each second suspected risk behavior event according to the behavior event distribution feature, the behavior event distinguishing tag and the risk behavior decision knowledge of the first suspected risk behavior event in the first abnormal user behavior record, and the behavior event distribution feature, the behavior event distinguishing tag and the risk behavior decision knowledge of each second suspected risk behavior event in the second abnormal user behavior record.
In the embodiment of the invention, for each first suspected risk behavior event in the first abnormal user behavior record, in order to accurately and reliably correlate and match to obtain a second suspected risk behavior event of the same selected suspected risk behavior event corresponding to the first suspected risk behavior event, event correlation is required between the first suspected risk behavior event and each second suspected risk behavior event in the second abnormal user behavior record, and in the event correlation process, a behavior feature commonality score between the first suspected risk behavior event and each second suspected risk behavior event can be determined according to behavior event distribution characteristics, behavior event distinguishing labels and risk behavior decision knowledge of the first suspected risk behavior event and each second suspected risk behavior event.
Step 1033, determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event by combining the behavior feature commonality scores, and determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event as the to-be-processed suspected risk behavior event.
For example, the behavior feature commonality score may reflect the similarity between two suspected risk behavior events. After the behavior feature commonality scores are determined, the first suspected risk behavior events and the second suspected risk behavior events may be associated according to the behavior feature commonality score between each first suspected risk behavior event and each second suspected risk behavior event, so as to determine the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event. In other words, the first suspected risk behavior event and the second suspected risk behavior event with the largest behavior feature commonality score are determined as the same selected suspected risk behavior event, which helps improve the accuracy of determining the to-be-processed suspected risk behavior events.
In the embodiment of the invention, since a plurality of suspected risk behavior events may exist in each abnormal user behavior record, the behavior feature commonality score between each first suspected risk behavior event and each second suspected risk behavior event can be determined according to the behavior event distribution features, the behavior event distinguishing labels and the risk behavior decision knowledge. In this way, an exhaustive analysis (traversal) of every first suspected risk behavior event is achieved and event omissions are avoided, which further helps improve the accuracy of determining the to-be-processed suspected risk behavior events.
Under some independent design ideas, since a plurality of suspected risk behavior events may exist in each abnormal user behavior record, determining a behavior feature commonality score between each first suspected risk behavior event in the first abnormal user behavior record and every second suspected risk behavior event makes the computation of the behavior feature commonality scores costly. Based on this, for step 1032 and step 1033, determining the behavior feature commonality score between the first suspected risk behavior event and each second suspected risk behavior event, and determining, by combining the behavior feature commonality scores, the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event as the to-be-processed suspected risk behavior events, may include the following steps 201 to 203.
Step 201, for each first suspected risk behavior event, determining at least one second selected suspected risk behavior event from the second abnormal user behavior record according to at least one class of target knowledge attribute among the risk behavior decision knowledge, the behavior event distribution features and the behavior event distinguishing labels.
Further, the at least one second selected suspected risk behavior event is one or more of the plurality of second suspected risk behavior events in the second abnormal user behavior record. In other words, at least one second selected suspected risk behavior event is determined according to at least one class of target knowledge attribute among the risk behavior decision knowledge, the behavior event distribution features and the behavior event distinguishing labels (a knowledge attribute can be understood as a type of feature or knowledge). By means of the at least one class of target knowledge attribute, the determination region (a local part of the abnormal user behavior record) of the second suspected risk behavior events to be associated in the second abnormal user behavior record can be narrowed, so that unnecessary commonality score operations are avoided and the timeliness of calculating the behavior feature commonality scores between the first suspected risk behavior event and the second suspected risk behavior events is improved.
Step 202, determining a behavior feature commonality score between the first suspected risk behavior event and the second selected suspected risk behavior event according to the risk behavior decision knowledge, the behavior event distribution features and the behavior event distinguishing labels.
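The traversal described above can be sketched as a nested comparison. This is a minimal illustration, assuming that the behavior feature commonality score is supplied as a callable and that events are identified by strings; the name `best_matches` and the score values are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple


def best_matches(
    first_events: Sequence[str],
    second_events: Sequence[str],
    score: Callable[[str, str], float],
) -> Dict[str, Tuple[str, float]]:
    """For every first event, score all second events and keep the largest score."""
    matches: Dict[str, Tuple[str, float]] = {}
    for first in first_events:
        if not second_events:
            continue  # nothing to compare against
        best = max(second_events, key=lambda second: score(first, second))
        matches[first] = (best, score(first, best))
    return matches


# Hypothetical commonality scores between events of record1 and record2.
toy_scores = {
    ("case_a1", "case_b1"): 0.9, ("case_a1", "case_b2"): 0.4,
    ("case_a2", "case_b1"): 0.3, ("case_a2", "case_b2"): 0.8,
}
result = best_matches(["case_a1", "case_a2"], ["case_b1", "case_b2"],
                      lambda a, b: toy_scores[(a, b)])
print(result)
```

The cost grows with the product of the two event counts, which is what the candidate pruning of steps 201 to 203 aims to reduce.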
Step 203, determining the first suspected risk behavior event and the second selected suspected risk behavior event corresponding to the same selected suspected risk behavior event by combining the behavior feature commonality scores between the first suspected risk behavior event and the second selected suspected risk behavior event, and determining the first suspected risk behavior event and the second selected suspected risk behavior event corresponding to the same selected suspected risk behavior event as the to-be-processed suspected risk behavior event.
In some examples, in determining the behavioral characteristic commonality score between the first suspected risk behavioral event and the second selected suspected risk behavioral event, a vector similarity of the first suspected risk behavioral event and the second selected suspected risk behavioral event may be determined first from risk behavioral decision knowledge of the first suspected risk behavioral event in the first abnormal user behavior record and risk behavioral decision knowledge of the second selected suspected risk behavioral event in the second abnormal user behavior record.
In some examples, a Euclidean distance between the two items of risk behavior decision knowledge may be determined according to the risk behavior decision knowledge of the first suspected risk behavior event in the first abnormal user behavior record and the risk behavior decision knowledge of the second selected suspected risk behavior event in the second abnormal user behavior record, and the vector similarity of the first suspected risk behavior event and the second selected suspected risk behavior event may be determined according to the Euclidean distance. The vector similarity may also be calculated in other ways, which are not described in detail herein.
Then, a distribution feature commonality score of the first suspected risk behavior event and the second selected suspected risk behavior event is determined according to the behavior event distribution feature of the first suspected risk behavior event in the first abnormal user behavior record and the behavior event distribution feature of the second selected suspected risk behavior event in the second abnormal user behavior record.
For example, the distribution feature commonality score between the first suspected risk behavior event and the second selected suspected risk behavior event may be determined from the positioning variables indicated by the behavior event distribution features. In other words, it is judged from the event change characteristics of the suspected risk behavior events whether the distribution variable of the first suspected risk behavior event and the distribution variable of the second selected suspected risk behavior event are distribution change information of the same suspected risk behavior event at different time nodes.
In the embodiment of the invention, the behavior event distribution feature may include identification window information of a suspected risk behavior event. The identification window information includes a first window constraint value and a second window constraint value of the identification window of the suspected risk behavior event, and a window label of the identification window. These may be determined by acquiring the positioning variables of the four window key points of the identification window, such as positioning variables in the abnormal user behavior record or positioning variables obtained by further converting them.
In some examples, determining the distribution feature commonality score through the behavior event distribution features may include: determining an identification window commonality score of the first suspected risk behavior event and the second selected suspected risk behavior event according to the first window constraint value and the second window constraint value of the identification window of the first suspected risk behavior event and the first window constraint value and the second window constraint value of the identification window of the second selected suspected risk behavior event; and determining a key distribution commonality score of the first suspected risk behavior event and the second selected suspected risk behavior event according to the window label of the identification window of the first suspected risk behavior event and the window label of the identification window of the second selected suspected risk behavior event.
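As a minimal sketch of the distance-based similarity, the code below assumes that the risk behavior decision knowledge of an event is available as a numeric vector and that the distance d is mapped to a similarity by 1 / (1 + d). The source does not specify the mapping, so this choice and the function names are assumptions.

```python
import math
from typing import Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two knowledge vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def vector_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Assumed mapping: identical vectors give 1.0, distant vectors approach 0."""
    return 1.0 / (1.0 + euclidean_distance(u, v))


print(vector_similarity([0.2, 0.7, 0.1], [0.2, 0.7, 0.1]))
print(vector_similarity([0.0, 0.0], [3.0, 4.0]))
```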
Under some independent design ideas, where the distribution feature commonality score comprises the identification window commonality score and the key distribution commonality score, the distribution feature commonality score may be determined by combining the identification window commonality score and the key distribution commonality score with their respective weight values.
Further, a distinguishing label commonality score of the first suspected risk behavior event and the second selected suspected risk behavior event can be determined according to the behavior event distinguishing label of the first suspected risk behavior event in the first abnormal user behavior record and the behavior event distinguishing label of the second selected suspected risk behavior event in the second abnormal user behavior record.
In some examples, whether the first suspected risk behavior event and the second selected suspected risk behavior event are the same or similar suspected risk behavior events may be judged based on the behavior event distinguishing labels. Further, the suspected risk behavior event category of the first suspected risk behavior event may be compared with that of the second selected suspected risk behavior event. If the two events belong to the same or similar suspected risk behavior events, the comparison result may be represented by "Y"; if they do not, the comparison result may be represented by "Y0". The values of Y and Y0 can be flexibly adjusted according to the actual situation.
It may be appreciated that, after the vector similarity, the distribution feature commonality score and the distinguishing label commonality score are obtained, the behavior feature commonality score of the first suspected risk behavior event and the second selected suspected risk behavior event may be determined according to the first importance corresponding to the vector similarity, the second importance corresponding to the distribution feature commonality score and the third importance corresponding to the distinguishing label commonality score. In other words, each of the three scores is multiplied by its corresponding importance, and the products are combined into a weighted result that serves as the final behavior feature commonality score of the first suspected risk behavior event and the second selected suspected risk behavior event.
It may be appreciated that, after the final behavior feature commonality score of the first suspected risk behavior event and the second selected suspected risk behavior event is obtained, whether the two events are the same selected suspected risk behavior event may be determined according to that score. For example, when the final behavior feature commonality score is greater than a set determination value, the first suspected risk behavior event and the second selected suspected risk behavior event are considered to be the same selected suspected risk behavior event, and the first suspected risk behavior event and the second selected suspected risk behavior event corresponding to the same selected suspected risk behavior event are determined as the to-be-processed suspected risk behavior events.
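One way to read this combination is as a linear weighting. The formula below is a sketch under that assumption; the symbols are introduced here for illustration and do not appear in the source.

```latex
S_{\mathrm{dist}} = w_{\mathrm{win}}\, S_{\mathrm{win}} + w_{\mathrm{key}}\, S_{\mathrm{key}},
\qquad w_{\mathrm{win}} \ge 0,\; w_{\mathrm{key}} \ge 0
```

Here S_win is the identification window commonality score, S_key is the key distribution commonality score, and w_win and w_key are their weight values. With the hypothetical values w_win = 0.6, w_key = 0.4, S_win = 0.9 and S_key = 0.8, the distribution feature commonality score would be 0.54 + 0.32 = 0.86.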
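The weighting and the threshold decision can be sketched as follows, assuming that the weighted result is a plain weighted sum of the three scores. The importance values, the determination value and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Importance:
    vector: float        # first importance (vector similarity)
    distribution: float  # second importance (distribution feature commonality score)
    label: float         # third importance (distinguishing label commonality score)


def behavior_feature_commonality(
    vector_similarity: float,
    distribution_score: float,
    label_score: float,
    importance: Importance,
) -> float:
    """Weighted sum of the three partial scores (assumed form of the weighting)."""
    return (importance.vector * vector_similarity
            + importance.distribution * distribution_score
            + importance.label * label_score)


def is_same_selected_event(final_score: float, determination_value: float) -> bool:
    """Two events are treated as the same selected event above the set value."""
    return final_score > determination_value


weights = Importance(vector=0.5, distribution=0.3, label=0.2)
final_score = behavior_feature_commonality(0.8, 0.9, 1.0, weights)
print(final_score, is_same_selected_event(final_score, determination_value=0.85))
```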
Under some independent design ideas, for step 201, determining, for each first suspected risk behavior event, at least one second selected suspected risk behavior event from the second abnormal user behavior record according to at least one class of target knowledge attribute among the risk behavior decision knowledge, the behavior event distribution features and the behavior event distinguishing labels may include the following steps 2011-2013.
Step 2011, on the basis that the difference between the acquisition moments of the first abnormal user behavior record and the second abnormal user behavior record is smaller than a set time difference determination value, determining a distribution feature commonality score of the first suspected risk behavior event and each second suspected risk behavior event according to the behavior event distribution feature of the first suspected risk behavior event in the first abnormal user behavior record and the behavior event distribution feature of each second suspected risk behavior event in the second abnormal user behavior record.
In some examples, if the difference value between the time points of the acquisition of the first abnormal user behavior record and the second abnormal user behavior record is smaller than the set time difference determination value, it may be determined that distribution characteristics of the first suspected risk behavior event and the second suspected risk behavior event belonging to the same selected suspected risk behavior event in the two abnormal user behavior records are similar, based on which a distribution characteristic commonality score of the first suspected risk behavior event and each second suspected risk behavior event may be determined first.
Step 2012, determining whether the distribution feature commonality score of the first suspected risk behavior event and each second suspected risk behavior event meets a first set requirement.
Further, the first setting requirement may be a value of 0.9, 0.93, or 0.95, etc.
Step 2013, determining a second suspected risk behavior event corresponding to a distribution feature commonality score meeting the first set requirement as the second selected suspected risk behavior event.
It may be appreciated that, if the distribution feature commonality score of the first suspected risk behavior event and a second suspected risk behavior event meets the first set requirement, the two suspected risk behavior events are considered to be the same suspected risk behavior event or similar suspected risk behavior events. On this basis, the second suspected risk behavior event corresponding to the distribution feature commonality score meeting the first set requirement may be determined as a second selected suspected risk behavior event, so that the number of event associations is reduced in advance. In other words, in the subsequent event association and determination of the behavior feature commonality score, the second suspected risk behavior events other than the second selected suspected risk behavior events need not be considered, which makes efficient event association easier to achieve.
In some independent embodiments, for step 201, the number of event associations may also be reduced based on the behavior event distinguishing labels, illustratively through the following steps 201a-201c.
Step 201a, for each first suspected risk behavior event, determining a distinguishing label commonality score between the first suspected risk behavior event and each second suspected risk behavior event according to the behavior event distinguishing label of the first suspected risk behavior event and the behavior event distinguishing label of each second suspected risk behavior event.
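Steps 2011-2013 can be sketched as a filter over the second suspected risk behavior events. The sketch assumes that "meets the first set requirement" means the score is at least the configured value, and that no pruning is applied when the two records are too far apart in time; the source leaves that case open, and all names and values here are hypothetical.

```python
from typing import Callable, List, Sequence


def select_by_distribution(
    first_event: str,
    second_events: Sequence[str],
    distribution_score: Callable[[str, str], float],
    time_gap: float,
    max_time_gap: float,
    first_requirement: float = 0.9,
) -> List[str]:
    """Keep only second events whose distribution feature score meets the requirement."""
    if time_gap >= max_time_gap:
        # Assumption: records far apart in time are not pruned by distribution.
        return list(second_events)
    return [second for second in second_events
            if distribution_score(first_event, second) >= first_requirement]


# Hypothetical distribution feature commonality scores.
toy_distribution = {("case_a1", "case_b1"): 0.95,
                    ("case_a1", "case_b2"): 0.40,
                    ("case_a1", "case_b3"): 0.91}
candidates = select_by_distribution(
    "case_a1", ["case_b1", "case_b2", "case_b3"],
    lambda a, b: toy_distribution[(a, b)],
    time_gap=2.0, max_time_gap=5.0)
print(candidates)
```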
Step 201b, determining whether the distinguishing label commonality score between the first suspected risk behavior event and each second suspected risk behavior event meets a second set requirement.
Further, the second set requirement may be 0.92, 0.95, 0.97, or the like.
Step 201c, determining a second suspected risk behavior event corresponding to a distinguishing label commonality score meeting the second set requirement as the second selected suspected risk behavior event.
For example, suppose the first suspected risk behavior event is a trojan horse attack and the second suspected risk behavior events include a fragment attack, a denial of service attack and a DDOS attack. During event association, since the distinguishing label commonality score between the trojan horse attack and the DDOS attack is small, the DDOS attack is not determined as a second selected suspected risk behavior event, and no behavior feature commonality score needs to be determined for it. The fragment attack among the second suspected risk behavior events, being a suspected risk behavior event similar to the trojan horse attack, may be determined as a second selected suspected risk behavior event.
Therefore, if the distinguishing label commonality score of the first suspected risk behavior event and a second suspected risk behavior event meets the second set requirement, the two suspected risk behavior events are considered to be the same suspected risk behavior event or similar suspected risk behavior events. In this case, the second suspected risk behavior event corresponding to the distinguishing label commonality score meeting the second set requirement can be determined as a second selected suspected risk behavior event, so that the number of event associations is reduced in advance. In other words, in the subsequent event association and determination of the behavior feature commonality score, the second suspected risk behavior events other than the second selected suspected risk behavior events are not considered, which improves the timeliness of determining the second selected suspected risk behavior events.
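Steps 201a-201c can be sketched with a two-valued label score. The sketch assumes that the score takes a value Y for the same or similar labels and Y0 otherwise, and that similarity is looked up in a table; the table below is purely illustrative (the source states only that the fragment attack is treated as similar to the trojan horse attack and that the DDOS attack scores low), and all names are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Set


def label_commonality(label_a: str, label_b: str,
                      similar: Set[FrozenSet[str]],
                      y: float = 1.0, y0: float = 0.0) -> float:
    """Return y for identical or similar labels and y0 otherwise."""
    if label_a == label_b or frozenset((label_a, label_b)) in similar:
        return y
    return y0


def select_by_label(first_label: str, second_labels: Sequence[str],
                    similar: Set[FrozenSet[str]],
                    second_requirement: float = 0.92) -> List[str]:
    """Keep second events whose label score meets the second set requirement."""
    return [label for label in second_labels
            if label_commonality(first_label, label, similar) >= second_requirement]


# Illustrative similarity table; not taken from the source.
SIMILAR = {frozenset(("trojan horse attack", "fragment attack"))}
selected = select_by_label(
    "trojan horse attack",
    ["fragment attack", "denial of service attack", "DDOS attack"],
    SIMILAR)
print(selected)
```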
Under some independent design considerations, for step 1033, when determining the first suspected risk performance event and the second suspected risk performance event corresponding to the same selected suspected risk performance event in combination with the performance feature commonality score, the following steps 10331-10332 may be included.
Step 10331, determining whether the association between the first suspected risk behavior event and each second suspected risk behavior event is completed according to the behavior feature commonality score and the first event association indication.
Step 10332, determining the first suspected risk behavior event and the second suspected risk behavior event that are associated with each other as the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event.
Further, the first event association indication may be a combinatorial optimization algorithm that solves the task allocation (assignment) problem in polynomial time. As another example, binary regression analysis and a combinatorial optimization algorithm may be combined to implement the event association of the first suspected risk behavior event with each second suspected risk behavior event.
In some examples, when there is exactly one first suspected risk behavior event and exactly one second suspected risk behavior event, whether the association is completed may be determined according to the behavior feature commonality score between the two events; if the association is completed, the first suspected risk behavior event and the second suspected risk behavior event may be determined to be the same selected suspected risk behavior event.
Further, for each first suspected risk behavior event, if the second abnormal user behavior record contains a plurality of second suspected risk behavior events similar to the first suspected risk behavior event, the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event may be determined as follows. In other words, for step 10332, determining the associated first suspected risk behavior event and second suspected risk behavior event as the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event may include the following steps 103321-103322.
Step 103321, determining the associated first suspected risk behavior event and second suspected risk behavior event as a suspected risk behavior event doublet.
Step 103322, determining a selected suspected risk behavior event doublet from a plurality of suspected risk behavior event doublets according to a second event association indication on the basis that the number of the suspected risk behavior event doublets is two or more, and determining a first suspected risk behavior event and the second suspected risk behavior event in the selected suspected risk behavior event doublet as the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event.
In the embodiment of the invention, the second event association indication can be understood as a combined optimization algorithm for solving the task allocation problem in polynomial time.
Further, the description is given by taking each suspected risk behavior event in the first abnormal user behavior record1 and the second abnormal user behavior record2 as an example. For example, the first abnormal user behavior record1 includes a suspected risk behavior event case_a1, a suspected risk behavior event case_a2, a suspected risk behavior event case_a3, and a suspected risk behavior event case_a4, and the second abnormal user behavior record2 includes a suspected risk behavior event case_b1, a suspected risk behavior event case_b2, a suspected risk behavior event case_b3, and a suspected risk behavior event case_b4, where the suspected risk behavior event doublet includes case_a1-case_b1, case_a1-case_b2, case_a2-case_b3, case_a3-case_b1, case_a3-case_b2, and case_a4-case_b3. In other words, the reference members are case_a1, case_a2, case_a3, case_a4, case_b1, case_b2, case_b3, and case_b4, and according to the behavior feature commonality score between two suspected risk events in each suspected risk event doublet, a numerical match is performed for each reference member (for example, 0.9 is allocated to case_a1, case_a2, case_a3, 0.2 is allocated to case_a4, and 0 is allocated to case_b1, case_b2, case_b3, case_b4), and according to the second event association indication, the selected suspected risk event doublet may finally be determined to include: the case_a1-case_b1, the case_a2-case_b3 and the case_a3-case_b2, so that the case_a1 and the case_b1 are the same suspected risk behavior event, the case_a2 and the case_b3 are the same suspected risk behavior event and the case_a3 and the case_b2 are the same suspected risk behavior event.
In the embodiment of the invention, where there are two or more suspected risk behavior event doublets, the selected suspected risk behavior event doublets are determined according to the second event association indication. In other words, the doublets with the highest association can be selected from the plurality of suspected risk behavior event doublets, which improves the accuracy of determining the first suspected risk behavior event and the second suspected risk behavior event that correspond to the same selected suspected risk behavior event.
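The doublet selection in the example above can be illustrated with a small sketch. The description of the second event association indication is consistent with, for example, the Hungarian (Kuhn-Munkres) algorithm, but the text does not name a specific algorithm, so that reading is an assumption. For brevity the sketch below does not implement a polynomial-time method; it uses an exhaustive backtracking search, which finds the same maximum-total-score one-to-one selection on an input this small. The commonality scores in `SCORES` are hypothetical values chosen only so that the largest score of each first-record event matches the values 0.9, 0.9, 0.9 and 0.2 mentioned in the example; the function name `select_doublets` is likewise illustrative.

```python
from typing import Dict, List, Set, Tuple

Doublet = Tuple[str, str]

# Hypothetical behavior feature commonality scores for the six doublets
# listed in the example (the source does not give per-doublet scores).
SCORES: Dict[Doublet, float] = {
    ("case_a1", "case_b1"): 0.9,
    ("case_a1", "case_b2"): 0.6,
    ("case_a2", "case_b3"): 0.9,
    ("case_a3", "case_b1"): 0.5,
    ("case_a3", "case_b2"): 0.9,
    ("case_a4", "case_b3"): 0.2,
}


def select_doublets(scores: Dict[Doublet, float]) -> List[Doublet]:
    """Return the one-to-one subset of doublets with the highest total score.

    Exhaustive backtracking search: exponential in the worst case, so it is
    only a stand-in for a polynomial-time assignment algorithm.
    """
    firsts = sorted({a for a, _ in scores})
    best_total = -1.0
    best: List[Doublet] = []

    def search(i: int, used: Set[str], chosen: List[Doublet], total: float) -> None:
        nonlocal best_total, best
        if i == len(firsts):
            if total > best_total:
                best_total, best = total, list(chosen)
            return
        a = firsts[i]
        # Option 1: leave this first-record event unmatched.
        search(i + 1, used, chosen, total)
        # Option 2: pair it with any second-record event that is still free.
        for (x, b), score in scores.items():
            if x == a and b not in used:
                used.add(b)
                chosen.append((a, b))
                search(i + 1, used, chosen, total + score)
                chosen.pop()
                used.remove(b)

    search(0, set(), [], 0.0)
    return sorted(best)
```

With the hypothetical scores above, `select_doublets(SCORES)` returns case_a1-case_b1, case_a2-case_b3 and case_a3-case_b2, and case_a4 stays unmatched because case_b3 is better used by case_a2. For larger inputs a polynomial-time solver would be the natural replacement for the search.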
The following describes another data risk identification method fusing expert model decisions provided by an embodiment of the invention. It differs from the data risk identification method described above in that the method for processing suspected risk behavior events further includes step 105.
Step 105, determining a risk hidden danger prediction report of the selected suspected risk behavior event according to the risk state update relationship network of the selected suspected risk behavior event.
Further, the risk hidden danger prediction report may include: hidden danger topics, suspected risk behavior event topics, and suspected risk behavior event influence ranges.
It can be appreciated that once the risk state update relationship network of the selected suspected risk behavior event has been obtained, a risk hidden danger prediction report can be determined from it, and targeted risk hidden danger protection, such as deploying protection measures in advance, can then be performed on the basis of that report. Accordingly, for some design ideas that can be implemented independently, after the risk hidden danger prediction report of the selected suspected risk behavior event is determined according to the risk state update relationship network as described in step 105, the method may further include: determining a risk behavior attack intention by using the risk hidden danger prediction report; and generating a data risk protection scheme according to the risk behavior attack intention.
For example, the risk behavior attack intention can reflect the specific form in which a risk hidden danger manifests itself, or the link at which a risk behavior attack may occur. Based on that form or link, a data risk protection scheme can be tailored and deployed in advance, which improves the accuracy and reliability of subsequent data risk protection.
For some design ideas that can be implemented independently, determining the risk behavior attack intention by using the risk hidden danger prediction report may include the following: acquiring a risk hidden danger event set from the risk hidden danger prediction report; performing linkage attack intention analysis and non-linkage attack intention analysis, respectively, on a plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain a linkage attack intention analysis feature set and a non-linkage attack intention analysis feature set; performing first feature screening on the linkage attack intention analysis feature set through a first feature optimization algorithm to obtain a first attack intention relation network including linkage attack intentions; performing second feature screening on the non-linkage attack intention analysis feature set through a second feature optimization algorithm to obtain a second attack intention relation network including non-linkage attack intentions; sorting based on the first attack intention relation network and the second attack intention relation network to obtain a target attack intention relation network matching the target attack intention in the risk hidden danger event set, where the target attack intention includes at least one of a linkage attack intention and a non-linkage attack intention; and determining the risk behavior attack intention corresponding to the risk hidden danger event set based on the target attack intention relation network. For example, the set relationship network unit with the highest attack intention heat can be identified through the target attack intention relation network and used as the risk behavior attack intention.
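The flow described above can be sketched in a few lines. This is only an illustrative sketch under several assumptions: the `IntentFeature` record, its `linked` flag and its numeric `heat` score are hypothetical, and the two threshold filters merely stand in for the first and second feature optimization algorithms, which the description does not specify. The relation networks are reduced to plain lists for simplicity.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class IntentFeature:
    event_id: str   # risk hidden danger prediction event the feature came from
    intent: str     # candidate attack intention label (hypothetical)
    heat: float     # hypothetical "attack intention heat" score
    linked: bool    # True: linkage attack intention, False: non-linkage


def screen(features: Iterable[IntentFeature], min_heat: float) -> List[IntentFeature]:
    """Placeholder feature screening: keep features at or above a heat threshold."""
    return [f for f in features if f.heat >= min_heat]


def determine_attack_intent(
    features: List[IntentFeature],
    linked_min_heat: float = 0.5,     # assumed threshold for the first screening
    unlinked_min_heat: float = 0.7,   # assumed threshold for the second screening
) -> Optional[str]:
    # Split the analysis results into the linkage and non-linkage feature sets.
    linked = [f for f in features if f.linked]
    unlinked = [f for f in features if not f.linked]

    # First and second feature screening, one per feature set.
    first_network = screen(linked, linked_min_heat)
    second_network = screen(unlinked, unlinked_min_heat)

    # Sort the combined result so the unit with the highest heat comes first.
    target_network = sorted(
        first_network + second_network, key=lambda f: f.heat, reverse=True
    )
    return target_network[0].intent if target_network else None
```

The function returns `None` when no feature survives screening, which leaves it to the caller to decide how an empty target network should be handled.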
For some design ideas that can be implemented independently, performing linkage attack intention analysis and non-linkage attack intention analysis, respectively, on the plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain the linkage attack intention analysis feature set and the non-linkage attack intention analysis feature set includes: performing linkage attack intention analysis on each of the plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain the linkage attack intention analysis windows in each risk hidden danger prediction event and the initial attack intention type corresponding to each linkage attack intention analysis window; determining the linkage attack intention analysis feature set based on the linkage attack intention analysis windows and corresponding initial attack intention types in each risk hidden danger prediction event; and performing non-linkage attack intention analysis on each of the plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain the non-linkage attack intention analysis feature set. With this design, the linkage attack intention analysis feature set and the non-linkage attack intention analysis feature set can be determined accurately and completely.
For some design ideas that can be implemented independently, performing non-linkage attack intention analysis on each of the plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain the non-linkage attack intention analysis feature set includes: performing hidden danger topic identification on each of the plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain a hidden danger topic identification result for each risk hidden danger prediction event; performing hidden danger performance identification on each of the plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain a hidden danger performance identification result for each risk hidden danger prediction event; associating the hidden danger topic identification result and the hidden danger performance identification result that correspond to the same event; and performing non-linkage attack intention analysis based on the hidden danger performance identification results associated with the target hidden danger topic identification result in the risk hidden danger event set to obtain the non-linkage attack intention analysis feature set.
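The association step in this design idea is essentially a join on the event, followed by a filter on the target topic. A minimal sketch follows, assuming that each identification result can be represented as a string keyed by a hypothetical event identifier; the function name `non_linkage_feature_set` and the dictionary layout are illustrative and are not taken from the description.

```python
from typing import Dict, List


def non_linkage_feature_set(
    topics: Dict[str, str],        # event_id -> hidden danger topic identification result
    performances: Dict[str, str],  # event_id -> hidden danger performance identification result
    target_topic: str,             # target hidden danger topic identification result
) -> List[Dict[str, str]]:
    """Associate both results per event and keep events with the target topic."""
    features: List[Dict[str, str]] = []
    for event_id, topic in topics.items():
        # Only events that have both results can be associated.
        if event_id not in performances:
            continue
        # Keep the performance results tied to the target topic.
        if topic == target_topic:
            features.append(
                {
                    "event_id": event_id,
                    "topic": topic,
                    "performance": performances[event_id],
                }
            )
    return features
```

In this sketch the returned records serve as the non-linkage attack intention analysis feature set; any further analysis of the performance results is left out because the description does not detail it.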
For some design ideas that can be implemented independently, performing the first feature screening on the linkage attack intention analysis feature set through the first feature optimization algorithm to obtain the first attack intention relation network including linkage attack intentions includes: performing attack intention category sampling on each risk hidden danger prediction event in the linkage attack intention analysis feature set to obtain a single attack intention category corresponding to each risk hidden danger prediction event; performing window fine screening in each risk hidden danger prediction event based on the scale of the linkage attack intention analysis windows corresponding to that event's single attack intention category, to obtain a fine-screened linkage attack intention analysis feature set; continuously sampling the updated (fine-screened) linkage attack intention analysis feature set to obtain a plurality of first standby attack intention relation networks including linkage attack intentions; and, according to the category to which each first standby attack intention relation network belongs, performing relation network optimization on the first standby attack intention relation networks belonging to the same category to obtain the first attack intention relation network including linkage attack intentions.
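One possible reading of these screening steps is sketched below. The description does not define how category sampling, window scale or relation network optimization are computed, so every concrete choice here is an assumption: the single category of an event is taken to be its most frequent window category, fine screening keeps windows whose hypothetical `scale` reaches `min_scale`, and the optimization of same-category standby networks is reduced to grouping the surviving windows by category.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Window:
    event_id: str   # risk hidden danger prediction event containing the window
    category: str   # initial attack intention category of the window
    scale: int      # hypothetical size of the linkage analysis window


def first_intent_network(windows: List[Window], min_scale: int = 2) -> Dict[str, List[Window]]:
    # Group the linkage analysis windows by the event they belong to.
    by_event: Dict[str, List[Window]] = defaultdict(list)
    for w in windows:
        by_event[w.event_id].append(w)

    network: Dict[str, List[Window]] = defaultdict(list)
    for ws in by_event.values():
        # Category sampling (assumed): keep the most frequent category per event.
        category = Counter(w.category for w in ws).most_common(1)[0][0]
        # Window fine screening (assumed): drop windows below the minimum scale.
        kept = [w for w in ws if w.category == category and w.scale >= min_scale]
        # Windows of the same category from different events are merged.
        network[category].extend(kept)

    # Discard categories for which no window survived the screening.
    return {c: ws for c, ws in network.items() if ws}
```

The result maps each surviving category to its merged windows, which plays the role of the first attack intention relation network in this simplified picture.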
Based on the same inventive concept, Fig. 2 shows a block diagram of a data risk identification device fusing expert model decisions according to an embodiment of the invention. The device may include the following modules for implementing the relevant method steps shown in Fig. 1.
A data acquisition module 21, configured to: acquire abnormal business activity big data, and screen at least two abnormal user behavior records from the abnormal business activity big data according to a set screening instruction.
A knowledge mining module 22, configured to: mine risk behavior decision knowledge of each suspected risk behavior event in each abnormal user behavior record, and determine the behavior event distribution characteristics and behavior event distinguishing label of each suspected risk behavior event in each abnormal user behavior record.
An event analysis module 23, configured to: determine the to-be-processed suspected risk behavior events corresponding to the same selected suspected risk behavior event in the at least two abnormal user behavior records according to the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing labels of each abnormal user behavior record.
A feature analysis module 24, configured to: determine a risk state update relationship network of the selected suspected risk behavior event according to the behavior event distribution characteristics of the to-be-processed suspected risk behavior events that correspond to the selected suspected risk behavior event in the at least two abnormal user behavior records.
The embodiments of the invention can achieve the following technical effects. The risk behavior decision knowledge, behavior event distribution characteristics and behavior event distinguishing labels of the suspected risk behavior events are mined from the abnormal user behavior records; the plurality of suspected risk behavior events corresponding to the same selected suspected risk behavior event are determined; and the risk state update relationship network of that selected suspected risk behavior event is determined. In other words, the three knowledge attributes are analyzed in linkage, and the risk state update relationship network of the selected suspected risk behavior event is determined from them. This improves both the accuracy of determining whether a suspected risk behavior event carries a risk hidden danger and the richness of the risk state update relationship network of the selected suspected risk behavior event, thereby helping to ensure the accuracy and reliability of data risk identification.
The foregoing is only a specific embodiment of the present invention. Variations and alternatives will occur to those skilled in the art based on the detailed description provided herein and are intended to be included within the scope of the invention.

Claims (9)

1. A data risk identification method integrating expert model decisions, which is applied to an artificial intelligence server, the method comprising:
acquiring abnormal business activity big data, and screening at least two abnormal user behavior records from the abnormal business activity big data according to a set screening instruction; mining risk behavior decision knowledge of each suspected risk behavior event in each abnormal user behavior record, and respectively determining behavior event distribution characteristics and behavior event distinguishing labels of each suspected risk behavior event in each abnormal user behavior record;
determining to-be-processed suspected risk behavior events corresponding to the same selected suspected risk behavior event in the at least two abnormal user behavior records according to risk behavior decision knowledge, behavior event distribution characteristics and behavior event distinguishing labels of each abnormal user behavior record; determining a risk state updating relation network of the selected suspected risk behavior event according to behavior event distribution characteristics of the corresponding suspected risk behavior events to be processed in the at least two abnormal user behavior records; the risk state updating relationship network is an event state change of a selected suspected risk behavior event, and is represented in an ordered knowledge graph form, and is used for integrating risk state features of the selected suspected risk behavior event in series and determining a change condition of the selected suspected risk behavior event based on an overall level, wherein the change condition comprises an operation behavior feature or an interaction activity feature change of a network fraud event in each link;
wherein, after determining the risk state update relationship network of the selected suspected risk behavior event according to the behavior event distribution characteristics of the to-be-processed suspected risk behavior events corresponding to the selected suspected risk behavior event in the at least two abnormal user behavior records, the method further includes: determining a risk hidden danger prediction report of the selected suspected risk behavior event according to the risk state update relationship network of the selected suspected risk behavior event;
wherein, after determining the risk hidden danger prediction report of the selected suspected risk behavior event, the method further includes: determining risk behavior attack intention by using the risk hidden danger prediction report; generating a data risk protection scheme through the risk behavior attack intention;
wherein the determining risk behavior attack intention by using the risk hidden danger prediction report includes: acquiring a risk hidden danger event set in the risk hidden danger prediction report; respectively carrying out linkage attack intention analysis and non-linkage attack intention analysis on a plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain a linkage attack intention analysis feature set and a non-linkage attack intention analysis feature set; performing first feature screening on the linkage attack intention analysis feature set through a first feature optimization algorithm to obtain a first attack intention relation network comprising linkage attack intention; performing second feature screening on the non-linkage attack intention analysis feature set through a second feature optimization algorithm to obtain a second attack intention relation network comprising non-linkage attack intention; sorting based on the first attack intention relation network and the second attack intention relation network to obtain a target attack intention relation network matched with the target attack intention in the risk hidden danger event set; the target attack intention comprises at least one of linkage attack intention and non-linkage attack intention; determining the risk behavior attack intention corresponding to the risk hidden danger event set based on the target attack intention relation network;
wherein the respectively carrying out linkage attack intention analysis and non-linkage attack intention analysis on a plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain a linkage attack intention analysis feature set and a non-linkage attack intention analysis feature set includes: respectively carrying out linkage attack intention analysis on a plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain linkage attack intention analysis windows in each risk hidden danger prediction event and initial attack intention types corresponding to each linkage attack intention analysis window; determining the linkage attack intention analysis feature set based on the linkage attack intention analysis windows and corresponding initial attack intention types in each risk hidden danger prediction event; and respectively carrying out non-linkage attack intention analysis on a plurality of risk hidden danger prediction events in the risk hidden danger event set to obtain the non-linkage attack intention analysis feature set.
2. The method of claim 1, wherein the at least two abnormal user behavior records comprise a first abnormal user behavior record and a second abnormal user behavior record, the first abnormal user behavior record and the second abnormal user behavior record being abnormal user behavior records having a timing relationship;
the determining the to-be-processed suspected risk behavior events corresponding to the same selected suspected risk behavior event in the at least two abnormal user behavior records according to the risk behavior decision knowledge, the behavior event distribution characteristics and the behavior event distinguishing label of each abnormal user behavior record includes:
determining at least one first suspected risk behavior event in the first abnormal user behavior record and at least one second suspected risk behavior event in the second abnormal user behavior record;
for each first suspected risk behavior event, determining a behavior feature commonality score between the first suspected risk behavior event and each second suspected risk behavior event according to behavior event distribution features, behavior event distinguishing labels and risk behavior decision knowledge of the first suspected risk behavior event in the first abnormal user behavior record and behavior event distribution features, behavior event distinguishing labels and risk behavior decision knowledge of each second suspected risk behavior event in the second abnormal user behavior record;
and determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event by combining the behavior feature commonality scores, and determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event as the to-be-processed suspected risk behavior event.
3. The method of claim 2, wherein the determining, for each first suspected risk behavior event, a behavior feature commonality score between the first suspected risk behavior event and each second suspected risk behavior event according to behavior event distribution features, behavior event distinguishing labels and risk behavior decision knowledge of the first suspected risk behavior event in the first abnormal user behavior record and behavior event distribution features, behavior event distinguishing labels and risk behavior decision knowledge of each second suspected risk behavior event in the second abnormal user behavior record includes: for each first suspected risk behavior event, determining at least one second selected suspected risk behavior event from the second abnormal user behavior record according to at least one type of target knowledge attribute among the risk behavior decision knowledge, the behavior event distribution features and the behavior event distinguishing label; determining a behavior feature commonality score between the first suspected risk behavior event and the second selected suspected risk behavior event according to the risk behavior decision knowledge, the behavior event distribution features and the behavior event distinguishing label;
the step of determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event by combining the behavior feature commonality scores, and determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event as the to-be-processed suspected risk behavior event, includes: determining the first suspected risk behavior event and the second selected suspected risk behavior event corresponding to the same selected suspected risk behavior event by combining the behavior feature commonality scores between the first suspected risk behavior event and the second selected suspected risk behavior event, and determining the first suspected risk behavior event and the second selected suspected risk behavior event corresponding to the same selected suspected risk behavior event as the to-be-processed suspected risk behavior event.
4. The method of claim 3, wherein the determining, for each first suspected risk behavior event, at least one second selected suspected risk behavior event from the second abnormal user behavior record according to at least one type of target knowledge attribute among the risk behavior decision knowledge, the behavior event distribution features and the behavior event distinguishing label includes:
On the basis that the difference value of the acquisition moments of the first abnormal user behavior record and the second abnormal user behavior record is smaller than a set time difference judgment value, determining a distribution feature commonality score of the first suspected risk behavior event and each second suspected risk behavior event according to the behavior event distribution feature of the first suspected risk behavior event in the first abnormal user behavior record and the behavior event distribution feature of each second suspected risk behavior event in the second abnormal user behavior record for each first suspected risk behavior event;
determining whether the distribution feature commonality scores of the first suspected risk behavior event and each second suspected risk behavior event meet a first set requirement;
and determining a second suspected risk behavior event corresponding to the distribution feature commonality score meeting the first setting requirement as the second selected suspected risk behavior event.
5. The method of claim 3, wherein the determining, for each first suspected risk behavior event, at least one second selected suspected risk behavior event from the second abnormal user behavior record according to at least one type of target knowledge attribute among the risk behavior decision knowledge, the behavior event distribution features and the behavior event distinguishing label includes:
for each first suspected risk behavior event, determining a distinguishing label commonality score between the first suspected risk behavior event and each second suspected risk behavior event according to the behavior event distinguishing label of the first suspected risk behavior event and the behavior event distinguishing label of each second suspected risk behavior event;
determining whether the distinguishing label commonality score between the first suspected risk behavior event and each second suspected risk behavior event meets a second set requirement;
and determining a second suspected risk behavior event corresponding to a distinguishing label commonality score meeting the second set requirement as the second selected suspected risk behavior event.
6. The method of claim 2, wherein the determining the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event by combining the behavior feature commonality scores includes:
determining whether association is completed between the first suspected risk behavior event and each second suspected risk behavior event by combining the behavior feature commonality score and a first event association indication;
and determining the first suspected risk behavior event and the second suspected risk behavior event for which association is completed as the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event.
7. The method of claim 6, wherein the determining the first suspected risk behavior event and the second suspected risk behavior event for which association is completed as the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event includes:
determining the first suspected risk behavior event and the second suspected risk behavior event for which association is completed as a suspected risk behavior event doublet;
on the basis that the number of the suspected risk behavior event doublets is two or more, determining a selected suspected risk behavior event doublet from a plurality of suspected risk behavior event doublets according to a second event association indication, and determining a first suspected risk behavior event and the second suspected risk behavior event in the selected suspected risk behavior event doublet as the first suspected risk behavior event and the second suspected risk behavior event corresponding to the same selected suspected risk behavior event.
8. An artificial intelligence server, comprising: a memory and a processor; the memory is coupled to the processor; the memory is used for storing computer program codes, and the computer program codes comprise computer instructions; wherein the computer instructions, when executed by the processor, cause the artificial intelligence server to perform the method of any one of claims 1-7.
9. A computer readable storage medium, characterized in that it has stored thereon a computer program which, when run, performs the method according to any of claims 1-7.
CN202211117623.8A 2022-09-14 2022-09-14 Expert model decision-fused data risk identification method and server Active CN115438979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211117623.8A CN115438979B (en) 2022-09-14 2022-09-14 Expert model decision-fused data risk identification method and server

Publications (2)

Publication Number Publication Date
CN115438979A CN115438979A (en) 2022-12-06
CN115438979B true CN115438979B (en) 2023-06-09

Family

ID=84247576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211117623.8A Active CN115438979B (en) 2022-09-14 2022-09-14 Expert model decision-fused data risk identification method and server

Country Status (1)

Country Link
CN (1) CN115438979B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7720822B1 (en) * 2005-03-18 2010-05-18 Beyondcore, Inc. Quality management in a data-processing environment
CN110363449A (en) * 2019-07-25 2019-10-22 中国工商银行股份有限公司 A kind of Risk Identification Method, apparatus and system
CN111489098A (en) * 2020-04-17 2020-08-04 支付宝(杭州)信息技术有限公司 Suspected risk service decision method, device and processing equipment
CN113301054A (en) * 2021-06-08 2021-08-24 太仓韬信信息科技有限公司 Network security monitoring method and network security monitoring system
CN114139210A (en) * 2021-12-15 2022-03-04 智谷互联网科技(廊坊)有限公司 Big data security threat processing method and system based on intelligent service
CN114138872A (en) * 2021-12-13 2022-03-04 青岛华仁互联网络有限公司 Big data intrusion analysis method and storage medium applied to digital finance
CN114331224A (en) * 2022-03-07 2022-04-12 深圳市光子跃动科技有限公司 Real-time business wind control processing method and system based on rule engine

Also Published As

Publication number Publication date
CN115438979A (en) 2022-12-06

Similar Documents

Publication Publication Date Title
US6336109B2 (en) Method and apparatus for inducing rules from data classifiers
CN111475804A (en) Alarm prediction method and system
CN114816909B (en) Real-time log detection early warning method and system based on machine learning
CN111783442A (en) Intrusion detection method, device, server and storage medium
CN111143838B (en) Database user abnormal behavior detection method
CN114726654B (en) Data analysis method and server for coping with cloud computing network attack
CN115048370B (en) Artificial intelligence processing method for big data cleaning and big data cleaning system
Dou et al. Pc 2 a: predicting collective contextual anomalies via lstm with deep generative model
CN111047173B (en) Community credibility evaluation method based on improved D-S evidence theory
CN116823227A (en) Intelligent equipment management system and method based on Internet of things
Ranasinghe et al. Generating real-valued failure data for prognostics under the conditions of limited data availability
CN110716957B (en) Intelligent mining and analyzing method for class case suspicious objects
CN110011990A (en) Intranet security threatens intelligent analysis method
CN113722719A (en) Information generation method and artificial intelligence system for security interception big data analysis
CN113327037A (en) Model-based risk identification method and device, computer equipment and storage medium
CN115438979B (en) Expert model decision-fused data risk identification method and server
Hashemi et al. Multi-objective Optimization for Computer Security and Privacy.
US20230156043A1 (en) System and method of supporting decision-making for security management
CN116545679A (en) Industrial situation security basic framework and network attack behavior feature analysis method
CN113468540A (en) Security portrait processing method based on network security big data and network security system
Guevara et al. Intrusion detection with neural networks based on knowledge extraction by decision tree
CN115831339B (en) Medical system risk management and control pre-prediction method and system based on deep learning
Kılıç et al. Data mining and statistics in data science
Goel et al. 7 Cyber Security
Goel et al. Cyber Security Technique for Internet of Things Using Machine Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230519

Address after: 1611 Gangxinda Henggang Building, No. 5008, Longgang Avenue, Songbai Community, Henggang Street, Longgang District, Shenzhen, Guangdong 518000

Applicant after: Shenzhen Spread Technology Co.,Ltd.

Address before: No. 258, Zhengyang North Road, Longyang District, Baoshan, Yunnan 678000

Applicant before: Dai Hongli

GR01 Patent grant