CN113923037B - Anomaly detection optimization device, method and system based on trusted computing - Google Patents

Anomaly detection optimization device, method and system based on trusted computing

Info

Publication number
CN113923037B
CN113923037B (application CN202111212808.2A)
Authority
CN
China
Prior art keywords: handled, risk, abnormality detection, anomaly detection, user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111212808.2A
Other languages
Chinese (zh)
Other versions
CN113923037A (en)
Inventor
李飞
阮安邦
魏明
陈旭明
翟东雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Octa Innovations Information Technology Co Ltd
Original Assignee
Beijing Octa Innovations Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Octa Innovations Information Technology Co Ltd filed Critical Beijing Octa Innovations Information Technology Co Ltd
Priority to CN202111212808.2A
Publication of CN113923037A
Application granted
Publication of CN113923037B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/101 Access control lists [ACL]
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to a trusted-computing-based anomaly detection optimization device, method, and system. The anomaly detection optimization system comprises at least: a data acquisition module configured to acquire data information of a user associated with a risk scenario to be handled; and an anomaly detection module configured to input the acquired data information into an anomaly detection model group, perform anomaly detection judgment on it with the anomaly detection models in the group according to the anomaly detection judgment policy, and output a detection result.

Description

Anomaly detection optimization device, method and system based on trusted computing
Technical Field
The present invention relates to the field of network security technologies, and in particular, to an anomaly detection optimization apparatus, method, and system based on trusted computing.
Background
The 21st century is an era of explosive data growth. Mobile internet, social networks, e-commerce, and similar services have greatly expanded the boundaries and reach of the internet, and data of every kind is expanding rapidly. The internet (social, search, e-commerce), mobile internet (microblogs), the internet of things (sensors, smart earth), the internet of vehicles, GPS, medical imaging, security monitoring, finance (banking, stock markets, insurance), and telecommunications (calls, short messages) all produce data at a staggering rate, and this massive data carries enormous amounts of information. Data is a carrier of information, and once a data disaster occurs it may cause immeasurable loss to the user. It is therefore desirable to provide an anomaly detection method that can effectively monitor user behavior.
For example, Chinese patent publication No. CN112364286A discloses a UEBA-based anomaly detection method, apparatus, and related products. The UEBA-based anomaly detection method comprises the following steps: capturing, in real time, system operation log source data associated with user entity behaviors; inputting the captured source data into an anomaly detection model group comprising a plurality of anomaly detection models; and performing anomaly detection judgment on the captured source data with the models in the group according to an anomaly detection judgment policy, and outputting a detection result. That application can perform anomaly detection and thereby monitor user behavior effectively. However, it still has the following technical shortcomings: 1) because user entity behavior analysis, as used for anomaly detection optimization, is by positioning and specialization a means of addressing a very specific risk scenario, it cannot solve an overly broad problem, such as analyzing the behavior habits of thirty thousand users.
Before implementing user entity behavior analysis, one must first consider which specific risk scenario to solve, such as detecting e-banking credential-stuffing risk or the theft of policy information through legitimate accounts. Defining the specific risk scenario is the prerequisite for implementing user entity behavior analysis; only when the risk scenario to be solved is clearly defined can subsequent analysis be carried out in a targeted manner, yet existing user entity behavior analysis techniques lack a scheme for defining the application scenario. 2) Extensive data collection is the foundation on which user entity behavior analysis applications land. If the input data is scarce or of low quality, the final analysis result will certainly be of low value, no matter how good the system platform and model algorithms are. Yet more data is not always better for user entity behavior analysis: data unrelated to the risk scenario being analyzed is only a burden. The premise of data collection is to match the specific scenario to be analyzed, that is, to collect the data that the scenario requires. The prior art likewise has no method of classifying source data by specific risk scenario for user entity behavior analysis. Accordingly, there is a need for improvement in view of these deficiencies of the prior art.
Furthermore, because understanding among those skilled in the art may differ, and because the inventors studied numerous documents and patents in making the present invention, the text does not recite every detail of the listed prior art. This by no means implies that the invention lacks these prior-art features; rather, the invention may possess them, and the applicant reserves the right to add related prior art to the background section.
Disclosure of Invention
In view of the deficiencies of the prior art, the present invention provides a trusted-computing-based anomaly detection optimization method comprising at least the following steps: acquiring, through a data acquisition module, data information of a user associated with a risk scenario to be handled; inputting, through an anomaly detection module, the user's data information associated with the risk scenario into an anomaly detection model group comprising a plurality of anomaly detection models; and performing, with the anomaly detection models in the group, anomaly detection judgment on the acquired data information associated with the risk scenario according to an anomaly detection judgment policy, and outputting a detection result.
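The claimed pipeline of these steps can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `DetectionResult`, `AnomalyModelGroup`, and `login_hour_model`, and the "flag on first anomalous verdict" policy, are all assumptions made for the example.

```python
# Minimal sketch: acquired user data flows into a model group whose models
# apply a judgment policy and emit a detection result. All identifiers are
# illustrative, not the patent's own.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class DetectionResult:
    anomalous: bool
    detail: str

class AnomalyModelGroup:
    """Holds several anomaly detection models plus a simple judgment policy."""
    def __init__(self, models: List[Callable[[Dict], DetectionResult]]):
        self.models = models

    def detect(self, data: Dict) -> DetectionResult:
        # Judgment policy used here: report the first anomalous verdict.
        for model in self.models:
            result = model(data)
            if result.anomalous:
                return result
        return DetectionResult(False, "no anomaly detected")

def login_hour_model(data: Dict) -> DetectionResult:
    # Toy model: logins outside 06:00-23:00 count as suspicious.
    hour = data.get("login_hour", 12)
    return DetectionResult(not (6 <= hour <= 23), f"login_hour={hour}")

group = AnomalyModelGroup([login_hour_model])
print(group.detect({"login_hour": 3}))   # an off-hours login is flagged
```

A real deployment would replace the toy model with trained models and a richer judgment policy, but the acquire-input-judge-output flow matches the steps above.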
According to a preferred embodiment, the data acquisition module acquires the user's data information associated with the risk scenario to be handled as follows: a risk scenario to be handled by the anomaly detection module is entered through an input module, and the data acquisition module then acquires only the user's data information related to that risk scenario.
The input module is used for entering the risk scenario the user needs to handle, so that the risk scenario to be solved is clearly defined for subsequent user entity behavior analysis and the specific risk addressed by the user entity behavior analysis system is explicit. For example, risk scenarios include, but are not limited to: theft of sensitive data by insiders, compromised accounts, compromised hosts, data leakage, financial anti-fraud, bypassing of control behaviors, e-banking credential-stuffing risk, and theft of policy information through legitimate accounts.
Because of its positioning and specialization, user entity behavior analysis is a means of addressing a very specific risk scenario. It cannot solve an overly broad problem: analyzing the behavior habits of thirty thousand users, for example, is too broad to form a specific risk scenario to be handled and is not suited to user entity behavior analysis. Before implementing the technique, one should first consider which specific risk scenario to solve, such as e-banking credential-stuffing detection or the theft of policy information through legitimate accounts. Defining the specific risk scenario is the prerequisite for implementing user entity behavior analysis, and only when the risk scenario to be solved is clearly defined can subsequent analysis be carried out in a targeted manner. The user entity behavior analysis system therefore, by providing the input module, clearly defines the risk scenario it needs to handle and analyze.
Second, extensive data collection is the foundation on which user entity behavior analysis applications land. If the input data is scarce or of low quality, the final analysis result will certainly be of low value, no matter how good the system platform and model algorithms are; a pile of garbage data in means a pile of low-value analysis results out. Yet more data is not always better: data unrelated to the risk scenario being analyzed is only a burden. The premise of data collection is to match the specific scenario to be analyzed, that is, to ask what data the scenario requires rather than to gather a pile of data and see what can be analyzed from it. On this premise, the gist of data acquisition is high quality and many categories. The user entity behavior analysis system therefore collects, through the data acquisition module, only the user data related to the risk scenario defined via the input module, providing high-quality, multi-category data sources for the subsequent analysis.
The data acquisition module can obtain the risk scenario entered via the input module and collect the corresponding user data accordingly. Preferably, the input module can send the risk scenario entered by the user to the data acquisition module. Preferably, the risk scenario may be defined by the user, who may enter, for example, insider theft of sensitive data, compromised accounts, compromised hosts, data leakage, risk ranking, business API security, remote office security, and the like. The data acquisition module is in data connection with the input module and obtains the risk scenario to be handled that the user entered. For example, when the content entered by the user is insider theft of sensitive data, the input module defines this risk scenario as the first risk scenario. In response to the first risk scenario, the data acquisition module in data connection with the input module obtains it and collects only the user data related to it, such as database logs, call logs, user access logs, full traffic, personnel work and rest times, work locations, behavior features (such as operation frequency and peak working periods), personal features (age and affiliated institution), and so on. With this configuration, the data acquisition module collects only data related to the risk scenario to be handled and sends the collected data to the anomaly detection module, providing high-quality, multi-category analysis data sources and thereby improving the accuracy of the anomaly detection module's subsequent analysis.
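The handshake between the input module and the acquisition module described above can be sketched as follows. The class and method names (`InputModule`, `AcquisitionModule`, `on_scenario`) are hypothetical, as is the representation of collected data as category labels.

```python
# Sketch: the input module pushes the entered scenario over its data
# connection, and the acquisition module thereafter collects only matching
# data. All identifiers are illustrative assumptions.
from typing import Dict, List, Optional

class AcquisitionModule:
    def __init__(self, sources: Dict[str, List[str]]):
        self.sources = sources          # scenario -> data categories to collect
        self.active: Optional[str] = None

    def on_scenario(self, scenario: str) -> None:
        self.active = scenario          # received over the data connection

    def acquire(self) -> List[str]:
        # Only data associated with the active risk scenario is collected.
        return self.sources.get(self.active, [])

class InputModule:
    def __init__(self, acquisition: AcquisitionModule):
        self.acquisition = acquisition

    def enter(self, scenario: str) -> None:
        self.acquisition.on_scenario(scenario)

acq = AcquisitionModule({
    "insider_sensitive_data_theft": ["database logs", "call logs",
                                     "user access logs", "full traffic"],
})
InputModule(acq).enter("insider_sensitive_data_theft")
print(acq.acquire())
```

Restricting collection to the active scenario is what keeps the downstream analysis data high-quality rather than a burden of unrelated records.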
According to a preferred embodiment, the anomaly detection module comprises a whitelist generation unit and a user entity behavior analysis unit. The whitelist generation unit is configured to generate whitelists matched to different users' security requirements based on the users' application scenarios and/or the security postures monitored by the anomaly detection module. The user entity behavior analysis unit is configured at least to monitor and analyze the whitelisted processes or programs run by the user, in order to detect whether any of them is abnormal. Preferably, the anomaly detection module can also monitor the security posture of the user's server over time. Preferably, the security posture includes at least the user's system version information, and may also include updates of the applications the user uses, changes of the network the user uses, and the like. The user's server may be a personal computer, a workstation, or the like. The system version information may include basic version information and the times and intervals of version upgrades or downgrades. The applications may be the various application software used by the user. Particularly preferably, the anomaly detection module can identify the user's system version information, as well as updates of the applications and changes of the network the user uses. For example, if a user's server is gradually upgraded over time, the anomaly detection module judges the user's security posture to be benign.
When the user's security posture is benign, the whitelist generation unit is configured to treat the system-upgrade-related anomalous activity found by the user entity behavior analysis unit as a benign anomaly and add it to the original whitelist. When the user's server is gradually downgraded or remains unchanged for a long time, the anomaly detection module judges the user's security posture to be adverse. When the security posture is adverse, the whitelist generation unit treats the system-upgrade-related anomalous activity found by the user entity behavior analysis unit as a real anomaly and sends it to the anomaly detection module for early warning or alarm.
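The posture rule above can be sketched directly: a version history that rises over time is judged benign, while one that downgrades or stays flat long-term is judged adverse, turning the same upgrade-related anomaly into a whitelist addition or a real alarm. The version-tuple representation and function names are assumptions for the example.

```python
# Sketch of the posture judgment and the resulting whitelist handling.
# Version tuples and identifiers are illustrative assumptions.
from typing import List, Set, Tuple

def security_posture(versions: List[Tuple[int, ...]]) -> str:
    """versions: system version tuples in chronological order."""
    upgraded = (len(versions) >= 2
                and all(a <= b for a, b in zip(versions, versions[1:]))
                and versions[0] < versions[-1])
    return "benign" if upgraded else "adverse"

def handle_upgrade_anomaly(posture: str, whitelist: Set[str], activity: str) -> str:
    if posture == "benign":
        whitelist.add(activity)   # benign anomaly: fold into the whitelist
        return "whitelisted"
    return "alert"                # real anomaly: early warning or alarm
```

For example, a history of versions (5, 4), (5, 8), (5, 10) is judged benign, while (5, 4), (5, 4) (unchanged long-term) is judged adverse.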
According to a preferred embodiment, the anomaly detection module further comprises a whitelist database unit. The whitelist database unit is configured to collect the whitelists of a plurality of different users into a whitelist database, against which the user entity behavior analysis unit compares any whitelist it has found to be in an abnormal state, reducing the unit's false alarm rate. Preferably, the whitelist database unit obtains the trusted whitelist in the database by taking the maximal intersection of the whitelists of the plurality of different users.
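The maximal-intersection construction of the trusted whitelist reduces to a plain set intersection across users. The entry names below are invented for illustration.

```python
# The trusted whitelist is the maximal intersection of several users'
# whitelists: only entries present in every user's whitelist are trusted.
from functools import reduce
from typing import List, Set

def trusted_whitelist(whitelists: List[Set[str]]) -> Set[str]:
    """Entries present in every user's whitelist are considered trusted."""
    return reduce(set.intersection, whitelists) if whitelists else set()

users = [
    {"sshd", "nginx", "backup.sh"},
    {"sshd", "nginx", "cron"},
    {"sshd", "nginx", "backup.sh", "cron"},
]
print(trusted_whitelist(users))   # the entries common to all three users
```

The intersection is "maximal" in the sense that it is the largest set every user's whitelist agrees on; any larger set would include an entry some user does not trust.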
According to a preferred embodiment, the anomaly detection module continuously updates the whitelist as follows. If the user entity behavior analysis unit, comparing a whitelist it has found to be in an abnormal state against the whitelist database, finds that the abnormal entry falls within the range of the database, it judges the abnormality a false alarm; the whitelist generation unit then immediately updates the corresponding server's whitelist, adding the benign anomaly found by the user entity behavior analysis unit to the original whitelist. If the abnormal entry is found not to be within the range of the database, the unit judges it a real alarm and sends the abnormal whitelist activity to the anomaly detection module. If, after one or more comparisons against the database, the unit confirms the abnormal whitelist behavior as a security threat, the whitelist generation unit immediately updates the corresponding server's whitelist, deleting the abnormal entry from the original whitelist. With this configuration, the user entity behavior analysis unit continuously exchanges data with the whitelist database to revise and update the original whitelist, reducing both its false alarm rate and its missed alarm rate.
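The three update rules above can be sketched as a small triage routine: a database hit means a false alarm and a whitelist extension, a miss means a real alarm, and a confirmed threat deletes the entry. All identifiers are illustrative.

```python
# Sketch of the continuous whitelist-update rules. `database` stands for the
# whitelist database unit's trusted entries; names are illustrative.
from typing import Set

def triage(entry: str, database: Set[str], whitelist: Set[str]) -> str:
    if entry in database:
        whitelist.add(entry)      # false alarm: benign anomaly joins the whitelist
        return "false_alarm"
    return "real_alarm"           # report the abnormal activity for alerting

def confirm_threat(entry: str, whitelist: Set[str]) -> None:
    # After one or more confirming comparisons, delete the entry.
    whitelist.discard(entry)
```

The triage path lowers false alarms (database hits are absorbed), while the confirmation path lowers missed alarms (confirmed threats are purged from the whitelist so they are flagged in future).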
According to a preferred embodiment, performing anomaly detection judgment on the acquired data information associated with the risk scenario according to the anomaly detection judgment policy with the anomaly detection models in the group and outputting a detection result comprises: generating an alarm event if the detection result indicates that the data information associated with the risk scenario to be handled is abnormal.
According to a preferred embodiment, the plurality of anomaly detection models have a cascaded logical processing relationship, and the anomaly detection judgment policy is determined according to that cascaded relationship. Performing anomaly detection judgment on the acquired user data associated with the risk scenario with the models in the group according to the policy and outputting a detection result then comprises: if the output of the preceding anomaly detection model indicates that the acquired data is normal, the preceding model forwards the data to the next anomaly detection model, which performs anomaly detection judgment on it and outputs a detection result.
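The cascaded arrangement can be sketched as follows: each model sees the data only if every earlier model judged it normal, and the first anomalous verdict ends the chain. The boolean model interface is an assumption for the example.

```python
# Cascade sketch: data is forwarded model-to-model while judged normal;
# the first anomalous verdict terminates the chain.
from typing import Callable, Dict, List

def cascade_detect(models: List[Callable[[Dict], bool]], data: Dict) -> str:
    for model in models:
        if model(data):               # True means this model flags an anomaly
            return "anomalous"
        # normal: forward the same data to the next model in the cascade
    return "normal"
```

A useful property of the cascade is early exit: models after the first anomalous verdict never run, which the test below verifies by recording call order.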
According to a preferred embodiment, the plurality of anomaly detection models have a parallel logical processing relationship, and the anomaly detection judgment policy is determined according to that parallel relationship. Performing anomaly detection judgment on the acquired data information associated with the risk scenario with the models in the group according to the policy and outputting a detection result then comprises: the plurality of anomaly detection models perform anomaly detection judgment on the acquired data information associated with the risk scenario in parallel and output detection results.
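In the parallel arrangement, every model judges the same data independently and all verdicts are emitted. A thread-pool sketch (again with an assumed boolean model interface):

```python
# Parallel sketch: every model judges the same data independently; the
# verdicts are gathered together rather than chained.
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

def parallel_detect(models: List[Callable[[Dict], bool]], data: Dict) -> List[bool]:
    with ThreadPoolExecutor(max_workers=len(models) or 1) as pool:
        return list(pool.map(lambda m: m(data), models))
```

Unlike the cascade, every model always runs here, and `pool.map` preserves the model order in the returned verdict list.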
According to a preferred embodiment, a trusted-computing-based anomaly detection optimization system comprises at least: a data acquisition module configured to acquire data information of a user associated with a risk scenario to be handled; and an anomaly detection module configured to input the acquired data information into an anomaly detection model group, perform anomaly detection judgment on it with the anomaly detection models in the group according to the anomaly detection judgment policy, and output a detection result.
According to a preferred embodiment, an electronic device comprises: a memory having computer-executable instructions stored thereon, and a processor for executing the computer-executable instructions to perform the step of: establishing an anomaly detection model from the data information associated with the risk scenario to be handled and a machine-learning training model.
Drawings
Fig. 1 is a simplified schematic diagram of a preferred embodiment of the present invention.
List of reference numerals
1: a data acquisition module; 2: an anomaly detection module; 201: a white list generation unit;
202: a user entity behavior analysis unit; 203: white list database unit.
Detailed Description
The following detailed description refers to the accompanying drawings.
FIG. 1 shows a trusted-computing-based anomaly detection optimization system comprising at least a data acquisition module 1 and an anomaly detection module 2.
The data acquisition module 1 is configured to obtain the risk scenario entered via the input module and to collect the corresponding user data according to that scenario.
The abnormality detection module 2 is at least capable of acquiring the data information acquired by the data acquisition module 1.
The anomaly detection module 2 is configured to analyze the risk scenario entered via the input module together with the related data collected by the data acquisition module 1, build user portraits of the user and the information system through user entity behavior analysis, determine from those portraits whether the user or the information system exhibits abnormal activities and/or abnormal processes, and monitor and warn of the anomalies accordingly.
Preferably, the user entity includes, but is not limited to: a personal computer, a workstation, and the like.
Preferably, the input module is configured for the user to enter the risk scenario that needs to be handled.
Preferably, the risk scenario to be handled includes, but is not limited to: theft of sensitive data by insiders, compromised accounts, compromised hosts, data leakage, financial anti-fraud, bypassing of control behaviors, e-banking credential-stuffing risk, and theft of policy information through legitimate accounts.
Theft of sensitive data by insiders is a typical internal threat scenario for enterprises. Because insiders have legitimate access rights to enterprise data assets and generally know where sensitive enterprise data is stored, such behavior cannot be detected by conventional behavior auditing means.
Account compromise or account theft has long been a pain point plaguing organizations of all kinds, affecting the interests and experience of end users; privileged accounts in particular are the targets of hackers' attacks.
A compromised host is one of the typical internal threats to an enterprise: after breaking into an intranet server and turning it into a bot, an attacker often mounts lateral attacks on the enterprise network.
Data leakage can cause serious damage to an organization's brand reputation and bring enormous regulatory pressure, making it one of the security threats organizations care about most.
Risk ranking arises because a security team's human resources are limited: almost every organization faces excessive alarms and finds it difficult to process every security alert triggered by every security device. Deciding where to invest the limited, precious human resources to yield the greatest security-operations return is the value of risk ranking.
Business API security concerns the large number of business application programming interfaces (APIs) that enterprise web business systems typically expose, such as login APIs, data acquisition APIs, and business call APIs. An attacker may map out the rough range of an enterprise's business API entry points by capturing access or request data from a given website, and by maliciously calling these APIs may carry out malicious access, data theft, and other malicious activities that seriously affect the enterprise's normal business.
Remote office security concerns enterprises that generally work remotely through a VPN: although the isolation prevents outsiders from directly accessing internal resources, remote work still introduces certain security risks.
Preferably, the input module may include, but is not limited to: a keyboard, a touch screen, a microphone, a camera, and the like. Preferably, the input module can send the risk scenario to be handled, entered by the user, to the data acquisition module 1. Preferably, the risk scenario to be handled may be defined by the user; for example, the user may enter through the input module insider theft of sensitive data, compromised accounts, compromised hosts, data leakage, risk ranking, business API security, remote office security, and the like.
The data acquisition module 1, in data connection with the input module, obtains the risk scenario to be handled that the user entered through the input module. When the content entered by the user is insider theft of sensitive data, the input module defines the risk scenario as the first risk scenario. In response to the first risk scenario, the data acquisition module 1 obtains it and collects only the user data related to it, such as database logs, call logs, user access logs, full access traffic, personnel work and rest times, work locations, behavior features (such as operation frequency and peak working periods), and personal features (age and affiliated institution).
When the risk scenario entered by the user is a compromised account, the input module defines it as the second risk scenario. In response, the data acquisition module 1 obtains the second risk scenario and collects only the related user data, such as frequent logins and logouts, access to information systems or data assets never accessed before, and logins at abnormal times.
When the risk scenario entered is a compromised host, the input module defines it as the third risk scenario. In response, the data acquisition module 1 obtains the third risk scenario and collects only the related user data, such as the historical time-series fluctuation patterns of intranet hosts or servers, and features including requested domain names, account logins, traffic volume, security-zone access frequency, and the standard deviation of linked hosts.
When the risk scenario entered is data leakage, the input module defines it as the fourth risk scenario. In response, the data acquisition module 1 obtains the fourth risk scenario and collects only the related user data, such as enterprise database logs, call logs, user access logs, full access traffic, and access period, sequence, action, and frequency.
When the risk scenario entered is risk ranking, the input module defines it as the fifth risk scenario. In response, the data acquisition module 1 obtains the fifth risk scenario and collects only the related user data, such as organizational structure, asset criticality, personnel roles, and access levels.
When the risk scenario entered is business API security, the input module defines it as the sixth risk scenario. In response, the data acquisition module 1 obtains the sixth risk scenario and collects only the related user data, such as enterprise business API access frequency features, requester access frequency features, the standard deviation of parameter variation, and the day/night distribution of request times.
When the risk scenario entered is remote office security, the input module defines it as the seventh risk scenario. In response, the data acquisition module 1 obtains the seventh risk scenario and collects only the related user data, such as VPN and internal traffic logs, employee login locations, login times, online durations, network behavior, and protocol distribution.
Because of the positioning and specialization of user entity behavior analysis techniques, they are a means of addressing specific, well-defined risk scenarios. They cannot solve an overly broad problem; analyzing the behavioral habits of thirty thousand users, for example, is too broad to form a specific risk scenario to be handled and is not suitable for user entity behavior analysis. Before implementing the technique, one should first consider which specific risk scenario to solve, such as detecting credential-stuffing ("database collision") attacks against electronic banking, or the theft of policy information through legitimate accounts. Defining specific risk scenarios is a precondition for implementing user entity behavior analysis; only when the risk scenario to be solved is clearly defined can subsequent analysis work proceed in a targeted manner. Therefore, by providing the input module, the user entity behavior analysis system clearly defines the risk scenarios it needs to handle and analyze.
Secondly, extensive data collection is the basis for landing a user entity behavior analysis application. If the input data volume is small or the data quality is low, the final analysis result will inevitably be of low value, no matter how good the system platform and model algorithms are; garbage data in means low-value results out. However, more data is not always better for user entity behavior analysis, because data unrelated to the risk scenario being analyzed is only a burden. The premise of data collection is to match the specific scenario to be analyzed, that is, to ask what data is needed to analyze that scenario, rather than starting from a pile of data and seeing what can be analyzed from it. Under this premise, the gist of data acquisition is high quality and multiple categories. Therefore, by providing the data collection module 1, the user entity behavior analysis system collects only the user data information related to the risk scenario defined by the input module, so as to supply high-quality, multi-category data sources for the subsequent analysis process.
Preferably, the risk scenario to be handled can be defined by the user by means of the input module. Preferably, the input module is capable of sending the risk scenario to be handled, as input by the user through the input module, to the data acquisition module 1.
According to a preferred embodiment, the abnormality detection module 2 includes:
the sample data analysis unit is configured to acquire data information related to the risk scene to be handled, which is acquired by the data acquisition module 1, and analyze the data information to obtain key log sample data;
and the model building unit is configured to build an abnormality detection model according to the plurality of types of key log sample data and the machine learning training model.
The sample data analysis unit is further used for establishing a plurality of log templates according to the message type of the data information related to the risk scene to be handled; and according to the established log templates, analyzing and processing the data information related to the risk scene to be handled to obtain key log sample data.
Optionally, in this embodiment, the sample data parsing unit is further configured to determine a message type according to a template word and a parameter word in the data information related to the risk scenario to be handled, and establish a plurality of log templates according to the determined message type.
Specifically, the sample data analysis unit includes a message type determination subunit and a log template creation subunit, where the message type determination subunit is configured to determine the message type according to the template words and parameter words in the data information related to the risk scenario to be handled, and the log template creation subunit is configured to create a plurality of log templates according to the determined message types.
In particular, in this embodiment, a message type can be understood as a set of data information entries related to the risk scenario to be handled whose template words and parameter words share similar message characteristics; the principle is simple and easy to implement. Because the data information related to the risk scenario to be handled may be massive, determining message types by means of template words and parameter words allows a plurality of log templates to be established effectively, so that the massive data can be parsed conveniently and key log sample data obtained quickly and accurately.
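As an illustrative sketch (not part of the claimed embodiment), separating template words from parameter words and grouping messages into log templates might look as follows. The regular expressions and sample log lines are assumptions chosen for demonstration; a real deployment would tune the parameter-word patterns per log source.

```python
import re
from collections import defaultdict

# Hypothetical parameter-word patterns: IPv4 addresses (with optional port)
# and numeric/hex identifiers. Everything else is treated as a template word.
IP_RE = re.compile(r"\d+(\.\d+){3}(:\d+)?")
NUM_RE = re.compile(r"(0x)?[0-9a-fA-F]*\d[0-9a-fA-F]*")

def to_template(message: str) -> str:
    """Replace parameter words with a wildcard, keeping the constant
    template words that identify the message type."""
    out = []
    for tok in message.split():
        if IP_RE.fullmatch(tok) or NUM_RE.fullmatch(tok):
            out.append("<*>")
        else:
            out.append(tok)
    return " ".join(out)

def build_log_templates(messages):
    """Group raw log lines by extracted template: one group per message type."""
    templates = defaultdict(list)
    for msg in messages:
        templates[to_template(msg)].append(msg)
    return templates

logs = [
    "Accepted login for alice from 10.0.0.12:5120",
    "Accepted login for alice from 10.0.0.99:6022",
    "Failed password for bob from 10.0.0.7:4410",
]
templates = build_log_templates(logs)
# Two message types result: accepted logins and failed passwords.
```

Messages whose constant words coincide collapse into one template, which is exactly the "similar message characteristics" grouping described above.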
The data information collected by the data collection module 1 and related to the corresponding risk scenario to be handled concerns user entity behavior; that is, it can indirectly reflect user entity behavior.
Preferably, user entity behavior may include: time, place, person, interaction, and interaction content. Take a user search as an example: at what time, on what platform, and under which ID the search occurred, whether a search was performed, and what was searched for.
Preferably, user actions such as placing an order or clicking the registration button may be monitored by embedding monitoring code (also called instrumentation, or a "buried point") in the sample data source.
Preferably, the form of the data information related to the corresponding risk scenario to be handled, as collected by the data collection module 1, is not limited in any way; it may be, for example, a txt document or a list.
Preferably, the data information collected by the data collection module 1 and related to the corresponding risk scenario to be handled is stored on various terminals used by the user.
Preferably, considering that the data information collected by the data collection module 1 and related to the corresponding risk scenario to be handled may be a large amount of unstructured sample data, using it directly could make sample data processing inefficient and consume substantial computing power. For this reason, in this embodiment, the collected data information is preprocessed or pre-parsed into a structured form, and the structured data is used directly in the subsequent steps, which improves the efficiency of sample data processing and saves computing power.
Preferably, a series of parsing rules, such as the log keywords to parse, the parsing step length for sample data, and the format or structure of the sample data, are defined in the log template so as to parse the data information related to the corresponding risk scenario to be handled, as acquired by the data acquisition module 1, into key log sample data. Alternatively, the log template may also be called a sample data parsing model.
Preferably, since the terminals used by users differ considerably in product form, or run different operating systems, a log template is configured for each type of product form or each type of operating system.
As previously mentioned, user entity behavior typically includes five dimensions: time, place, person, interaction, and interaction content, so the key log sample data may also include these five dimensions. Likewise, the terminals on which user entity behavior occurs come in various product forms or run different operating systems, so the key log sample data also carries dimensions in these respects.
Preferably, in order to reflect user entity behavior, the key log sample data may be classified along the multiple sample data classification dimensions of step S103 to obtain several types of key log sample data; a classified key log entry is also called a Log Key.
In this embodiment, the anomaly detection model may be established by training a neural network model on the plurality of types of key log sample data. The neural network model is not particularly limited and may, for example, be an LSTM. When performing anomaly detection, the anomaly detection model may use a density-based or a distance-based approach.
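The text names an LSTM as one possible sequence model over log keys. As an executable stand-in (a deliberate simplification, not the claimed model), the same next-log-key idea can be illustrated with a frequency table: a step is flagged when the observed next log key is not among the top-g candidates predicted from the previous key. The log keys and sequences below are invented for illustration.

```python
from collections import Counter, defaultdict

def train_next_key_model(sequences):
    """Count, for each log key, how often each next key follows it in
    normal executions (a frequency-table stand-in for an LSTM)."""
    model = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            model[prev][nxt] += 1
    return model

def is_anomalous_step(model, prev, nxt, top_g=2):
    """Flag a step when the observed next key is outside the top-g
    candidates predicted for the previous key."""
    candidates = [k for k, _ in model[prev].most_common(top_g)]
    return nxt not in candidates

# Illustrative log-key sequences gathered from normal runs.
normal_runs = [["open", "read", "read", "close"]] * 10 \
            + [["open", "read", "close"]] * 5
model = train_next_key_model(normal_runs)
ok = is_anomalous_step(model, "open", "read")     # transition seen in training
bad = is_anomalous_step(model, "read", "delete")  # never-seen transition
```

An LSTM would replace the frequency table with a learned conditional distribution over the next key, but the anomaly criterion (observed key outside the predicted top-g) is the same.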
Optionally, the density-based method assumes that a normal sample data point has a density similar to that of its neighboring points, while an abnormal point's density differs greatly from its neighbors'. During anomaly detection, the density around a sample data point is therefore compared with the density around its local neighbors; the relative density of the point with respect to its neighbors is its anomaly score, and if the score exceeds a set threshold, the point is abnormal, indicating abnormal user entity behavior.
Optionally, since there are several types of key log sample data, an anomaly detection model can be established based on each type of key log sample data, so as to judge from multiple dimensions whether the data information collected by the data collection module 1 and related to the corresponding risk scenario to be handled is anomalous, thereby detecting anomalies in user entity behavior.
For example, for the first risk scenario, the anomaly detection model generates features such as sensitive-data access period, time sequence, action, and frequency from the organization's database logs, call logs, user access logs, and full access traffic, and builds dynamic baselines for sensitive-data access, per-user access, and group access through time-series correlation and a self-learning algorithm. 1) The anomaly detection model first uses outlier analysis to mine behaviorally abnormal individuals. Without any direct operation on the user's application system, it automatically selects log data over a period of time and performs outlier analysis along multiple dimensions, such as personnel work-rest times, work locations, behavioral features (e.g., operation frequency and active time periods), and personal features (age, affiliated organization), thereby uncovering users or accounts with abnormal behavior. 2) The anomaly detection model builds a behavior baseline to reveal individual query behavior. According to the user's own requirements, a baseline is built per user or account; for example, the model can define which accounts may access a business system and when, and which accounts have access rights. When the model finds that a user's daily access volume changes abruptly, it determines that the user has performed a query action. 3) Based on the suspicious behavior, the model judges the individual's abnormality; for example, it extracts abnormal-behavior information about an account's work-rest times.
4) The anomaly detection model can trace suspicious associated personnel using a relationship graph, performing association analysis on suspicious personnel, accounts, and users so as to identify associated personnel along multiple dimensions (such as organizations, applications, and content). 5) Log information is restored and suspicious personnel operations are listed: using log search over the screened list of suspicious personnel, the model traces back their query operations and finally confirms their threat behavior.
For another example, for the second risk scenario, the anomaly detection model abstracts and generalizes normal behavior and personnel using big data techniques to generate individual behavior portraits and group behavior portraits. On this basis, it analyzes whether an account's activities contain abnormal behavior, such as frequent logins and logouts, access to information systems or data assets never accessed before, or logins at abnormal times and places, and compares whether those activities deviate from the personal behavior portrait and from the group (e.g., department or project team) behavior portrait, comprehensively judging the account's suspected-theft risk score and helping the security team discover account compromise in time. The anomaly detection model provides an optimal security view for detecting compromised accounts, improves the signal-to-noise ratio of the data, can merge and reduce the alarm volume, allows security teams to prioritize ongoing threats, and facilitates response and investigation. Meanwhile, the anomaly detection model can monitor and analyze user behavior for established accounts and identify excessive privileges or abnormal access rights, and it applies to all types of users and accounts, including privileged users and service accounts. It can also help clean up dormant accounts and rights provisioned higher than needed. Through the behavior analysis of the anomaly detection model, Identity and Access Management (IAM) and Privileged Account Management (PAM) systems can assess the security of an access subject more comprehensively, supporting Zero Trust network security architectures and deployment scenarios.
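The comparison of an account's activity against its individual and group behavior portraits might be reduced to the sketch below. The single feature (login hour of day), the mean-and-spread portraits, and the 0.6/0.4 weighting are illustrative assumptions, not values from the embodiment.

```python
def profile(events):
    """Behaviour portrait as mean and spread of a numeric feature
    (here: login hour of day)."""
    mean = sum(events) / len(events)
    var = sum((e - mean) ** 2 for e in events) / len(events)
    return mean, max(var ** 0.5, 1e-6)

def deviation(value, portrait):
    """How many standard deviations a new observation sits from a portrait."""
    mean, std = portrait
    return abs(value - mean) / std

def account_theft_score(new_login_hour, own_history, group_history,
                        w_individual=0.6, w_group=0.4):
    """Suspected-compromise score: weighted deviation from the account's own
    portrait and from its peer group's portrait (weights are illustrative)."""
    return (w_individual * deviation(new_login_hour, profile(own_history))
            + w_group * deviation(new_login_hour, profile(group_history)))

own = [9, 9, 10, 9, 10, 9]        # this user normally logs in around 09:00
group = [8, 9, 9, 10, 10, 11, 9]  # the department behaves similarly
score_normal = account_theft_score(9.5, own, group)
score_odd = account_theft_score(3, own, group)   # a 03:00 login
```

A production system would combine many such features (location, asset set, volume) into one portrait, but the deviate-from-self plus deviate-from-peers structure is the one described above.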
For another example, for the third risk scenario, the anomaly detection model may construct a time-series anomaly detection model and build dynamic behavior baselines for individual servers and for groups of servers (e.g., by service type or security domain) according to the historical time-series fluctuation patterns of intranet host or server metrics and features such as requested domain names, account logins, traffic volume, frequency of access to security zones, and the standard deviation of linked hosts. Using these baselines, and taking into account typical host-compromise scenarios such as botnets, ransomware, and command and control (C&C or C2), it gives comprehensive anomaly scores for different entities under different models over different time periods, detects compromised hosts, locates the specific time period and host information in combination with asset information, and assists the enterprise in discovering compromised hosts in time and tracing the source.
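A minimal sketch of a dynamic behavior baseline built from a host metric's historical time-series fluctuation: a sliding-window mean plus or minus k standard deviations forms the band, and observations escaping the band are flagged. The window size, k, and the traffic numbers are assumptions for illustration.

```python
def dynamic_baseline(history, window=5, k=3.0):
    """Per-step dynamic baseline: (mean - k*std, mean + k*std) over a
    sliding window of the host's historical series (e.g. traffic per hour)."""
    lows, highs = [], []
    for i in range(window, len(history)):
        win = history[i - window:i]
        mean = sum(win) / window
        std = (sum((x - mean) ** 2 for x in win) / window) ** 0.5
        lows.append(mean - k * std)
        highs.append(mean + k * std)
    return lows, highs

def detect_breaches(history, window=5, k=3.0):
    """Indices where the observed value escapes the dynamic baseline band."""
    lows, highs = dynamic_baseline(history, window, k)
    return [i + window
            for i, (lo, hi) in enumerate(zip(lows, highs))
            if not (lo <= history[i + window] <= hi)]

# Hourly outbound traffic of one host, with a sudden spike at index 7.
traffic = [100, 102, 98, 101, 99, 100, 103, 900, 101, 100]
anomalies = detect_breaches(traffic)
```

Group baselines work the same way, with the window drawn from the aggregate series of a service type or security domain instead of a single host.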
For another example, for the fourth risk scenario, the anomaly detection model generates sensitive-data access features such as access period, time sequence, action, and frequency from enterprise database logs, callback logs, user access logs, and full access traffic, and generates, through time-series correlation and a self-learning algorithm, multiple detection scenarios such as a dynamic baseline of accesses to the sensitive database, a per-user access dynamic baseline, and a group access dynamic baseline.
For another example, for the fifth risk scenario, the anomaly detection model uses baselines and threat models and also builds a behavioral timeline of users and entities from the alarms generated across all security solutions, for risk aggregation. Weight assessment is typically performed in combination with organizational structure, asset criticality, personnel roles, access levels, and so on, to grade and rank the overall risk and thereby determine which users, entities, events, or potential events should be prioritized. Such risk grading and ranking can greatly alleviate the security team's manpower shortage.
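The weighted aggregation and ranking described above might be sketched as follows. The weight names (asset criticality, role, access level), their values, and the multiplicative combination are illustrative assumptions.

```python
def risk_score(alerts, weights):
    """Aggregate alert severities, then scale by contextual weights
    (asset criticality, personnel role, access level)."""
    base = sum(alerts)
    multiplier = (weights.get("asset_criticality", 1.0)
                  * weights.get("role", 1.0)
                  * weights.get("access_level", 1.0))
    return base * multiplier

# Illustrative entities: identical alerts on a DBA account and an intern
# laptop rank very differently once context weights are applied.
entities = {
    "dba_account":   ([3, 5], {"asset_criticality": 2.0, "role": 1.5, "access_level": 1.5}),
    "intern_laptop": ([3, 5], {"asset_criticality": 0.5, "role": 1.0, "access_level": 1.0}),
    "web_server":    ([2],    {"asset_criticality": 1.5, "role": 1.0, "access_level": 1.2}),
}
ranking = sorted(entities,
                 key=lambda name: risk_score(*entities[name]),
                 reverse=True)
```

The ranking, not the raw alert count, drives triage order, which is how the same alert volume can be prioritized differently across entities.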
For another example, for the sixth risk scenario, an attacker may achieve malicious API calls by varying a number of different request parameters. By analyzing the composition and usage patterns of commonly used APIs (generally the URL request parameters and request body corresponding to each API), and by extracting features such as enterprise service API access frequency, requester access frequency, parameter-variation standard deviation, and the day-night distribution of request times, the anomaly detection model constructs detection scenarios such as a dynamic baseline of API request frequency, a dynamic baseline of API request timing, and a dynamic baseline of parameter variation. Based on these dynamic baselines, it detects abnormal behaviors such as sudden changes in API request volume, periodic anomalies, unknown users, and suspicious latent group users (a single user using a large number of different IPs). Combined with the specific business attributes of the API, this realizes detection of abnormal API request behavior in WEB business systems, can locate the specific time period, business, and data information involved, and assists enterprises in discovering abnormal call behavior in time, safeguarding the security of the whole business and its data.
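One of the named features, parameter variation, can be approximated by checking how many distinct values each API parameter takes relative to the request count: an enumerating attacker pushes that ratio toward 1 for some parameter. This ratio check is a simplified sketch; the 0.8 threshold and the request shapes are assumptions.

```python
def distinct_values_per_param(requests):
    """Count distinct observed values for every API parameter name."""
    per_param = {}
    for params in requests:
        for key, value in params.items():
            per_param.setdefault(key, set()).add(value)
    return {k: len(v) for k, v in per_param.items()}

def is_enumeration_suspect(requests, threshold=0.8):
    """Flag a requester when nearly every request carries a new value for
    some parameter (distinct-value count close to the request count)."""
    counts = distinct_values_per_param(requests)
    return any(c / len(requests) >= threshold for c in counts.values())

# A normal client reuses its own user_id and cycles a few pages;
# an attacker enumerates user_id across requests.
normal = [{"user_id": "42", "page": str(i % 3)} for i in range(20)]
attack = [{"user_id": str(i), "page": "1"} for i in range(20)]
```

A fuller implementation would maintain this statistic per requester as a dynamic baseline and combine it with the frequency and day-night features named in the text.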
For another example, for the seventh risk scenario, the anomaly detection model can promptly discover suspicious personnel accounts by comparing a user's historical behavior baseline with the behavior baselines of employees in the same group, and through investigation and analysis it can prevent illegal VPN account operations or account-compromise risks in time.
According to a preferred embodiment, the sample data parsing unit obtains the data information related to the risk scenario to be handled from the data acquisition module 1, and the method by which it parses this data information to obtain key log sample data includes:
establishing a plurality of log templates according to the data information related to the risk scene to be handled; and according to the established log templates, analyzing and processing the data information related to the risk scene to be handled to obtain key log sample data.
According to a preferred embodiment, the method for creating a plurality of log templates according to the data information related to the risk scenario to be handled comprises:
determining a message type according to template words and parameter words in the data information related to the risk scene to be handled;
and establishing a plurality of log templates according to the determined message types.
An electronic device includes: a memory having stored thereon computer executable instructions for executing the computer executable instructions to perform the steps of:
and establishing an anomaly detection model according to the plurality of types of key log sample data and the machine learning training model.
The hardware structure of the electronic device may include: a processor, a communication interface, a computer readable medium and a communication bus; wherein the processor, the communication interface and the computer readable medium complete the communication with each other through a communication bus;
preferably, the communication interface may be an interface of a communication module, such as an interface of a GSM module;
wherein the processor may be specifically configured to execute an executable program stored on the memory, thereby performing all or part of the processing steps of any of the method embodiments described above.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device, discrete gate or transistor logic device, or discrete hardware component. The methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed by such a processor. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
A computer storage medium, wherein computer executable instructions are stored on the computer storage medium, and when executed, the computer executable instructions establish an anomaly detection model based on a plurality of types of key log sample data and a machine learning training model.
The computer storage medium has stored thereon computer executable instructions that when executed perform the steps of:
acquiring data information related to the risk scene to be handled, which is related to the behavior of a user entity, from a data acquisition module 1;
according to the established log template, analyzing and processing the data information related to the risk scene to be handled to obtain key log sample data;
and establishing an anomaly detection model according to the plurality of types of key log sample data and the machine learning training model.
A method of user entity behavior analysis of a persistent immune security system, the method comprising:
the user inputs a risk scene to be treated, which is required to be treated, through an input module;
the data acquisition module 1 acquires the risk scene to be handled which is input by the input module, and acquires data information corresponding to a user according to the risk scene to be handled;
The abnormality detection module 2 analyzes the to-be-handled risk scene input by the input module and the data information related to the to-be-handled risk scene acquired by the data acquisition module 1, so as to make a user portrait for the user and/or the information system by using a user entity behavior analysis technology, and judges whether abnormal activities and/or abnormal processes exist in the user and/or the information system based on the formed user portrait.
An electronic device includes: a memory having stored thereon computer executable instructions for executing the computer executable instructions to perform the steps of:
and establishing an anomaly detection model according to the plurality of types of key log sample data and the machine learning training model.
It should be noted that the above-described embodiments are exemplary, and that a person skilled in the art, in light of the present disclosure, may devise various alternative solutions that likewise fall within the scope of the present disclosure. It should be understood by those skilled in the art that the description and drawings are illustrative and do not limit the claims. The scope of the invention is defined by the claims and their equivalents.
The present specification contains several inventive concepts, such as those introduced by "preferably", "according to a preferred embodiment", or "optionally"; each such paragraph discloses a separate concept, and the applicant reserves the right to file a divisional application according to each inventive concept.

Claims (8)

1. An anomaly detection optimization method based on trusted computing,
characterized in that it comprises at least:
the method comprises the steps that data information of a user, which is related to a risk scene to be handled, is acquired through a data acquisition module (1), and the data acquisition module (1) which is in data connection with an input module acquires the risk scene to be handled, which is required to be handled and is input by the user through the input module;
inputting the data information of the acquired user associated with the risk scene to be handled into an anomaly detection model group through an anomaly detection module (2), wherein the anomaly detection model group comprises a plurality of anomaly detection models, and performing anomaly detection judgment on the data information of the acquired user associated with the risk scene to be handled according to the anomaly detection judgment strategy by the anomaly detection models in the anomaly detection model group and outputting a detection result,
The anomaly detection module (2) comprises a white list generation unit (201) capable of generating a white list matched with the safety requirements of different users based on application scenes of the users and/or the safety situation monitored by the anomaly detection module (2), and a user entity behavior analysis unit (202) capable of monitoring and analyzing processes or programs running on the white list of the users to monitor whether the processes or programs running on the white list of the users are anomalous.
2. The anomaly detection optimization method according to claim 1, wherein the method for acquiring data information of a user associated with a risk scenario to be handled by the data acquisition module (1) is as follows: the risk scene to be handled is input through an input module to serve as the risk scene to be handled by the abnormality detection module (2), and only data information related to the risk scene to be handled of a user is acquired through the data acquisition module (1).
3. The anomaly detection optimization method according to claim 2, wherein the data acquisition module (1) is capable of acquiring a risk scenario entered by the input module, the input module being capable of sending the risk scenario to the data acquisition module (1), wherein the risk scenario is capable of being defined by a user.
4. The abnormality detection optimization method according to claim 3, wherein the abnormality detection judgment of the acquired user data information associated with the risk scene to be handled by the abnormality detection model in the abnormality detection model group according to the abnormality detection judgment policy and outputting a detection result, includes: and if the detection result shows that the data information associated with the risk scene to be handled is abnormal, generating an alarm event.
5. The anomaly detection optimization method of claim 4, wherein a plurality of the anomaly detection models have a cascaded logic processing relationship; the abnormality detection judgment strategy is determined according to the cascaded logic processing relationship; and performing, by the abnormality detection models in the abnormality detection model group, abnormality detection judgment on the acquired data information of the user associated with the risk scene to be handled according to the abnormality detection judgment strategy and outputting a detection result includes: if the output of a preceding abnormality detection model indicates that the acquired data information of the user associated with the risk scene to be handled is normal, the preceding abnormality detection model forwards that data information to the next abnormality detection model, which performs abnormality detection judgment on it and outputs a detection result.
6. The anomaly detection optimization method of claim 5, wherein a plurality of the anomaly detection models have a parallel logical processing relationship; the abnormality detection judgment strategy is determined according to the parallel logical processing relationship; and performing, by the abnormality detection models in the abnormality detection model group, abnormality detection judgment on the acquired data information related to the risk scene to be handled according to the abnormality detection judgment strategy and outputting a detection result includes: the plurality of abnormality detection models perform abnormality detection judgment on the acquired data information related to the risk scene to be handled in parallel and output detection results.
7. An anomaly detection optimization system based on trusted computing, comprising at least:
the data acquisition module (1) is configured to be capable of acquiring data information of a user associated with a risk scene to be handled, and the data acquisition module (1) in data connection with the input module acquires the risk scene to be handled of a required handling input by the user through the input module;
an anomaly detection module (2) configured to be able to input data information of an acquired user associated with a risk scenario to be handled into an anomaly detection model group, and perform anomaly detection judgment on the data information of the acquired user associated with the risk scenario to be handled by an anomaly detection model in the anomaly detection model group according to an anomaly detection judgment policy and output a detection result,
The anomaly detection module (2) comprises a white list generation unit (201) capable of generating a white list matched with the safety requirements of different users based on application scenes of the users and/or the safety situation monitored by the anomaly detection module (2), and a user entity behavior analysis unit (202) capable of monitoring and analyzing processes or programs running on the white list of the users to monitor whether the processes or programs running on the white list of the users are anomalous.
8. An electronic device, comprising: a memory having stored thereon computer executable instructions; and a processor configured to execute the computer executable instructions to perform the trusted computing based anomaly detection optimization method of any one of claims 1 to 6.
CN202111212808.2A 2021-10-18 2021-10-18 Anomaly detection optimization device, method and system based on trusted computing Active CN113923037B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111212808.2A CN113923037B (en) 2021-10-18 2021-10-18 Anomaly detection optimization device, method and system based on trusted computing

Publications (2)

Publication Number Publication Date
CN113923037A CN113923037A (en) 2022-01-11
CN113923037B true CN113923037B (en) 2024-03-26

Family

ID=79241413

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111212808.2A Active CN113923037B (en) 2021-10-18 2021-10-18 Anomaly detection optimization device, method and system based on trusted computing

Country Status (1)

Country Link
CN (1) CN113923037B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115146263B (en) * 2022-09-05 2022-12-16 北京微步在线科技有限公司 User account collapse detection method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110505196A (en) * 2019-07-02 2019-11-26 中国联合网络通信集团有限公司 Internet of Things network interface card method for detecting abnormality and device
WO2020038353A1 (en) * 2018-08-21 2020-02-27 瀚思安信(北京)软件技术有限公司 Abnormal behavior detection method and system
CN110855649A (en) * 2019-11-05 2020-02-28 西安交通大学 Method and device for detecting abnormal process in server
CN111865960A (en) * 2020-07-15 2020-10-30 北京市燃气集团有限责任公司 Network intrusion scene analysis processing method, system, terminal and storage medium
CN112364286A (en) * 2020-11-23 2021-02-12 北京八分量信息科技有限公司 Method and device for abnormality detection based on UEBA and related product


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant