CN116248371A - Method, device, equipment and storage medium for identifying abnormal message - Google Patents

Publication number
CN116248371A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310109891.3A
Other languages
Chinese (zh)
Inventor
李任鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202310109891.3A
Publication of CN116248371A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The disclosure provides a method, apparatus, device and storage medium for identifying an abnormal message, and relates to the technical field of the Internet, in particular to the technical fields of data security, big data, cloud computing and the like. The method for identifying an abnormal message comprises: receiving a target access request message, wherein the target access request message comprises target identification information of a plurality of dimensions; matching the target identification information of each dimension of the plurality of dimensions against pre-acquired intelligence information of that dimension to obtain a matching result of each dimension, wherein the intelligence information is acquired based on abnormality identification results of historical access request messages; acquiring target encoding information of the target access request message based on the matching result of each dimension; and acquiring an abnormality identification result of the target access request message based on the target encoding information. The present disclosure can improve identification accuracy.

Description

Method, device, equipment and storage medium for identifying abnormal message
Technical Field
The disclosure relates to the technical field of the Internet, in particular to the technical fields of data security, big data, cloud computing and the like, and more specifically to a method, apparatus, device and storage medium for identifying an abnormal message.
Background
Attacks by web crawlers or malicious parties may cause network traffic anomalies. In order to protect data security, it is necessary to identify an abnormal access request message in time.
Disclosure of Invention
The present disclosure provides a method, apparatus, device and storage medium for identifying an abnormal message.
According to an aspect of the present disclosure, there is provided a method for identifying an abnormal message, including: receiving a target access request message, wherein the target access request message comprises target identification information of a plurality of dimensions; matching the target identification information of each dimension of the plurality of dimensions against pre-acquired intelligence information of that dimension to obtain a matching result of each dimension, wherein the intelligence information is acquired based on abnormality identification results of historical access request messages; acquiring target encoding information of the target access request message based on the matching result of each dimension; and acquiring an abnormality identification result of the target access request message based on the target encoding information.
According to another aspect of the present disclosure, there is provided an apparatus for identifying an abnormal message, including: a receiving module, configured to receive a target access request message, wherein the target access request message comprises target identification information of a plurality of dimensions; a matching module, configured to match the target identification information of each dimension of the plurality of dimensions against pre-acquired intelligence information of that dimension to obtain a matching result of each dimension, wherein the intelligence information is acquired based on abnormality identification results of historical access request messages; an encoding module, configured to acquire target encoding information of the target access request message based on the matching result of each dimension; and a determining module, configured to acquire an abnormality identification result of the target access request message based on the target encoding information.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the above aspects.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method according to any one of the above aspects.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method according to any of the above aspects.
The technical solution of the present disclosure can improve identification accuracy.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are provided for a better understanding of the present solution and are not to be construed as limiting the present disclosure. In the drawings:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram of an application scenario provided according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an overall architecture provided in accordance with an embodiment of the present disclosure;
FIG. 4 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 5 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 6 is a schematic diagram according to a fourth embodiment of the present disclosure;
FIG. 7 is a schematic diagram of an electronic device for implementing a method for identifying an abnormal message according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
By analyzing historical access request messages (for example, using statistical information or an algorithm model), the state of each historical access request message can be determined as abnormal or normal. Identification information (ID) contained in an abnormal historical access request message may be referred to as a black ID, and identification information contained in a normal historical access request message may be referred to as a white ID.
Intelligence information, a term from the field of network security, is useful information that can be collected, processed and analyzed to help decision makers take corrective action.
In the embodiments of the present disclosure, the intelligence information includes the above-mentioned black IDs and/or white IDs.
In the related art, abnormality recognition is directly performed based on identification information included in an access request message, for example, if the identification information included in the access request message belongs to a predetermined black ID, the access request message is considered to be an abnormal message.
However, this method has a problem of insufficient recognition accuracy.
In order to improve recognition accuracy, the present disclosure provides the following embodiments.
FIG. 1 is a schematic diagram of a first embodiment of the present disclosure. This embodiment provides a method for identifying an abnormal message, the method including:
101. Receiving a target access request message, wherein the target access request message includes target identification information of a plurality of dimensions.
102. Matching the target identification information of each dimension of the plurality of dimensions against pre-acquired intelligence information of that dimension to obtain a matching result of each dimension, wherein the intelligence information is acquired based on abnormality identification results of historical access request messages.
103. Acquiring target encoding information of the target access request message based on the matching result of each dimension.
104. Acquiring an abnormality identification result of the target access request message based on the target encoding information.
The target access request message is the network access request message to be identified, for example a Hypertext Transfer Protocol (HTTP) request message.
The target identification information is the identification information (abbreviated as ID) contained in the target access request message.
The target identification information is identification information of a plurality of dimensions.
The identification information of the plurality of dimensions includes, for example, at least two of: an Internet Protocol (IP) address, an IP C-segment address (IPC), User-Agent (UA) information, a JA3 fingerprint, and the like.
The IP C-segment address refers to a class C IP address, in which the first three segments of numbers form the network number and the remaining segment is the number of the local computer. In binary form, a class C IP address consists of a 3-byte network address and a 1-byte host address.
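As an illustration of the C-segment dimension, the following minimal sketch derives the 3-byte network part of an IPv4 address using Python's standard `ipaddress` module. Treating the C segment as a /24 prefix, the function name `c_segment` and the sample addresses are assumptions made for this example and are not taken from the disclosure.

```python
import ipaddress


def c_segment(ip: str) -> str:
    """Return the network part (first three bytes) of an IPv4 address.

    Assumption: the C segment is modelled as a /24 prefix, i.e. a 3-byte
    network address followed by a 1-byte host address.
    """
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


# Two hosts in the same C segment map to the same value.
print(c_segment("192.0.2.57"))   # 192.0.2.0
print(c_segment("192.0.2.200"))  # 192.0.2.0
```

With such a helper, the IP dimension and the C-segment dimension can both be derived from the same request field.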
The UA information is a special string header. By reading it, a web server can determine the operating system version, central processing unit (CPU) type, browser version and the like used by the user. The web server can also send different pages to the client depending on the UA information.
The JA3 fingerprint, which is computed from the client's TLS handshake and is also referred to here as a browser fingerprint, can be used to identify the browser and does not change when the user changes the IP address or UA information.
Since the identification information is multi-dimensional, the corresponding intelligence information is also multi-dimensional, and the intelligence information of each dimension can include the black IDs and/or white IDs of that dimension. For example, if the intelligence information of each dimension includes the black IDs of that dimension and the plurality of dimensions includes an IP dimension and a UA dimension, the intelligence information of the plurality of dimensions includes: black IP and black UA.
In the matching process, for example, the target identification information includes a target IP and a target UA, and the intelligence information includes black IP and black UA. The target IP can be matched against the black IP to obtain the matching result of the IP dimension, and the target UA can be matched against the black UA to obtain the matching result of the UA dimension.
The intelligence information of each dimension includes at least one piece of history identification information. For the IP dimension, for example, the black IP includes at least one IP address; similarly, the black UA includes at least one piece of UA information.
Taking the matching process for the IP dimension as an example, assume that the target IP is denoted IP0 and the black IP includes IP1 and IP2. One option is to determine whether IP0 is the same as either IP1 or IP2: if IP0 is the same as IP1 or the same as IP2, the matching result of the IP dimension is a match, which can be represented by 1; otherwise, if IP0 differs from both IP1 and IP2, the matching result of the IP dimension is a non-match, which can be represented by 0. Alternatively, the matching result of IP0 with IP1 and the matching result of IP0 with IP2 can be obtained separately and combined into the matching result of the IP dimension. For example, if IP0 is the same as IP1, the matching result of IP0 with IP1 can be represented by 1; if IP0 differs from IP2, the matching result of IP0 with IP2 can be represented by 0; the matching result of the IP dimension can then be represented by 10.
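The two matching options described for the IP dimension can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names `match_any` and `match_each` and the sample addresses standing in for IP0, IP1 and IP2 are assumptions.

```python
from typing import Iterable, List


def match_any(target_id: str, black_ids: Iterable[str]) -> str:
    """First option: '1' if the target equals any black ID, otherwise '0'."""
    return "1" if any(target_id == black_id for black_id in black_ids) else "0"


def match_each(target_id: str, black_ids: List[str]) -> str:
    """Second option: one digit per black ID, combined in list order."""
    return "".join("1" if target_id == black_id else "0" for black_id in black_ids)


# Illustrative stand-ins for IP1 and IP2 (black IP) and for IP0 (target IP).
black_ip = ["203.0.113.1", "203.0.113.2"]
ip0 = "203.0.113.1"

print(match_any(ip0, black_ip))   # 1
print(match_each(ip0, black_ip))  # 10
```

The second option keeps which black ID was hit, at the cost of a result whose length grows with the size of the black list.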
After the matching results are obtained, the target encoding information can be obtained based on the matching result of each dimension. An encoding policy can be configured in advance, and the matching results of the dimensions are encoded according to that policy.
For example, with three dimensions whose matching results are 1, 0 and 1 respectively, if the encoding policy is combination, the combined data (i.e., 101) can be used as the target encoding information. Alternatively, the encoding policy can include combination followed by a hash operation: the three matching results are first combined, a hash operation is then performed on the combined data (i.e., 101), and the resulting hash value is used as the target encoding information.
A hash operation is an operation performed using a hash function. Its input is information of arbitrary length, which the hash function transforms into fixed-length output data, such as a combination of letters and digits; this output is the hash value. Hash functions include, for example, the Message-Digest (MD) algorithms, the Secure Hash Algorithm (SHA) family, and the like.
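The two encoding policies can be sketched with Python's standard `hashlib` module. SHA-256 is chosen here only as one member of the SHA family; the function names and the fixed ordering of the dimensions are assumptions made for this example.

```python
import hashlib
from typing import Sequence


def encode_concat(results: Sequence[str]) -> str:
    """Combination policy: join the per-dimension results in a fixed order."""
    return "".join(results)


def encode_hash(results: Sequence[str]) -> str:
    """Combination plus hash policy: hash the combined string with SHA-256."""
    combined = encode_concat(results)
    return hashlib.sha256(combined.encode("ascii")).hexdigest()


results = ["1", "0", "1"]  # matching results of three dimensions
print(encode_concat(results))     # 101
print(len(encode_hash(results)))  # 64 hexadecimal characters, whatever the input length
```

The order of the dimensions has to stay the same for every message; otherwise identical codes would not describe identical matching patterns.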
After the target encoding information is obtained, the target access request message may be subjected to anomaly identification based on the target encoding information.
The abnormality identification can be offline identification: for example, a plurality of target access request messages are obtained from log data, and the abnormal target access request messages among them are identified based on the target encoding information. Alternatively, the abnormality identification can be online identification, in which the current single target access request message is identified in real time to determine whether it is an abnormal target access request message.
In this embodiment, the matching result between the target identification information and the intelligence information is obtained in each dimension, the target encoding information is generated from the matching results of all dimensions, and the abnormality identification result of the target access request message is obtained based on the target encoding information. Because the target encoding information fuses the information of a plurality of dimensions, identification accuracy can be improved compared with identifying directly on the target identification information.
In order to better understand the embodiments of the present disclosure, application scenarios to which the embodiments of the present disclosure are applicable are described below. The present embodiment takes a web crawler as an example.
A web crawler (also known as a web spider, web robot or web chaser) is a program or script that automatically crawls web information according to certain rules. Less common names include ant, automatic indexer, emulator and worm.
Taking the case where the crawled object is a web resource of a business party as an example, as shown in FIG. 2, a crawler program or script can be deployed on a user terminal 201 and the web resource can be deployed on a server 202. The user terminal 201 uses the program or script to automatically send access request messages to the server 202 in order to crawl the web resource on the server 202. The access request message is, for example, an HTTP request message. The user terminal includes, for example, a personal computer, a notebook computer, a mobile device (e.g., a mobile phone), and the like. The server can be a local server or a cloud server. The user terminal and the server can communicate over a wired network and/or a wireless network.
A web crawler can generate a large number of access request messages in a short time. The resulting excessive network traffic can seriously interfere with the normal business processing of the business party and may endanger its data security, so abnormal messages (such as access request messages generated by web crawlers) need to be identified.
In the related art, processing is generally performed directly according to the target identification information (target ID) contained in the target access request message. For example, after black IDs are obtained from historical data (a plurality of black IDs generally form a black ID list), if the target ID contained in the target access request message belongs to the black ID list, the target access request message is considered an abnormal message, and a preset abnormality handling policy can then be applied to it, for example denying the access request.
In addition, for identification information of a plurality of dimensions, the related art commonly pre-configures a rule that the ID of each dimension must satisfy and identifies whether the target access request message is an abnormal message based on those rules. For example, the IDs of the plurality of dimensions include a first ID and a second ID, and the pre-configured rule is that the first ID satisfies a first condition and the second ID satisfies a second condition; if the first ID in the target access request message satisfies the first condition and the second ID in the target access request message satisfies the second condition, the target access request message is determined to be an abnormal message. Although considering IDs of a plurality of dimensions can improve identification accuracy relative to a single dimension, rules must be configured for each dimension separately, so the configuration workload is heavy. Moreover, the comparison must be performed dimension by dimension, which is inefficient. Finally, because the ID of each dimension is compared separately, the dimensions remain independent of one another and the information across dimensions cannot be fused, so identification accuracy still needs to be improved.
In this embodiment, as shown in FIG. 3, the target access request message includes target identification information of a plurality of dimensions, and intelligence information of the plurality of dimensions can also be obtained. In each dimension, the target identification information of that dimension can be matched against the intelligence information of the same dimension to obtain the matching result of that dimension. After the matching results of all dimensions are encoded, the target encoding information of the target access request message is obtained. Abnormality identification can then be performed based on the target encoding information to obtain the abnormality identification result of the target access request message.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision and disclosure of users' personal information all comply with relevant laws and regulations and do not violate public order and good morals.
In this embodiment, abnormality identification is performed based on the target encoding information, which is obtained by encoding the matching results of all dimensions, so the information of the dimensions is fused and identification accuracy is improved. In addition, because the target encoding information is the result of fusing the information of the plurality of dimensions, there is no need to configure one policy for the identification information of each dimension; only a single unified policy needs to be configured for the target encoding information. Compared with configuring multiple policies, this reduces workload and improves both identification efficiency and accuracy.
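The overall flow of FIG. 3 (match per dimension, encode, then apply one unified policy to the code) can be sketched as below. The passage above does not specify the form of the unified policy, so the set `ABNORMAL_CODES`, the dimension names and the sample data are purely hypothetical choices made for illustration.

```python
from typing import Dict, Set

# Fixed dimension order so that every message yields a comparable code.
DIMENSIONS = ("ip", "ua", "ja3")

# Hypothetical unified policy: codes in which at least two dimensions match
# a black ID are treated as abnormal. The disclosure does not prescribe this.
ABNORMAL_CODES = {"110", "101", "011", "111"}


def target_code(message: Dict[str, str], black_ids: Dict[str, Set[str]]) -> str:
    """Match each dimension against its black IDs and combine the results."""
    digits = []
    for dimension in DIMENSIONS:
        value = message.get(dimension)
        digits.append("1" if value in black_ids.get(dimension, set()) else "0")
    return "".join(digits)


def is_abnormal(message: Dict[str, str], black_ids: Dict[str, Set[str]]) -> bool:
    """Apply the single policy to the fused code instead of per-dimension rules."""
    return target_code(message, black_ids) in ABNORMAL_CODES


black_ids = {"ip": {"203.0.113.1"}, "ua": {"bad-bot/1.0"}, "ja3": set()}
crawler = {"ip": "203.0.113.1", "ua": "bad-bot/1.0", "ja3": "fp-a"}
visitor = {"ip": "203.0.113.1", "ua": "Mozilla/5.0", "ja3": "fp-b"}

print(target_code(crawler, black_ids), is_abnormal(crawler, black_ids))  # 110 True
print(target_code(visitor, black_ids), is_abnormal(visitor, black_ids))  # 100 False
```

Only `ABNORMAL_CODES` has to change when the policy changes; no per-dimension rule is configured.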
In combination with the application scenario, the present disclosure further provides an abnormal message identification method.
FIG. 4 is a schematic diagram of a second embodiment of the present disclosure. This embodiment provides a method for identifying an abnormal message, the method including:
401. Acquiring historical access request messages, wherein the historical access request messages include history identification information of a plurality of dimensions.
The historical access request messages can be acquired periodically from log data. For example, yesterday's log data, which records yesterday's access request messages, is acquired today, and yesterday's access request messages are used as today's historical access request messages.
The identification information contained in the history access request message may be referred to as history identification information.
A large number of historical access request messages is usually obtained. Different historical access request messages can contain identification information of the same or different dimensions, and a single historical access request message can contain identification information of one or more dimensions.
For example, the first historical access request message includes a first historical IP address and first historical UA information, the second historical access request message includes a second historical IP address, and the third historical access request message includes a third historical IP address, second historical UA information, a first JA3 fingerprint, and the like.
The history identification information of three dimensions, that is, the history identification information of the IP dimension (specifically including the first history IP address, the second history IP address, and the third history IP address), the history identification information of the UA dimension (specifically including the first history UA information and the second history UA information), and the history identification information of the JA3 fingerprint dimension (specifically including the first JA3 fingerprint) may be obtained based on the above-described history access request message.
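The grouping of history identification information by dimension can be sketched as follows. Representing each message as a dictionary keyed by dimension name, and the placeholder values, are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Set


def collect_history_ids(messages: List[Dict[str, str]]) -> Dict[str, Set[str]]:
    """Group the identification information of all messages by dimension."""
    by_dimension: Dict[str, Set[str]] = defaultdict(set)
    for message in messages:
        for dimension, value in message.items():
            by_dimension[dimension].add(value)
    return dict(by_dimension)


# Three messages mirroring the example: each carries a different set of dimensions.
history = [
    {"ip": "198.51.100.1", "ua": "ua-1"},
    {"ip": "198.51.100.2"},
    {"ip": "198.51.100.3", "ua": "ua-2", "ja3": "ja3-1"},
]
ids = collect_history_ids(history)
print(sorted(ids))     # ['ip', 'ja3', 'ua']
print(len(ids["ip"]))  # 3
```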
402. Acquiring abnormality identification results of the historical access request messages, wherein each abnormality identification result is normal or abnormal.
The historical access request messages can be identified using an existing abnormality identification policy, for example a policy based on statistical information or on an algorithm model.
For example, when identification is based on statistical information, the historical access request messages within a preset period (for example, each hour) can be obtained. If, within that period, the number of historical access request messages carrying the same identification information (for example, the same IP address) is greater than a preset number, and the proportion of logged-in users among those historical access request messages is smaller than a preset proportion (the historical access request messages can include the user's login state, from which the proportion can be obtained), the abnormality identification result of those historical access request messages is determined to be abnormal.
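The statistics-based rule can be sketched as below for a single time window. The parameter names `max_requests` and `min_login_ratio`, the message layout, and the reading of the proportion as the fraction of messages sent in a logged-in state are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def abnormal_ips(messages: Iterable[Dict], max_requests: int,
                 min_login_ratio: float) -> Set[str]:
    """Flag IPs with too many requests and too few logged-in requests."""
    total = defaultdict(int)
    logged_in = defaultdict(int)
    for message in messages:
        ip = message["ip"]
        total[ip] += 1
        if message["logged_in"]:
            logged_in[ip] += 1
    return {
        ip for ip, count in total.items()
        if count > max_requests and logged_in[ip] / count < min_login_ratio
    }


# One window (e.g. one hour) of illustrative traffic.
window = ([{"ip": "198.51.100.7", "logged_in": False}] * 5
          + [{"ip": "198.51.100.8", "logged_in": True}] * 2)
print(abnormal_ips(window, max_requests=3, min_login_ratio=0.5))  # {'198.51.100.7'}
```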
For another example, when identification is based on an algorithm model, an abnormality identification model can be trained in advance. The model is a deep neural network model whose input is identification information and whose output is an abnormality identification result. The history identification information contained in a historical access request message is therefore input to the abnormality identification model, and the model outputs the abnormality identification result of that historical access request message.
403. For each dimension of the plurality of dimensions, generating the intelligence information of that dimension based on the history identification information of that dimension contained in normal historical access request messages and/or the history identification information of that dimension contained in abnormal historical access request messages.
After the abnormality identification results of the historical access request messages are obtained using the statistical information or the algorithm model, the history identification information contained in abnormal historical access request messages is used as black IDs, and the history identification information contained in normal historical access request messages is used as white IDs.
Since a large number of historical access request messages contain identification information of a plurality of dimensions, black IDs and/or white IDs of a plurality of dimensions can be obtained. For example, the number of black IPs is 15219, the number of white IPs is 1937, the number of black IPCs is 3259, the number of white IPCs is 5187, the number of black UAs is 3658, the number of white UAs is 3903, and so on.
The black IDs and/or white IDs of the plurality of dimensions constitute the intelligence information of the plurality of dimensions, i.e., the intelligence information of each dimension includes the black IDs and/or white IDs of that dimension.
Assuming that the plurality of dimensions includes an IP dimension, an IPC dimension, a UA dimension and a JA3 fingerprint dimension, and that the intelligence information of each dimension includes the black IDs of that dimension, the intelligence information of the plurality of dimensions includes: black IP, black IPC, black UA and black JA3 fingerprints. The number of black IDs in each dimension is one or more.
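Generating per-dimension intelligence information from labelled history can be sketched as follows. The function name `build_intelligence`, the message layout and the sample values are assumptions; the sketch also does not resolve the case where the same ID appears in both an abnormal and a normal message, which the passage above leaves open.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Message = Dict[str, str]
Intelligence = Dict[str, Set[str]]


def build_intelligence(
    labeled: Iterable[Tuple[Message, bool]]
) -> Tuple[Intelligence, Intelligence]:
    """Split history IDs into black and white sets, one set per dimension."""
    black: Intelligence = defaultdict(set)
    white: Intelligence = defaultdict(set)
    for message, abnormal in labeled:
        bucket = black if abnormal else white
        for dimension, value in message.items():
            bucket[dimension].add(value)
    return dict(black), dict(white)


# Each entry pairs a history message with its abnormality identification result.
labeled_history = [
    ({"ip": "198.51.100.7", "ua": "ua-1"}, True),   # abnormal
    ({"ip": "198.51.100.8", "ua": "ua-2"}, False),  # normal
]
black_ids, white_ids = build_intelligence(labeled_history)
print(black_ids["ip"])  # {'198.51.100.7'}
print(white_ids["ua"])  # {'ua-2'}
```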
Steps 401 to 403 above can be performed offline, i.e., before the target access request message is received.
In this embodiment, intelligence information of a plurality of dimensions is generated based on the abnormality identification results of the historical access request messages. This provides the base data for generating the subsequent target encoding information, so that the target encoding information can be obtained efficiently and simply.
404. And receiving a target access request message, wherein the target access request message comprises target identification information of multiple dimensions.
The target access request message refers to a network access request message to be identified. The identification information contained in the target access request message may be referred to as target identification information.
Assuming that the multiple dimensions include an IP dimension and a UA dimension, the target identification information of the multiple dimensions may be represented as a target IP and a target UA.
405. Matching the target identification information of each dimension of the plurality of dimensions against the pre-acquired intelligence information of that dimension to obtain a matching result of each dimension.
The intelligence information of each dimension includes at least one piece of history identification information.
Matching the target identification information of each dimension against the pre-acquired intelligence information of that dimension to obtain the matching result of each dimension includes:
for a target dimension, the target dimension being any one of the plurality of dimensions, if the target identification information of the target dimension is the same as any piece of history identification information included in the intelligence information of the target dimension, determining that the matching result of the target dimension is a match; or,
for a target dimension, the target dimension being any one of the plurality of dimensions, matching the target identification information of the target dimension against each piece of history identification information included in the intelligence information of the target dimension in turn to obtain a matching result between the target identification information and each piece of history identification information, and obtaining the matching result of the target dimension from those matching results.
In the matching process, for example, the target identification information includes a target IP and a target UA, the information includes a black IP and a black UA, and the matching process may be performed on the target IP and the black IP, so as to obtain a matching result in the IP dimension, and the matching process may be performed on the target UA and the black UA, so as to obtain a matching result in the UA dimension.
The intelligence information of each dimension includes at least one history identification information, such as for the IP dimension, the black IP specifically includes at least one IP address, and similarly, the black UA includes at least one UA information.
Taking matching processing for the IP dimension as an example, assuming that the target IP is represented by IP0, the black IP includes IP1 and IP2, and may be that whether IP0 is the same as any one of IP1 and IP2 is determined, if IP0 is the same as IP1, or IP0 is the same as IP2, a matching result of the IP dimension is obtained and is represented by 1; otherwise, if IP0 is different from IP1 and IP2, the matching result of the IP dimension is not matching, which can be represented by 0. Alternatively, the matching result of IP0 and IP1 and the matching result of IP0 and IP2 may be obtained, and the matching result of IP0 and IP1 and the matching result of IP0 and IP2 may be combined into the matching result of IP dimension. For example, if IP0 is the same as IP1, the matching result of IP0 and IP1 may be denoted by 1, similarly, if IP0 is different from IP2, the matching result of IP0 and IP2 may be denoted by 0, and the matching result of IP dimension may be denoted by 10.
In some embodiments, for each dimension, matching may be performed for the target identification information and the overall information of the corresponding dimension, that is, whether the target identification information is the same as any item of information of the corresponding dimension is compared, so as to obtain a matching result of the corresponding dimension.
In some embodiments, for each dimension, matching may be performed for the target identification information and each information of the corresponding dimension, and a matching result of the corresponding dimension may be obtained based on the matching result of the target identification information and each information, so that accuracy of the matching result may be improved.
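The two matching modes described above can be sketched as follows. The function names `match_any` and `match_each` are hypothetical, and the strings "IP1" and "IP2" stand in for real addresses; the 0/1 representation follows the IP0/IP1/IP2 example in the text.

```python
def match_any(target_id, black_ids):
    """Whole-dimension matching: '1' if the target equals any black ID, else '0'."""
    return "1" if target_id in black_ids else "0"


def match_each(target_id, black_ids):
    """Per-entry matching: one bit per black ID, in list order."""
    return "".join("1" if target_id == black_id else "0" for black_id in black_ids)


black_ips = ["IP1", "IP2"]

# A target IP equal to the first black IP.
print(match_any("IP1", black_ips))   # 1
print(match_each("IP1", black_ips))  # 10

# A target IP that equals neither black IP.
print(match_any("IP0", black_ips))   # 0
print(match_each("IP0", black_ips))  # 00
```

The per-entry result records which black ID was hit, at the cost of a result whose length grows with the number of black IDs in the dimension.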
406. Obtain the target coding information of the target access request message based on the matching results of the dimensions.
Steps 404-406 may be performed online, i.e., in real time after the target access request message is received.
After the matching results of all dimensions are obtained, they may be combined to obtain combined data, and the target coding information is obtained based on the combined data.
Combining means concatenating the matching results of the dimensions. For example, if the matching results of two dimensions are 1 and 0 respectively, the combined data is 10.
In this embodiment, the matching results of the dimensions are combined and the target coding information is obtained from the combined data, so that the target coding information fuses the intelligence information of all dimensions, which in turn can improve identification accuracy when anomaly identification is performed based on the target coding information.
Further, the combined data may be directly used as target encoding information. Alternatively, hash operation may be performed on the combined data to obtain a hash value, and the hash value is used as the target encoding information.
For example, the combined data is 10, and the hash operation may be performed on the data 10, and the obtained hash value may be used as the target encoding information.
In this embodiment, by performing hash operation on the combined data and using the hash value as the target encoding information, the uniqueness of the target encoding information can be ensured, and encoding conflicts can be avoided.
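A minimal sketch of the combination and hashing step is given below. The disclosure does not name a hash function; SHA-256 from Python's standard `hashlib` module is used here purely as an assumption, and the helper names `combine` and `encode` are hypothetical.

```python
import hashlib


def combine(match_results, dimension_order):
    """Concatenate per-dimension matching results in a fixed dimension order."""
    return "".join(match_results[dim] for dim in dimension_order)


def encode(combined, use_hash=True):
    """Use the combined data directly, or its hash value, as the coding information."""
    if not use_hash:
        return combined
    # SHA-256 is an assumed choice; the source only says "hash operation".
    return hashlib.sha256(combined.encode("ascii")).hexdigest()


combined = combine({"ip": "1", "ua": "0"}, ("ip", "ua"))
code = encode(combined)
print(combined)  # 10
print(code)
```

The dimension order has to be the same for every message, otherwise identical matching results could produce different coding information.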
407. Obtain the anomaly identification result of the target access request message based on the target coding information.
The anomaly identification may be performed offline or online.
When anomaly identification is performed offline, a large number of target access request messages may be processed together to identify the abnormal messages among them.
Specifically, this includes: deduplicating the target coding information generated within a preset duration to obtain deduplicated coding information; and determining the abnormal target access request messages among the multiple target access request messages based on a preset offline identification policy and the deduplicated coding information.
For example, if the preset duration is 3 days, the target coding information generated within those 3 days may be deduplicated to obtain the deduplicated coding information. The abnormal target access request messages can then be identified among the multiple target access request messages based on the offline identification policy and the deduplicated coding information.
The offline identification policy may be similar to the anomaly identification policy used for historical access request messages; for example, it may be based on statistical information or on an algorithm model. The difference is that historical access request messages are identified based on historical identification information, whereas target access request messages are identified based on target coding information.
Taking an algorithm model as an example, an identification model may be trained in advance whose input is coding information and whose output is an anomaly identification result. When a target access request message is processed, its target coding information is input into the identification model to obtain the anomaly identification result of the target access request message.
In this embodiment, the target coding information within the preset duration is subjected to the deduplication processing to obtain the coding information after the deduplication processing, and further, the anomaly identification is performed based on the coding information after the deduplication processing, so that the resource waste can be avoided, and the processing efficiency is improved.
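The offline path can be sketched as below. The record layout (timestamp, coding information), the helper names, and the stand-in policy `is_abnormal_code` are assumptions made for illustration; the disclosure leaves the offline identification policy open (statistics-based or model-based).

```python
from datetime import datetime, timedelta


def in_window(ts, now, window):
    """True if the timestamp falls within the preset duration ending at `now`."""
    return now - window <= ts <= now


def dedup_codes(records, now, window):
    """Distinct coding information seen within the window, in first-seen order."""
    recent = (code for ts, code in records if in_window(ts, now, window))
    return list(dict.fromkeys(recent))


def offline_identify(records, now, window, is_abnormal_code):
    """Evaluate the policy once per distinct code, then flag the matching messages."""
    abnormal = {c for c in dedup_codes(records, now, window) if is_abnormal_code(c)}
    return [(ts, c) for ts, c in records if in_window(ts, now, window) and c in abnormal]


now = datetime(2023, 1, 10)
window = timedelta(days=3)  # preset duration from the example in the text
records = [
    (datetime(2023, 1, 9, 8), "11"),
    (datetime(2023, 1, 9, 12), "11"),
    (datetime(2023, 1, 8), "00"),
    (datetime(2023, 1, 1), "10"),  # older than 3 days, ignored
]
unique_codes = dedup_codes(records, now, window)
# Hypothetical policy: treat a message matching both dimensions as abnormal.
flagged = offline_identify(records, now, window, lambda code: code == "11")
print(unique_codes)
print(len(flagged))
```

Because the policy runs once per distinct code rather than once per message, repeated codes add no extra policy evaluations, which is the saving the embodiment refers to.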
When the abnormality identification is performed online, whether the target access request message is an abnormal message or not can be identified in real time aiming at a single target access request message.
Specifically, this includes: acquiring a preset online identification policy, where the online identification policy is used to identify abnormal coding information online; and if the target coding information satisfies the online identification policy, determining that the target access request message is an abnormal target access request message.
For example, the online identification policy may use a blacklist obtained in advance that contains at least one piece of abnormal coding information; if the target coding information is in the blacklist, the target access request message is considered an abnormal message. Alternatively, if no blacklist is available, a pre-trained online algorithm model may be used to identify whether the target access request message is an abnormal message; its input is coding information and its output is an anomaly identification result.
After the target access request message is determined online to be an abnormal message, it may be handled online according to a handling policy; for example, the access requested by the abnormal message may be denied.
In this embodiment, by performing anomaly identification on a single target access request message based on an online identification policy, online anomaly identification can be implemented, so that an anomaly message can be handled in time.
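A minimal sketch of the online path follows, assuming the blacklist is a set of coding information and the optional model is any callable that returns a truthy value for abnormal codes. The function names and the "deny"/"allow" outcomes are hypothetical.

```python
def identify_online(target_code, code_blacklist=None, model=None):
    """Return True if the target coding information is identified as abnormal."""
    if code_blacklist is not None:
        return target_code in code_blacklist
    if model is not None:
        # Fallback when no blacklist is available: a pre-trained model (assumed callable).
        return bool(model(target_code))
    return False


def handle(target_code, code_blacklist=None, model=None):
    """Hypothetical handling policy: deny abnormal requests, allow the rest."""
    return "deny" if identify_online(target_code, code_blacklist, model) else "allow"


blacklist = {"11"}
print(handle("11", blacklist))  # deny
print(handle("10", blacklist))  # allow
```

A set lookup keeps the per-request cost constant, which suits a check that runs in real time on every incoming message.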
Fig. 5 is a schematic diagram of a third embodiment of the present disclosure, which provides an apparatus 500 for identifying an abnormal message. The apparatus includes: a receiving module 501, a matching module 502, an encoding module 503, and a determining module 504.
The receiving module 501 is configured to receive a target access request message, where the target access request message includes target identification information of multiple dimensions; the matching module 502 is configured to match the target identification information of each of the multiple dimensions against the pre-acquired intelligence information of that dimension to obtain a matching result for each dimension, where the intelligence information is acquired based on the anomaly identification results of historical access request messages; the encoding module 503 is configured to obtain the target coding information of the target access request message based on the matching results of the dimensions; and the determining module 504 is configured to obtain the anomaly identification result of the target access request message based on the target coding information.
In this embodiment, the matching result between the target identification information and the intelligence information is obtained in each dimension, the target coding information is generated from the matching results of the dimensions, and the anomaly identification result of the target access request message is obtained based on the target coding information. Because the target coding information fuses intelligence information of multiple dimensions, identification accuracy can be improved compared with identifying directly from the target identification information.
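One way to picture how the four modules cooperate is the sketch below. It is an illustration under assumptions, not the apparatus itself: the class `AnomalyIdentifier`, its method names, whole-dimension matching, and a blacklist-based determination are all choices made here for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class AnomalyIdentifier:
    intelligence: dict                      # dimension -> list of black IDs
    code_blacklist: set = field(default_factory=set)

    def receive(self, raw):
        """Receiving module 501: extract target identification info per dimension."""
        return {dim: raw.get(dim) for dim in self.intelligence}

    def match(self, target):
        """Matching module 502: whole-dimension matching, '1' for a match."""
        return {
            dim: "1" if target[dim] in ids else "0"
            for dim, ids in self.intelligence.items()
        }

    def encode(self, results):
        """Encoding module 503: concatenate results in dimension order."""
        return "".join(results[dim] for dim in self.intelligence)

    def determine(self, code):
        """Determining module 504: blacklist lookup on the coding information."""
        return code in self.code_blacklist

    def identify(self, raw):
        return self.determine(self.encode(self.match(self.receive(raw))))


identifier = AnomalyIdentifier(
    intelligence={"ip": ["IP1", "IP2"], "ua": ["UA1"]},
    code_blacklist={"11"},
)
print(identifier.identify({"ip": "IP1", "ua": "UA1"}))  # True
print(identifier.identify({"ip": "IP1", "ua": "UA9"}))  # False
```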
In some embodiments, the encoding module 503 is further configured to:
combining the matching results of the dimensions to obtain combined data;
and obtaining the target coding information based on the combined data.
In this embodiment, by performing combination processing on the matching results of each dimension, the target coding information is obtained based on the combined data, so that the target coding information can be fused with the information of each dimension, and further the recognition accuracy can be improved when the anomaly recognition is performed based on the target coding information.
In some embodiments, the encoding module 503 is further configured to:
and carrying out hash operation on the combined data to obtain a hash value, and taking the hash value as the target coding information.
In this embodiment, by performing hash operation on the combined data and using the hash value as the target encoding information, the uniqueness of the target encoding information can be ensured, and encoding conflicts can be avoided.
In some embodiments, the intelligence information of each dimension includes at least one piece of historical identification information; the matching module 502 is further configured to: for a target dimension, the target dimension being any one of the multiple dimensions, if the target identification information of the target dimension is the same as any piece of historical identification information included in the intelligence information of the target dimension, determine that the matching result of the target dimension is a match.
In this embodiment, for each dimension, the target identification information may be matched against the intelligence information of that dimension as a whole; that is, the target identification information is compared with the entries of the intelligence information to determine whether it is the same as any of them, which yields the matching result of that dimension.
In some embodiments, the intelligence information of each dimension includes at least one piece of historical identification information; the matching module 502 is further configured to: for a target dimension, the target dimension being any one of the multiple dimensions, match the target identification information of the target dimension against each piece of historical identification information included in the intelligence information of the target dimension in turn to obtain a matching result for each piece, and obtain the matching result of the target dimension from those per-piece matching results.
In this embodiment, for each dimension, the target identification information may be matched against each entry of the intelligence information of that dimension, and the matching result of that dimension is obtained from the per-entry matching results, which can improve the accuracy of the matching result.
In some embodiments, there are multiple target access request messages; the determining module 504 is further configured to: deduplicate the target coding information generated within a preset duration to obtain deduplicated coding information; and determine the abnormal target access request messages among the multiple target access request messages based on a preset offline identification policy and the deduplicated coding information.
In this embodiment, the target coding information within the preset duration is subjected to the deduplication processing to obtain the coding information after the deduplication processing, and further, the anomaly identification is performed based on the coding information after the deduplication processing, so that the resource waste can be avoided, and the processing efficiency is improved.
In some embodiments, there is a single target access request message; the determining module 504 is further configured to: acquire a preset online identification policy, where the online identification policy is used to identify abnormal coding information online; and if the target coding information satisfies the online identification policy, determine that the target access request message is an abnormal target access request message.
In this embodiment, by performing anomaly identification on a single target access request message based on an online identification policy, online anomaly identification can be implemented, so that an anomaly message can be handled in time.
Fig. 6 is a schematic diagram of a fourth embodiment of the present disclosure, which provides an apparatus 600 for identifying an abnormal message. The apparatus includes: a receiving module 601, a matching module 602, an encoding module 603, and a determining module 604. The apparatus further includes: an acquisition module 605, an identification module 606, and a generation module 607.
The description of the receiving module 601, the matching module 602, the encoding module 603 and the determining module 604 can be found in the above-described related embodiments.
The acquisition module 605 is configured to obtain historical access request messages, where each historical access request message includes historical identification information of multiple dimensions; the identification module 606 is configured to obtain the anomaly identification result of each historical access request message, where the anomaly identification result is either normal or abnormal; the generation module 607 is configured to generate, for each of the multiple dimensions, the intelligence information of that dimension based on the historical identification information of that dimension contained in normal historical access request messages and/or the historical identification information of that dimension contained in abnormal historical access request messages.
In this embodiment, intelligence information of multiple dimensions is generated based on the anomaly identification results of historical access request messages. This provides the basic data for subsequently generating the target coding information, so that the target coding information can be obtained simply and efficiently.
It is to be understood that in the embodiments of the disclosure, the same or similar content in different embodiments may be referred to each other.
It can be understood that "first", "second", etc. in the embodiments of the present disclosure are only used for distinguishing, and do not indicate the importance level, the time sequence, etc.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of users' personal information all comply with relevant laws and regulations and do not violate public order and good morals.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 7 illustrates a schematic block diagram of an example electronic device 700 that may be used to implement embodiments of the present disclosure. The electronic device 700 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, servers, blade servers, mainframes, and other appropriate computers. Electronic device 700 may also represent various forms of mobile apparatuses, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in Fig. 7, the electronic device 700 includes a computing unit 701 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. The RAM 703 may also store various programs and data required for the operation of the electronic device 700. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Various components in the electronic device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, etc.; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, an optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the electronic device 700 to exchange information/data with other devices through a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 701 performs the methods and processes described above, for example, the method of identifying an abnormal message. For example, in some embodiments, the method of identifying an abnormal message may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the method of identifying an abnormal message described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the method of identifying an abnormal message by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable retrieval device such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram block or blocks to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the high management difficulty and weak service scalability of traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system or a server that incorporates a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (19)

1. An anomaly message identification method, comprising:
receiving a target access request message, wherein the target access request message comprises target identification information of a plurality of dimensions;
matching the target identification information of each dimension in the plurality of dimensions with the information of each dimension acquired in advance to obtain a matching result of each dimension; wherein the information is acquired based on an abnormal recognition result of the history access request message;
acquiring target coding information of the target access request message based on the matching result of each dimension;
and acquiring an abnormal identification result of the target access request message based on the target coding information.
2. The method of claim 1, wherein the obtaining the target encoding information of the target access request message based on the matching results of the respective dimensions comprises:
combining the matching results of the dimensions to obtain combined data;
and obtaining the target coding information based on the combined data.
3. The method of claim 2, wherein the obtaining the target encoding information based on the combined data comprises:
and carrying out hash operation on the combined data to obtain a hash value, and taking the hash value as the target coding information.
4. The method of claim 1, wherein,
the information of each dimension comprises at least one history identification information;
the matching processing of the target identification information of each dimension in the plurality of dimensions and the information of each dimension acquired in advance to obtain a matching result of each dimension includes:
for a target dimension, the target dimension is any dimension of the plurality of dimensions, and if the target identification information of the target dimension is the same as any one of the history identification information included in the information of the target dimension, the matching result of the target dimension is determined to be matching.
5. The method of claim 1, wherein,
the information of each dimension comprises at least one history identification information;
the matching processing of the target identification information of each dimension in the plurality of dimensions and the information of each dimension acquired in advance to obtain a matching result of each dimension includes:
for a target dimension, wherein the target dimension is any dimension of the plurality of dimensions, matching processing is sequentially performed on target identification information of the target dimension and each history identification information included in information of the target dimension so as to obtain a matching result of the target identification information and each history identification information, and the matching result of the target dimension is obtained according to the matching result of the target identification information and each history identification information.
6. The method of any of claims 1-5, further comprising:
acquiring a history access request message, wherein the history access request message comprises history identification information of a plurality of dimensions;
obtaining an abnormality identification result of the historical access request message, wherein the abnormality identification result comprises: normal or abnormal;
for each dimension of the plurality of dimensions, generating information of each dimension based on history identification information contained in a normal history access request message of each dimension and/or history identification information contained in an abnormal history access request message of each dimension.
7. The method according to any one of claims 1 to 5, wherein,
the target access request message is a plurality of messages;
the obtaining, based on the target encoding information, an anomaly identification result of the target access request message includes:
performing de-duplication processing on the target coding information within a preset duration to obtain de-duplicated coding information;
and determining an abnormal target access request message among the plurality of target access request messages based on a preset offline identification strategy and the de-duplicated coding information.
8. The method according to any one of claims 1 to 5, wherein,
the target access request message is a single message;
the obtaining, based on the target encoding information, an anomaly identification result of the target access request message includes:
acquiring a preset online identification strategy, wherein the online identification strategy is used for online identification of abnormal coding information;
and if the target coding information accords with the online identification strategy, determining that the target access request message is an abnormal target access request message.
9. An apparatus for identifying an abnormal message, comprising:
a receiving module, configured to receive a target access request message, wherein the target access request message comprises target identification information of a plurality of dimensions;
a matching module, configured to match the target identification information of each dimension of the plurality of dimensions against pre-acquired information of that dimension to obtain a matching result of each dimension, wherein the information is acquired based on an anomaly identification result of a history access request message;
an encoding module, configured to obtain target encoding information of the target access request message based on the matching result of each dimension;
and a determining module, configured to obtain an anomaly identification result of the target access request message based on the target encoding information.
10. The apparatus of claim 9, wherein the encoding module is further configured to:
combine the matching results of the dimensions to obtain combined data;
and obtain the target encoding information based on the combined data.
11. The apparatus of claim 10, wherein the encoding module is further configured to:
perform a hash operation on the combined data to obtain a hash value, and use the hash value as the target encoding information.
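As an illustration of claims 10 and 11, the sketch below combines per-dimension matching results and hashes the combination. The fixed dimension order, the `"1"`/`"0"` representation of a match, and the use of SHA-256 are assumptions chosen for the example; the claims name only "combining" and "a hash operation".

```python
import hashlib


def encode(match_results, dimension_order):
    """Combine per-dimension match results and hash the combination.

    match_results: dict mapping a dimension name to True (match) or
    False (no match). dimension_order fixes the combination order so
    that the same results always give the same encoding.
    """
    combined = "".join("1" if match_results[d] else "0" for d in dimension_order)
    hash_value = hashlib.sha256(combined.encode("utf-8")).hexdigest()
    return combined, hash_value


# Hypothetical dimensions and matching results.
order = ["ip", "user_agent", "referer"]
results = {"ip": True, "user_agent": False, "referer": True}

combined, target_code = encode(results, order)
```

Two messages with the same pattern of matches receive the same encoding, which is what lets later steps de-duplicate or look up encodings instead of comparing raw request fields.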
12. The apparatus of claim 9, wherein,
the information of each dimension comprises at least one piece of history identification information;
the matching module is further configured to:
for a target dimension, the target dimension being any dimension of the plurality of dimensions, if the target identification information of the target dimension is the same as any piece of history identification information included in the information of the target dimension, determine that the matching result of the target dimension is a match.
13. The apparatus of claim 9, wherein,
the information of each dimension comprises at least one piece of history identification information;
the matching module is further configured to:
for a target dimension, the target dimension being any dimension of the plurality of dimensions, match the target identification information of the target dimension against each piece of history identification information included in the information of the target dimension in sequence to obtain a matching result between the target identification information and each piece of history identification information, and obtain the matching result of the target dimension from those matching results.
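The two matching variants of claims 12 and 13 can be contrasted in a short sketch. The per-item rule used for the sequential variant (a prefix comparison) and the aggregation by `any` are hypothetical; the claims leave both open.

```python
def match_any_equal(target, history_values):
    """Claim 12 style: a match if the target equals any history value."""
    return target in history_values


def match_sequential(target, history_values, item_matches, aggregate=any):
    """Claim 13 style: compare against each history value in turn, then
    derive the dimension result from the individual results."""
    per_item = [item_matches(target, h) for h in history_values]
    return aggregate(per_item), per_item


def prefix_rule(target, history_value):
    """Hypothetical per-item rule: the history value is a prefix."""
    return target.startswith(history_value)


# Hypothetical information of an "ip" dimension.
exact_history = ["10.0.0.1", "10.0.0.3"]
prefix_history = ["10.0.", "192.168."]

exact_result = match_any_equal("10.0.0.3", exact_history)
dimension_result, item_results = match_sequential(
    "10.0.3.7", prefix_history, prefix_rule
)
```

Keeping the individual results alongside the aggregate makes it possible to swap in a stricter aggregation, such as `all`, without changing the per-item comparison.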
14. The apparatus of any of claims 9-13, further comprising:
an acquisition module, configured to acquire a history access request message, wherein the history access request message comprises history identification information of a plurality of dimensions;
an identifying module, configured to obtain an anomaly identification result of the history access request message, the anomaly identification result being normal or abnormal;
a generating module, configured to generate, for each dimension of the plurality of dimensions, the information of that dimension based on the history identification information of that dimension contained in normal history access request messages and/or the history identification information of that dimension contained in abnormal history access request messages.
15. The apparatus of any of claims 9-13, wherein,
there are a plurality of target access request messages;
the determining module is further configured to:
perform de-duplication processing on the target encoding information within a preset duration to obtain de-duplicated encoding information;
and determine an abnormal target access request message among the plurality of target access request messages based on a preset offline identification strategy and the de-duplicated encoding information.
16. The apparatus of any of claims 9-13, wherein,
the target access request message is a single message;
the determining module is further configured to:
acquire a preset online identification strategy, wherein the online identification strategy is used for identifying abnormal encoding information online;
and if the target encoding information conforms to the online identification strategy, determine that the target access request message is an abnormal target access request message.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
18. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-8.
19. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-8.
CN202310109891.3A 2023-02-02 2023-02-02 Method, device, equipment and storage medium for identifying abnormal message Pending CN116248371A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310109891.3A CN116248371A (en) 2023-02-02 2023-02-02 Method, device, equipment and storage medium for identifying abnormal message

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310109891.3A CN116248371A (en) 2023-02-02 2023-02-02 Method, device, equipment and storage medium for identifying abnormal message

Publications (1)

Publication Number Publication Date
CN116248371A true CN116248371A (en) 2023-06-09

Family

ID=86630769

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310109891.3A Pending CN116248371A (en) 2023-02-02 2023-02-02 Method, device, equipment and storage medium for identifying abnormal message

Country Status (1)

Country Link
CN (1) CN116248371A (en)

Similar Documents

Publication Publication Date Title
US20220014556A1 (en) Cybersecurity profiling and rating using active and passive external reconnaissance
KR102480204B1 (en) Continuous learning for intrusion detection
US20200389495A1 (en) Secure policy-controlled processing and auditing on regulated data sets
CN113315742B (en) Attack behavior detection method and device and attack detection equipment
US20200067980A1 (en) Increasing security of network resources utilizing virtual honeypots
EP3270317B1 (en) Dynamic security module server device and operating method thereof
WO2019199769A1 (en) Cyber chaff using spatial voting
CN112242984A (en) Method, electronic device and computer program product for detecting abnormal network requests
US20230283641A1 (en) Dynamic cybersecurity scoring using traffic fingerprinting and risk score improvement
CN114338064B (en) Method, device, system, equipment and storage medium for identifying network traffic type
CN110830500B (en) Network attack tracking method and device, electronic equipment and readable storage medium
CN115883187A (en) Method, device, equipment and medium for identifying abnormal information in network traffic data
US11005797B2 (en) Method, system and server for removing alerts
CN114157480A (en) Method, device, equipment and storage medium for determining network attack scheme
US20090193494A1 (en) Managing actions of virtual actors in a virtual environment
CN114726823B (en) Domain name generation method, device and equipment based on generation countermeasure network
CN116248371A (en) Method, device, equipment and storage medium for identifying abnormal message
CN109902831B (en) Service decision processing method and device
US10819683B2 (en) Inspection context caching for deep packet inspection
CN114553550B (en) Request detection method and device, storage medium and electronic equipment
CN116341023B (en) Block chain-based service address verification method, device, equipment and storage medium
CN115664839B (en) Security monitoring method, device, equipment and medium for privacy computing process
CN118057808A (en) System and method for identifying undesired calls
CN117093627A (en) Information mining method, device, electronic equipment and storage medium
CN115774878A (en) Request processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination