CN116861428B - Malicious detection method, device, equipment and medium based on associated files - Google Patents

Info

Publication number
CN116861428B
Authority
CN
China
Prior art keywords
file
malicious
behavior
target
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311131104.1A
Other languages
Chinese (zh)
Other versions
CN116861428A (en
Inventor
田国新
奚广生
白富宽
孙晋超
肖新光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Antiy Network Technology Co Ltd
Original Assignee
Beijing Antiy Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Antiy Network Technology Co Ltd
Priority to CN202311131104.1A
Publication of CN116861428A
Application granted
Publication of CN116861428B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/563 Static detection by source code analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/565 Static detection by checking file integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033 Test or assess software

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a malicious detection method, device, equipment and medium based on associated files, relating to the field of security detection. The method comprises the following steps: acquiring file behavior information of a file to be detected and of each target associated file; determining a target behavior vector corresponding to the file to be detected and an associated behavior vector corresponding to each target associated file; determining a fusion behavior vector; inputting the target behavior vector and each associated behavior vector, each together with the fusion behavior vector, into a target model to obtain a corresponding target file identifier and each associated file identifier; and, if the target file identifier or an associated file identifier is a malicious file identifier, determining the file to be detected or the corresponding target associated file to be a malicious file. By detecting the file behaviors of the file to be detected and its target associated files and combining them into a fusion behavior vector, the method judges whether the file to be detected and each target associated file are malicious files, which improves security.

Description

Malicious detection method, device, equipment and medium based on associated files
Technical Field
The present invention relates to the field of security detection, and in particular, to a method, apparatus, device, and medium for detecting malicious files based on associated files.
Background
Existing malicious file detection methods determine whether a file is malicious by checking file attributes, digital signatures, system detection and similar means. Detection based on file attributes has poor accuracy and is prone to false detection. General system detection can only detect a single malicious type or a combination of several malicious types, which is limiting, and it misses the case in which a source file generates an associated file and the associated file steals the information. Current security detection methods release a source file once no malicious information is found in it, but the released source file can keep generating associated files that steal information, so system security is low.
Disclosure of Invention
In view of this, the invention provides a malicious detection method, device, equipment and medium based on associated files, which at least partially solves the technical problem in the prior art that after a source file is released, the source file can continuously generate associated files to steal information, so that the system security is lower, and adopts the technical scheme that:
According to one aspect of the present application, there is provided a malicious detection method based on an associated file, applied to a file detection system, the malicious detection method based on the associated file comprising the steps of:
in response to receiving a file to be detected, if a plurality of target associated files having an association relation with the file to be detected are detected within a first preset time period, acquiring a plurality of file behavior information of the file to be detected and of each target associated file performed within the first preset time period;
according to the file behavior information, determining a target behavior vector corresponding to the file to be detected and an associated behavior vector corresponding to each target associated file;
determining a fusion behavior vector according to the target behavior vector and each associated behavior vector;
inputting the target behavior vector and the fusion behavior vector into a target model to obtain a corresponding target file identifier; the target model is obtained by training according to file behavior information of a malicious sample file;
if the target file identifier is a malicious file identifier, determining the file to be detected as a malicious file;
inputting each associated behavior vector and the fusion behavior vector into a target model respectively to obtain a corresponding associated file identifier;
If the associated file identifier is a malicious file identifier, the corresponding target associated file is determined to be a malicious file.
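The flow of the steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function `detect`, the stand-in `toy_model`, the identifier strings and the three-feature layout (download, encrypt, steal information) are all hypothetical, and the trained target model is replaced by a hand-written rule.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Vector = List[int]
# Hypothetical model interface: (behavior vector, fusion behavior vector) -> identifier.
Model = Callable[[Sequence[int], Sequence[int]], str]


def detect(
    target_vec: Vector,
    assoc_vecs: Dict[str, Vector],
    fused_vec: Vector,
    model: Model,
    malicious_ids: Set[str],
) -> Tuple[bool, List[str]]:
    """Return (is the file to be detected malicious, names of malicious associated files)."""
    # The target behavior vector and the fusion behavior vector are fed to the model together.
    target_is_malicious = model(target_vec, fused_vec) in malicious_ids
    # Each associated behavior vector is fed to the same model with the same fusion vector.
    flagged = [
        name for name, vec in assoc_vecs.items()
        if model(vec, fused_vec) in malicious_ids
    ]
    return target_is_malicious, flagged


def toy_model(vec: Sequence[int], fused: Sequence[int]) -> str:
    """Hand-written stand-in for the trained target model (assumption)."""
    # Assumed feature layout: index 0 = download, 1 = encrypt, 2 = steal information.
    if vec[2] == 1:
        return "stealer"
    if vec[0] == 1 and fused[2] == 1:
        return "dropper"  # only downloads, but something it produced steals information
    return "benign"


result = detect(
    target_vec=[1, 0, 0],
    assoc_vecs={"second.bin": [0, 0, 1]},
    fused_vec=[1, 0, 1],
    model=toy_model,
    malicious_ids={"stealer", "dropper"},
)
```

In this toy case the file to be detected only downloads, yet the fusion behavior vector lets the stand-in model flag it as well as the file it produced.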
In an exemplary embodiment of the present application, obtaining a plurality of file behavior information of a file to be detected and each target associated file performed in a first preset time period includes:
after a first preset time period $T_1$ ends, acquiring a plurality of file behavior information of the file to be detected to obtain a first file behavior information set $Q = (Q_1, Q_2, \ldots, Q_i, \ldots, Q_n)$; wherein $i = 1, 2, \ldots, n$; $n$ is the number of file behavior information performed by the file to be detected within $T_1$; $Q_i$ is the $i$-th file behavior information performed by the file to be detected within $T_1$; $T_1 = [t_{11}, t_{12}]$; $t_{11} < t_{12}$; $t_{11}$ is the start time of $T_1$; $t_{12}$ is the end time of $T_1$;
if $Q$ includes association behavior information, determining the associated file corresponding to the association behavior information as a target associated file;
acquiring a plurality of file behavior information performed by the plurality of target associated files within $T_1$ to obtain a second file behavior information set $R = (R_1, R_2, \ldots, R_g, \ldots, R_h)$; $R_g = (R_{g1}, R_{g2}, \ldots, R_{gk}, \ldots, R_{gf(g)})$; wherein $g = 1, 2, \ldots, h$; $k = 1, 2, \ldots, f(g)$; $h$ is the number of target associated files; $f(g)$ is the number of file behavior information performed by the $g$-th target associated file within $T_1$; $R_g$ is the file behavior information list corresponding to the $g$-th target associated file; $R_{gk}$ is the $k$-th file behavior information performed by the $g$-th target associated file within $T_1$.
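As an illustration of how Q and R might be assembled, the sketch below filters a monitored event log by the window T1 and by the acting file. The `Event` record, the `collect` function and the set of association behaviors are hypothetical names chosen for this example; the source does not specify a log format.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    time: float          # when the behavior was observed
    actor: str           # file that performed the behavior
    behavior: str        # e.g. "download", "encrypt", "steal_info"
    produced: str = ""   # file generated by an association behavior, if any


# Assumed set of behaviors that create an association relation.
ASSOCIATION_BEHAVIORS = {"download", "release", "trigger"}


def collect(events: List[Event], target: str, t11: float, t12: float
            ) -> Tuple[List[str], Dict[str, List[str]]]:
    """Build Q (behaviors of the file to be detected) and R (behaviors per associated file)."""
    window = [e for e in events if t11 <= e.time <= t12]
    Q = [e.behavior for e in window if e.actor == target]
    # Files produced by association behaviors of the target become target associated files.
    associated = [
        e.produced for e in window
        if e.actor == target and e.behavior in ASSOCIATION_BEHAVIORS and e.produced
    ]
    R = {name: [e.behavior for e in window if e.actor == name] for name in associated}
    return Q, R


events = [
    Event(1, "a.exe", "self_start"),
    Event(2, "a.exe", "download", "b.bin"),
    Event(3, "b.bin", "steal_info"),
    Event(12, "b.bin", "encrypt"),   # outside the window, ignored
]
Q, R = collect(events, "a.exe", t11=0, t12=10)
```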
In one exemplary embodiment of the application, the target behavior vector and the associated behavior vector are determined by:
obtaining, according to $b$ pieces of target malicious behavior information, a first preset behavior feature vector $E = (E_1, E_2, \ldots, E_a, \ldots, E_b)$ and $h$ third preset behavior feature vectors $M_1, M_2, \ldots, M_g, \ldots, M_h$; $M_g = (M_{g1}, M_{g2}, \ldots, M_{ga}, \ldots, M_{gb})$; wherein $a = 1, 2, \ldots, b$; $E_a$ is the behavior feature corresponding to the $a$-th target malicious behavior information in $E$; $M_g$ is the third preset behavior feature vector corresponding to the $g$-th target associated file; $M_{ga}$ is the behavior feature corresponding to the $a$-th target malicious behavior information of the $g$-th target associated file; the target malicious behavior information corresponding to $M_{ga}$ is the same as that corresponding to $E_a$;
traversing $E$; if the target malicious behavior information corresponding to $E_a$ exists in $Q$, setting $E_a$ to 1; otherwise, setting $E_a$ to 0;
determining $E$ as the target behavior vector corresponding to the file to be detected;
traversing $M_g$; if the target malicious behavior information corresponding to $M_{ga}$ exists in $R_g$, setting $M_{ga}$ to 1; otherwise, setting $M_{ga}$ to 0;
determining $M_g$ as the associated behavior vector corresponding to the $g$-th target associated file.
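The encoding above amounts to a fixed-length presence vector over the list of target malicious behaviors. A minimal sketch, assuming behaviors are represented as strings and using a hypothetical helper `encode` and made-up behavior names:

```python
from typing import Dict, List


def encode(observed_behaviors: List[str], target_malicious: List[str]) -> List[int]:
    """Position a is 1 if the a-th target malicious behavior was observed, else 0."""
    observed = set(observed_behaviors)
    return [1 if behavior in observed else 0 for behavior in target_malicious]


# Assumed list of b = 3 target malicious behaviors; the same ordering is used for every vector.
target_malicious = ["download", "encrypt", "steal_info"]

Q = ["self_start", "download"]                      # behaviors of the file to be detected
R: Dict[str, List[str]] = {"b.bin": ["steal_info", "steal_info"]}

E = encode(Q, target_malicious)                     # target behavior vector
M = {name: encode(behaviors, target_malicious) for name, behaviors in R.items()}
```

Because every vector shares the same ordering, position a always refers to the same target malicious behavior, which is what makes the later element-wise fusion possible.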
In one exemplary embodiment of the application, the target malicious behavior information is determined by:
acquiring a plurality of file behavior information performed by $m$ malicious sample files within a second preset time period $T_2 = [t_{21}, t_{22}]$ to obtain a sample file behavior information set $F = (F_1, F_2, \ldots, F_j, \ldots, F_m)$; $F_j = (F_{j1}, F_{j2}, \ldots, F_{jd}, \ldots, F_{jf(j)})$; wherein $j = 1, 2, \ldots, m$; $d = 1, 2, \ldots, f(j)$; $f(j)$ is the number of file behavior information performed by the $j$-th malicious sample file within $T_2$; $F_j$ is the file behavior information list of the $j$-th malicious sample file; $F_{jd}$ is the $d$-th file behavior information performed by the $j$-th malicious sample file within $T_2$; $t_{21} < t_{22} < t_{11}$; $(t_{22} - t_{21}) = (t_{12} - t_{11})$; $t_{21}$ is the start time of $T_2$; $t_{22}$ is the end time of $T_2$;
performing deduplication processing on $F$ to obtain $b$ pieces of target malicious behavior information.
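The deduplication step can be sketched as flattening the per-sample behavior lists and keeping each distinct behavior once. Preserving first-seen order is an assumption made here so that feature positions are reproducible; the source only requires that duplicates be removed. The function name and sample data are hypothetical.

```python
from typing import List


def deduplicate_behaviors(F: List[List[str]]) -> List[str]:
    """Flatten the sample behavior lists and drop duplicates, keeping first-seen order."""
    # dict preserves insertion order, so dict.fromkeys acts as an ordered set.
    return list(dict.fromkeys(behavior for sample in F for behavior in sample))


F = [
    ["download", "steal_info"],   # behaviors of malicious sample 1 within T2
    ["encrypt", "download"],      # behaviors of malicious sample 2 within T2
]
target_malicious = deduplicate_behaviors(F)
b = len(target_malicious)
```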
In an exemplary embodiment of the application, the fused behavior vector is determined by:
traversing $E, M_1, M_2, \ldots, M_g, \ldots, M_h$; if $\left(E_a + \sum_{g=1}^{h} M_{ga}\right) \geq 1$, setting $S_a$ to 1; otherwise, setting $S_a$ to 0; so as to obtain a fusion behavior vector $S = (S_1, S_2, \ldots, S_a, \ldots, S_b)$; wherein $S_a$ is the behavior feature corresponding to the $a$-th target malicious behavior information in $S$; the target malicious behavior information corresponding to $S_a$ is the same as that corresponding to $E_a$.
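Since every entry is 0 or 1, the rule above is an element-wise logical OR across the target behavior vector and all associated behavior vectors. A short sketch that follows the stated sum-and-threshold form directly (the function name `fuse` is hypothetical):

```python
from typing import List


def fuse(E: List[int], Ms: List[List[int]]) -> List[int]:
    """S_a = 1 if E_a plus the sum of M_ga over all associated files is at least 1, else 0."""
    if any(len(M) != len(E) for M in Ms):
        raise ValueError("all behavior vectors must have the same length b")
    return [
        1 if E[a] + sum(M[a] for M in Ms) >= 1 else 0
        for a in range(len(E))
    ]


S = fuse([1, 0, 0], [[0, 0, 1], [0, 0, 0]])
```

A file that only downloads, combined with an associated file that steals information, therefore yields a fusion vector showing both behaviors.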
In an exemplary embodiment of the application, the object model is determined by:
obtaining, according to $F$, $m$ second preset behavior feature vectors $G_1, G_2, \ldots, G_j, \ldots, G_m$; $G_j = (G_{j1}, G_{j2}, \ldots, G_{ja}, \ldots, G_{jb})$; wherein $G_j$ is the second preset behavior feature vector corresponding to the $j$-th malicious sample file; $G_{ja}$ is the behavior feature corresponding to the $a$-th target malicious behavior information of the $j$-th malicious sample file;
traversing $G_j$; if the target malicious behavior information corresponding to $G_{ja}$ exists in $F_j$, setting $G_{ja}$ to 1; otherwise, setting $G_{ja}$ to 0;
determining $G_j$ as the malicious behavior vector of the $j$-th malicious sample file;
acquiring the malicious behavior type identifiers corresponding to the $m$ malicious sample files to obtain a malicious behavior type identifier set $H = (H_1, H_2, \ldots, H_j, \ldots, H_m)$; wherein $H_j$ is the malicious behavior type identifier corresponding to the $j$-th malicious sample file;
determining the e malicious behavior type identifiers obtained by performing deduplication processing on $H$ as malicious file identifiers;
grouping the $m$ malicious sample files according to each malicious file identifier to determine D malicious file identification groups; the malicious behavior type identifiers corresponding to the malicious sample files within each malicious file identification group are the same;
obtaining a sample fusion behavior vector corresponding to each malicious file identification group according to the malicious behavior vectors corresponding to the malicious sample files in that group;
inputting $G_j$, the sample fusion behavior vector corresponding to the malicious file identification group containing the $j$-th malicious sample file, and the malicious file identifier corresponding to the $j$-th malicious sample file into a preset model for training to obtain the target model.
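The training inputs described above can be sketched as one (behavior vector, group fusion vector, identifier) triple per sample. Two points are assumptions of this sketch rather than statements of the source: the sample fusion behavior vector of a group is computed as the element-wise OR of the group's vectors, mirroring the fusion rule used at detection time, and the preset model itself is left out because the passage does not say what kind of model it is. The function name `build_training_rows` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[List[int], List[int], str]


def build_training_rows(G: List[List[int]], H: List[str]) -> List[Row]:
    """Pair each malicious behavior vector with its group's fusion vector and its identifier."""
    # Group the sample vectors by malicious behavior type identifier.
    groups: Dict[str, List[List[int]]] = defaultdict(list)
    for vector, identifier in zip(G, H):
        groups[identifier].append(vector)
    # Assumption: the sample fusion vector is the element-wise OR over the group.
    fused = {
        identifier: [1 if any(column) else 0 for column in zip(*vectors)]
        for identifier, vectors in groups.items()
    }
    return [(vector, fused[identifier], identifier) for vector, identifier in zip(G, H)]


G = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]     # malicious behavior vectors of three samples
H = ["stealer", "stealer", "ransom"]      # made-up malicious behavior type identifiers
rows = build_training_rows(G, H)
```

Each row could then be passed to whatever preset model is chosen, for example by concatenating the first two elements into one feature vector.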
In an exemplary embodiment of the present application, acquiring a plurality of file behavior information of a file to be detected and each target associated file performed within a first preset time period, further includes:
acquiring a plurality of file characteristics of a file to be detected and each target associated file;
detecting the characteristics of each file to obtain a corresponding detection result;
if, among the plurality of detection results, there is a detection result indicating that the corresponding file to be detected or target associated file is not a malicious file, acquiring a plurality of file behavior information of the file to be detected and of each target associated file performed within the first preset time period.
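Read literally, this embodiment places a file-feature check in front of behavior monitoring: behavior information is collected when at least one feature-based result says "not malicious". The sketch below encodes only that gate; the function name and the boolean result format are assumptions, and the feature detection itself is not modelled.

```python
from typing import Dict


def should_monitor_behavior(detection_results: Dict[str, bool]) -> bool:
    """detection_results maps file name -> True if feature detection judged it malicious.

    Returns True when at least one file was judged not malicious, in which case
    file behavior information is collected within the first preset time period.
    """
    return any(not is_malicious for is_malicious in detection_results.values())


proceed = should_monitor_behavior({"a.exe": False, "b.bin": True})
```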
According to one aspect of the present application, there is provided a malicious detection apparatus based on an associated file, including:
the file behavior acquisition module is used for acquiring a plurality of file behavior information of the file to be detected and each target associated file in a first preset time period if a plurality of target associated files with association relation with the file to be detected are detected in the first preset time period when the file to be detected is received;
the first vector determining module is used for determining a target behavior vector corresponding to the file to be detected and an associated behavior vector corresponding to each target associated file according to the file behavior information;
The second vector determining module is used for determining a fusion behavior vector according to the target behavior vector and each associated behavior vector;
the first identification determining module is used for inputting the target behavior vector and the fusion behavior vector into the target model to obtain a corresponding target file identification; the target model is obtained by training according to file behavior information of a malicious sample file;
the first malicious judgment module is used for determining the file to be detected as a malicious file when the target file identifier is a malicious file identifier;
the second identification determining module is used for inputting each associated behavior vector and the fusion behavior vector into the target model respectively to obtain a corresponding associated file identification;
and the second malicious judgment module is used for determining the corresponding target associated file as a malicious file when the associated file identifier is a malicious file identifier.
According to one aspect of the present application, there is provided a non-transitory computer readable storage medium having stored therein at least one instruction or at least one program loaded and executed by a processor to implement the aforementioned associated file-based malicious detection method.
According to one aspect of the present application, there is provided an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
The application has at least the following beneficial effects:
After the target associated files corresponding to a file to be detected are detected, a target behavior vector corresponding to the file to be detected and an associated behavior vector corresponding to each target associated file are determined according to the file behavior information of the file to be detected and of the target associated files, and a fusion behavior vector is determined according to the target behavior vector and each associated behavior vector. The target behavior vector and each associated behavior vector are then each input into the target model together with the fusion behavior vector to obtain a corresponding target file identifier and each associated file identifier. If the target file identifier or an associated file identifier is a malicious file identifier, the file to be detected or the corresponding target associated file is determined to be a malicious file. Because the file behaviors of the file to be detected and of its target associated files are detected and combined into the fusion behavior vector before judging whether each file is malicious, security is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a malicious detection method based on an associated file according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for determining a target behavior vector and an associated behavior vector according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for determining a target model according to an embodiment of the present invention;
FIG. 4 is a block diagram of a malicious detection device based on an associated file according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
A malicious detection method based on an associated file is applied to a file detection system, and the file detection system is used for carrying out malicious detection on a file to be detected and a target associated file and detecting whether the file to be detected and the target associated file are malicious files or not.
As shown in fig. 1, the malicious detection method based on the association file includes the following steps:
Step S100, in response to receiving a file to be detected, if a plurality of target associated files having an association relation with the file to be detected are detected within a first preset time period, acquiring a plurality of file behavior information of the file to be detected and of each target associated file performed within the first preset time period;
The file to be detected is a file that the file detection system has received and that has not yet undergone malicious detection. After receiving the file to be detected, the file detection system acquires a plurality of file behavior information of the file and performs malicious detection on it by examining that information. Each piece of file behavior information corresponds to one file behavior; file behaviors include self-starting, registry generation, scanning, encryption, information stealing and other behaviors. The file behaviors of the file to be detected include normal file behaviors and abnormal file behaviors, where an abnormal file behavior is a behavior that steals user information or system information. By examining all file behaviors of the file to be detected within the first preset time period, the system comprehensively judges whether the file has executed malicious behavior, and hence whether it is a malicious file.
A target associated file is a file that has an association relation with the file to be detected, where the association relation is a relation such as downloading, releasing or triggering. When the file to be detected executes behaviors such as downloading, releasing or triggering within $T_1$, corresponding downloaded files, released files and triggered files are generated, and these generated files are determined to be target associated files. Some current malicious files steal information through associated files. For example, a first file does not itself execute malicious behaviors such as information stealing, but after it enters the server system it executes a download behavior and generates a corresponding second file, and the second file executes the information stealing. Because the first file only executes a download behavior, which is not in itself malicious, current security detection methods cannot intercept or detect it, and they release it after finding no malicious information in it. Therefore, the files generated by the file to be detected are also examined, and corresponding malicious detection is performed on each target associated file of the file to be detected.
Further, in step S100, obtaining a plurality of file behavior information of the file to be detected and each target associated file performed within a first preset time period includes:
Step S110, after a first preset time period $T_1$ ends, acquiring a plurality of file behavior information of the file to be detected to obtain a first file behavior information set $Q = (Q_1, Q_2, \ldots, Q_i, \ldots, Q_n)$; wherein $i = 1, 2, \ldots, n$; $n$ is the number of file behavior information performed by the file to be detected within $T_1$; $Q_i$ is the $i$-th file behavior information performed by the file to be detected within $T_1$; $T_1 = [t_{11}, t_{12}]$; $t_{11} < t_{12}$; $t_{11}$ is the start time of $T_1$; $t_{12}$ is the end time of $T_1$;
The first preset time period is a time period after the file detection system receives the file to be detected; that is, $t_{11}$ may be the time at which the file detection system receives the file to be detected, or a time set by the file detection system. Behavior monitoring may be performed in the server system, or the file may be placed in a sandbox for behavior monitoring. If the size of the file to be detected is smaller than a preset file size value, the file can execute relatively few types of behavior, and behavior monitoring may be performed directly in the server system. If the size is greater than or equal to the preset file size value, the file can execute more types of behavior, and for safety it is placed in a sandbox and its behavior is monitored there, so that the server system is not damaged even if the file executes malicious behavior. Malicious detection is performed in the sandbox, and if the file to be detected is not a malicious file, it is moved from the sandbox to the server system, which ensures the information security of the server system.
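The size-based choice of monitoring environment described above can be sketched as a single comparison. The function name, the string labels and the example threshold are hypothetical; the source gives no concrete preset file size value.

```python
def choose_environment(file_size: int, preset_size: int) -> str:
    """Smaller files are monitored in the server system; all others go to a sandbox."""
    return "server" if file_size < preset_size else "sandbox"


# Example threshold chosen purely for illustration.
PRESET_SIZE = 1_000_000  # bytes

small = choose_environment(20_000, PRESET_SIZE)
large = choose_environment(1_000_000, PRESET_SIZE)
```

A file monitored in the sandbox would be moved to the server system only after detection finds it is not malicious.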
Step S120, if $Q$ includes association behavior information, determining the associated file corresponding to the association behavior information as a target associated file;
Step S130, acquiring a plurality of file behavior information performed by the plurality of target associated files within $T_1$ to obtain a second file behavior information set $R = (R_1, R_2, \ldots, R_g, \ldots, R_h)$; $R_g = (R_{g1}, R_{g2}, \ldots, R_{gk}, \ldots, R_{gf(g)})$; wherein $g = 1, 2, \ldots, h$; $k = 1, 2, \ldots, f(g)$; $h$ is the number of target associated files; $f(g)$ is the number of file behavior information performed by the $g$-th target associated file within $T_1$; $R_g$ is the file behavior information list corresponding to the $g$-th target associated file; $R_{gk}$ is the $k$-th file behavior information performed by the $g$-th target associated file within $T_1$.
The file to be detected and the target associated file correspond to a plurality of target malicious behavior information, wherein the target malicious behavior information is information corresponding to the malicious behavior which is known at present or collected through a malicious sample file, and the malicious behavior is abnormal file behavior.
Step S200, determining a target behavior vector corresponding to the file to be detected and an associated behavior vector corresponding to each target associated file according to the file behavior information;
A corresponding associated behavior vector is determined according to the file behavior information of each target associated file; the associated behavior vector shows which file behaviors the corresponding target associated file executed within $T_1$.
Wherein, as shown in fig. 2, the target behavior vector and the associated behavior vector are determined by:
Step S210, obtaining, according to $b$ pieces of target malicious behavior information, a first preset behavior feature vector $E = (E_1, E_2, \ldots, E_a, \ldots, E_b)$ and $h$ third preset behavior feature vectors $M_1, M_2, \ldots, M_g, \ldots, M_h$; $M_g = (M_{g1}, M_{g2}, \ldots, M_{ga}, \ldots, M_{gb})$; wherein $a = 1, 2, \ldots, b$; $E_a$ is the behavior feature corresponding to the $a$-th target malicious behavior information in $E$; $M_g$ is the third preset behavior feature vector corresponding to the $g$-th target associated file; $M_{ga}$ is the behavior feature corresponding to the $a$-th target malicious behavior information of the $g$-th target associated file; the target malicious behavior information corresponding to $M_{ga}$ is the same as that corresponding to $E_a$;
the files to be detected are corresponding to first preset behavior feature vectors, each target associated file is corresponding to a third preset behavior feature vector, the first preset behavior feature vector and the third preset behavior feature vector are preset feature vectors, the number of features contained in the first preset behavior feature vector and the number of features contained in each third preset behavior feature vector are the same, each feature is corresponding to target malicious behavior information, the target malicious behavior information corresponding to the same feature position of different third preset behavior feature vectors are the same, and if all the third preset behavior feature vectors and the first features of the first preset behavior feature vectors represent the same target malicious behavior information, the subsequent processing of the vectors is facilitated.
Step S220: traverse $E$; if the target malicious behavior information corresponding to $E_a$ exists in $Q$, determine $E_a$ to be 1; otherwise, determine $E_a$ to be 0;
The target behavior vector indicates whether the file behaviors of the file to be detected include each target malicious behavior: if so, the corresponding behavior feature in the first preset behavior feature vector is determined to be 1; if not, it is determined to be 0.
Step S230, determining E as a target behavior vector corresponding to the file to be detected;
Step S240: traverse $M_g$; if the target malicious behavior information corresponding to $M_{ga}$ exists in $R_g$, determine $M_{ga}$ to be 1; otherwise, determine $M_{ga}$ to be 0;
After the h third preset behavior feature vectors are preset, each piece of target malicious behavior information is compared with the file behaviors performed by each target associated file. If the file behaviors performed by a target associated file include a target malicious behavior, for example an information stealing behavior, the behavior feature corresponding to that behavior in the third preset behavior feature vector of that target associated file is determined to be 1; otherwise, the corresponding behavior feature is determined to be 0.
Step S250: determine $M_g$ as the associated behavior vector corresponding to the g-th target associated file.
In each associated behavior vector, a behavior feature of 1 indicates that the corresponding target associated file executed the corresponding target malicious behavior within $T_1$, and a behavior feature of 0 indicates that it did not. Whether each target associated file has executed each target malicious behavior can therefore be determined by inspecting its associated behavior vector.
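Steps S210 to S250 can be illustrated with a minimal Python sketch. The behavior labels, the function name `behavior_vector`, and the sample contents of `Q` and `R` are hypothetical and chosen only for illustration; the specification does not prescribe how behaviors are encoded.

```python
from typing import Iterable, List, Sequence


def behavior_vector(target_behaviors: Sequence[str],
                    observed_behaviors: Iterable[str]) -> List[int]:
    """Return a 0/1 vector aligned with target_behaviors.

    Position a is 1 if the a-th target malicious behavior appears among the
    behaviors observed for the file, and 0 otherwise.
    """
    observed = set(observed_behaviors)
    return [1 if behavior in observed else 0 for behavior in target_behaviors]


# Hypothetical labels for the b target malicious behaviors (here b = 5).
TARGET_BEHAVIORS = ["self_start", "registry_generation", "scanning",
                    "encryption", "information_stealing"]

# Assumed observations within T_1: Q for the file to be detected,
# R[g] for the g-th target associated file.
Q = ["scanning", "self_start"]
R = [["encryption"], ["information_stealing", "scanning"]]

E = behavior_vector(TARGET_BEHAVIORS, Q)                   # steps S220, S230
M = [behavior_vector(TARGET_BEHAVIORS, R_g) for R_g in R]  # steps S240, S250
```

Because every vector is built against the same ordered list of target behaviors, position a refers to the same behavior in `E` and in each `M[g]`, which is the alignment property the text relies on.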
The target malicious behavior information is determined through the following steps:
Step S211: obtain the plurality of file behavior information performed by m malicious sample files within a second preset time period $T_2 = [t_{21}, t_{22}]$, to obtain a sample file behavior information set $F = (F_1, F_2, \ldots, F_j, \ldots, F_m)$; $F_j = (F_{j1}, F_{j2}, \ldots, F_{jd}, \ldots, F_{jf(j)})$; where $j = 1, 2, \ldots, m$; $d = 1, 2, \ldots, f(j)$; $f(j)$ is the number of pieces of file behavior information performed by the j-th malicious sample file within $T_2$; $F_j$ is the file behavior information list corresponding to the j-th malicious sample file; $F_{jd}$ is the d-th piece of file behavior information performed by the j-th malicious sample file within $T_2$; $t_{21} < t_{22} < t_{11}$; $(t_{22} - t_{21}) = (t_{12} - t_{11})$; $t_{21}$ is the start time of $T_2$; $t_{22}$ is the end time of $T_2$;
Each piece of target malicious behavior information corresponds to one target malicious behavior, and the target malicious behaviors are determined from malicious sample files. A malicious sample file is a known malicious file, a malicious file from a certain statistical period, or a historical malicious file stored in a server database. The file behaviors of the m malicious sample files within $T_2$ are obtained, where $T_2$ is a historical time period. Because different malicious sample files may execute the same file behavior, all the obtained file behaviors are deduplicated.
Step S212: perform deduplication on $F$ to obtain b pieces of target malicious behavior information.
Deduplicating the file behaviors of all the malicious sample files yields b file behaviors, which are determined to be the target malicious behaviors; the corresponding information is the target malicious behavior information.
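The deduplication of steps S211 and S212 can be sketched as follows. The function name and the sample data are hypothetical, and keeping first-seen order is an assumption made here so that feature positions stay stable; the specification only requires that duplicates be removed.

```python
from typing import List, Sequence


def deduplicate_behaviors(F: Sequence[Sequence[str]]) -> List[str]:
    """Flatten the per-sample behavior lists F_1..F_m and drop duplicates.

    First-seen order is preserved (an assumption of this sketch) so that the
    a-th entry can serve as the a-th target malicious behavior.
    """
    seen = set()
    targets: List[str] = []
    for F_j in F:
        for behavior in F_j:
            if behavior not in seen:
                seen.add(behavior)
                targets.append(behavior)
    return targets


# Hypothetical behaviors of m = 3 malicious sample files within T_2.
F = [["self_start", "scanning"],
     ["scanning", "encryption"],
     ["self_start", "information_stealing"]]

target_malicious_behaviors = deduplicate_behaviors(F)
b = len(target_malicious_behaviors)
```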
Step S300, determining a fusion behavior vector according to the target behavior vector and each associated behavior vector;
the fused behavior vector is a vector obtained according to the target behavior vector and all the associated behavior vectors, and represents the behavior executed by the file to be detected and the target associated file together.
Wherein the fusion behavior vector is determined by:
Step S310: traverse $E, M_1, M_2, \ldots, M_g, \ldots, M_h$; if $\left(E_a + \sum_{g=1}^{h} M_{ga}\right) \geq 1$, determine $S_a$ to be 1; otherwise, determine $S_a$ to be 0; thereby obtaining the fusion behavior vector $S = (S_1, S_2, \ldots, S_a, \ldots, S_b)$; where $S_a$ is the behavior feature corresponding to the a-th target malicious behavior information in $S$; the target malicious behavior information corresponding to $S_a$ is the same as that corresponding to $E_a$.
Because the behavior feature at a given position represents the same file behavior in the target behavior vector and in every associated behavior vector, the behavior features at the same position of the target behavior vector and of each associated behavior vector are added. If the sum is greater than or equal to 1, the file to be detected or at least one target associated file has executed the file behavior corresponding to that position; the behavior feature at that position of the fusion behavior vector is determined to be 1, meaning that the fused file corresponding to the fusion behavior vector has executed the corresponding file behavior. Otherwise, if the sum is 0, neither the file to be detected nor any target associated file has executed the file behavior corresponding to that position; the behavior feature at that position of the fusion behavior vector is determined to be 0, meaning that the fused file has not executed the corresponding file behavior.
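The fusion rule of step S310 amounts to an element-wise logical OR over the target behavior vector and all associated behavior vectors. A minimal sketch, with a hypothetical function name and example vectors, could look like this:

```python
from typing import List, Sequence


def fuse_behavior_vectors(E: Sequence[int],
                          M: Sequence[Sequence[int]]) -> List[int]:
    """Compute S where S_a = 1 if E_a + sum_g M_ga >= 1, else 0."""
    if any(len(M_g) != len(E) for M_g in M):
        raise ValueError("all behavior vectors must have the same length b")
    return [1 if E[a] + sum(M_g[a] for M_g in M) >= 1 else 0
            for a in range(len(E))]


# Example vectors over b = 5 target malicious behaviors (illustrative only).
E = [1, 0, 1, 0, 0]
M = [[0, 0, 0, 1, 0],
     [0, 0, 1, 0, 1]]

S = fuse_behavior_vectors(E, M)
```

Position 1 of `S` stays 0 because neither the file to be detected nor any associated file executed that behavior; every other position becomes 1.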
Step S400, inputting the target behavior vector and the fusion behavior vector into a target model to obtain a corresponding target file identifier; the target model is obtained by training according to file behavior information of a malicious sample file;
The target model is trained on the malicious behaviors of malicious sample files. The fusion behavior vector and the target behavior vector are input into the target model, which outputs the corresponding target file identifier; whether the file to be detected is a malicious file is determined by checking the target file identifier. The target file identifier represents both the malicious behavior and the attack type: the identifier corresponding to the fusion behavior vector represents the attack type, and the identifier corresponding to the target behavior vector represents the malicious behavior. From the target file identifier, it is therefore possible to determine both the type of behavior that the file to be detected performs among the target associated files and the type of attack carried out when the file to be detected is combined with the target associated files.
Wherein, as shown in fig. 3, the target model is determined by:
Step S401: according to $F$, obtain m second preset behavior feature vectors $G_1, G_2, \ldots, G_j, \ldots, G_m$; $G_j = (G_{j1}, G_{j2}, \ldots, G_{ja}, \ldots, G_{jb})$; where $G_j$ is the second preset behavior feature vector corresponding to the j-th malicious sample file; $G_{ja}$ is the behavior feature corresponding to the a-th target malicious behavior information for the j-th malicious sample file;
A second preset behavior feature vector is preset for each malicious sample file according to the file behaviors of the malicious sample files.
Step S402: traverse $G_j$; if the target malicious behavior information corresponding to $G_{ja}$ exists in $F_j$, determine $G_{ja}$ to be 1; otherwise, determine $G_{ja}$ to be 0;
If the file behaviors of a malicious sample file include a target malicious behavior, the corresponding behavior feature in its second preset behavior feature vector is determined to be 1; otherwise, it is determined to be 0.
Step S403: determine $G_j$ as the malicious behavior vector of the j-th malicious sample file;
Step S404: obtain the malicious behavior type identifiers corresponding to the m malicious sample files, to obtain a malicious behavior type identifier set $H = (H_1, H_2, \ldots, H_j, \ldots, H_m)$; where $H_j$ is the malicious behavior type identifier corresponding to the j-th malicious sample file;
each malicious sample file corresponds to a malicious behavior type identifier, the malicious behavior type identifier represents the identifier of the malicious behavior type performed by the corresponding malicious sample file, and the malicious behavior type is a malicious attack type and represents the attack means of the corresponding malicious sample file.
Step S405, determining e malicious behavior type identifiers obtained after the duplication removal processing of the H as malicious file identifiers;
Different malicious sample files may share the same malicious behavior type identifier, so deduplication is required; the malicious behavior type identifiers remaining after deduplication are determined to be the malicious file identifiers.
Step S406, grouping m malicious sample files according to each malicious file identifier, and determining D malicious file identifier groups; the malicious behavior type identifiers corresponding to a plurality of malicious sample files in each malicious file identification group are the same;
grouping a plurality of malicious sample files according to malicious file identifications.
Step S407, obtaining a sample fusion behavior vector corresponding to each malicious file identification group according to malicious behavior vectors corresponding to a plurality of malicious sample files in each malicious file identification group;
Step S408: input $G_j$, the sample fusion behavior vector corresponding to the malicious file identification group in which the j-th malicious sample file is located, and the malicious file identifier corresponding to the j-th malicious sample file into a preset model for training, to obtain the target model.
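The assembly of training inputs in steps S404 to S408 can be sketched as below. This sketch assumes that the sample fusion behavior vector of a group is the element-wise OR of the malicious behavior vectors in that group, by analogy with step S310; the specification does not state the fusion rule for step S407 explicitly. The identifier strings and function names are hypothetical, and the preset model itself is left out because the specification does not name a model type.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[List[int], List[int], str]


def fuse(vectors: Sequence[Sequence[int]]) -> List[int]:
    """Element-wise OR of equally long 0/1 vectors (assumed fusion rule)."""
    return [1 if any(column) else 0 for column in zip(*vectors)]


def build_training_samples(G: Sequence[List[int]],
                           H: Sequence[str]) -> List[Sample]:
    """Return (G_j, fused vector of G_j's group, H_j) for each sample j."""
    groups: Dict[str, List[List[int]]] = defaultdict(list)
    for G_j, H_j in zip(G, H):          # step S406: group by identifier
        groups[H_j].append(G_j)
    group_fusion = {identifier: fuse(vectors)   # step S407
                    for identifier, vectors in groups.items()}
    return [(G_j, group_fusion[H_j], H_j)       # inputs for step S408
            for G_j, H_j in zip(G, H)]


# Hypothetical malicious behavior vectors and type identifiers (m = 3, b = 3).
G = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
H = ["ransomware", "ransomware", "spyware"]

samples = build_training_samples(G, H)
malicious_file_identifiers = sorted(set(H))     # step S405: deduplicate H
```

Each tuple in `samples` mirrors the shape of the inputs used at detection time (an individual vector plus a fused vector), which is why the same model can later score both the file to be detected and each target associated file.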
Step S500, if the target file identifier is a malicious file identifier, determining the file to be detected as a malicious file;
step S600, inputting each associated behavior vector and the fusion behavior vector into a target model respectively to obtain a corresponding associated file identifier;
and step S700, if the associated file identifier is a malicious file identifier, determining the corresponding target associated file as a malicious file.
The associated file identifier is represented in the same way as the target file identifier: it indicates the malicious behavior of the corresponding file and the attack type. The malicious behavior executed by each target associated file can therefore be determined from the associated file identifiers, and the attack type carried out by the files in combination can be determined from the associated file identifiers together with the target file identifier.
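The decision flow of steps S400 to S700 can be sketched with the target model treated as an opaque callable. The `toy_model` below is only a stand-in rule written for this example and is not the trained target model; the identifier strings, including the non-malicious label `"benign"`, are likewise hypothetical.

```python
from typing import Callable, List, Sequence, Set, Tuple

# A model takes (individual behavior vector, fusion behavior vector)
# and returns a file identifier.
Model = Callable[[Sequence[int], Sequence[int]], str]


def detect(E: Sequence[int], M: Sequence[Sequence[int]], S: Sequence[int],
           model: Model, malicious_identifiers: Set[str]
           ) -> Tuple[bool, List[bool]]:
    """Return (file to be detected is malicious, per-associated-file flags)."""
    target_identifier = model(E, S)                                  # S400
    target_is_malicious = target_identifier in malicious_identifiers  # S500
    associated_flags = [model(M_g, S) in malicious_identifiers       # S600, S700
                        for M_g in M]
    return target_is_malicious, associated_flags


def toy_model(vector: Sequence[int], fused: Sequence[int]) -> str:
    """Stand-in rule, NOT the patent's model: flags a file that executed any
    target behavior when the fused vector includes position 3."""
    if sum(vector) > 0 and fused[3] == 1:
        return "ransomware"
    return "benign"


E = [1, 0, 1, 0, 0]
M = [[0, 0, 0, 1, 0], [0, 0, 0, 0, 0]]
S = [1, 0, 1, 1, 0]

result = detect(E, M, S, toy_model, {"ransomware"})
```

The same fused vector `S` is passed on every call, so each file is judged in the context of what all the related files did together.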
In addition, each piece of target malicious behavior information corresponds to a behavior monitoring policy. A behavior monitoring policy is a method by which the file detection system monitors the behaviors of the file to be detected or of a target associated file; each behavior monitoring policy corresponds to a plurality of pieces of target malicious behavior information, that is, each behavior monitoring policy monitors each of its corresponding target malicious behaviors. Therefore, when malicious detection is performed on the file to be detected and the target associated files, the file detection system is further configured to perform the following steps:
step S131, monitoring target malicious behavior information corresponding to the file to be detected and the target associated file through each behavior monitoring strategy;
Step S132: if, at time $t_{12}$ of the current $T_1$, the behavior features corresponding to $N_{p1}, N_{p2}, \ldots, N_{py}, \ldots, N_{pf(p)}$ in $E, M_1, M_2, \ldots, M_g, \ldots, M_h$ are all 1, stop the behavior monitoring of the file to be detected and the target associated files by the p-th behavior monitoring policy at time $t_{11}$ of the next $T_1$.
At time $t_{12}$, if all target malicious behaviors corresponding to one of the behavior monitoring policies are detected to have been executed, that is, the file to be detected and the target associated files have executed all target malicious behaviors corresponding to that behavior monitoring policy within $T_1$, the behavior monitoring policy has already observed every target malicious behavior it monitors. The behavior monitoring policy is therefore stopped to reduce computing load and save system resources.
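The stopping condition of step S132 can be sketched as follows. This sketch reads the condition collectively, that is, a behavior counts as executed if the file to be detected or any target associated file executed it; that reading is an assumption, as are the policy names and the mapping from a policy to its feature positions.

```python
from typing import Dict, List, Sequence


def policies_to_stop(E: Sequence[int], M: Sequence[Sequence[int]],
                     policies: Dict[str, List[int]]) -> List[str]:
    """Return the policies whose monitored behaviors were all executed.

    policies maps a policy name to the feature positions of the target
    malicious behaviors it monitors (N_p1..N_pf(p) in the text).
    """
    # 1 where the file to be detected or any associated file has a 1.
    executed = [1 if any(column) else 0 for column in zip(E, *M)]
    return [name for name, positions in policies.items()
            if positions and all(executed[a] == 1 for a in positions)]


# Hypothetical state at time t_12 of the current T_1.
E = [1, 0, 1, 0, 0]
M = [[0, 0, 0, 1, 0]]
POLICIES = {"startup_policy": [0],
            "crypto_policy": [3, 4],
            "scan_policy": [2, 3]}

stopped = policies_to_stop(E, M, POLICIES)
```

Policies returned here would be disabled from time $t_{11}$ of the next $T_1$; `crypto_policy` keeps running because position 4 has not been observed.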
After target associated files corresponding to the file to be detected are detected, the target behavior vector corresponding to the file to be detected and the associated behavior vector corresponding to each target associated file are determined according to the file behavior information of the file to be detected and of the target associated files, and the fusion behavior vector is determined according to the target behavior vector and each associated behavior vector. The target behavior vector and each associated behavior vector are then each input into the target model together with the fusion behavior vector, to obtain the corresponding target file identifier and each associated file identifier. If the target file identifier or an associated file identifier is a malicious file identifier, the file to be detected or the corresponding target associated file is determined to be a malicious file. By detecting and combining the file behaviors of the file to be detected and the target associated files into the fusion behavior vector, and judging on that basis whether the file to be detected and the target associated files are malicious files, security is improved.
Further, in the second embodiment, before the step of obtaining the plurality of file behavior information performed by the file to be detected and each target associated file within the first preset time period, the method further includes:
step S001, acquiring a file to be detected and a plurality of file characteristics of each target associated file;
the file characteristics comprise one or more of hash values, file structure information, MD5 values, file code characteristics and the like of the file to be detected or the target associated file, and whether the file to be detected or the target associated file is a malicious file is judged through detection of the file characteristics of the file to be detected or the target associated file.
Step S002, detecting the characteristics of each file to obtain a corresponding detection result;
The detection of file features is a preliminary detection of the file to be detected and the target associated files. Because file features are convenient to detect, they are detected first. If feature detection shows that the file to be detected or a target associated file is a malicious file, that file can be determined to be malicious without the subsequent detection steps, which simplifies the malicious detection flow. If feature detection does not show a file to be malicious, its file features are normal features, and the subsequent detection steps continue.
Further, in step S002, each file feature is detected, and a corresponding detection result is obtained, which includes:
step S0021, comparing each file characteristic with a preset abnormal characteristic corresponding to the file characteristic to obtain a detection result corresponding to the file to be detected;
and comparing the hash value, the file structure information, the MD5 value and the file code characteristic of the file to be detected and the target associated file with a preset abnormal hash value, a preset abnormal file structure information, a preset abnormal MD5 value and a preset abnormal file code characteristic to obtain a detection result corresponding to the file to be detected and the target associated file.
Step S0022, if any file feature is the same as the corresponding preset abnormal feature, the detection result shows that the file to be detected is a malicious file; otherwise, the detection result indicates that the file to be detected is not a malicious file.
If the hash value is the same as the preset abnormal hash value, or the file structure information is the same as the preset abnormal file structure information, or the MD5 value is the same as the preset abnormal MD5 value, or the file code characteristic is the same as the preset abnormal file code characteristic, the detection result indicates that the file to be detected is a malicious file; otherwise, the detection result indicates that the file to be detected is not a malicious file.
Because the number of the abnormal file features is smaller than that of the normal file features and the abnormal file features are easy to acquire, the file features of the file to be detected are compared with the abnormal file features to obtain corresponding detection results, and the abnormal file features can be called from a data storage library of a server system or can be obtained by analyzing malicious sample files.
If one of the file characteristics of the file to be detected is the same as the corresponding abnormal file characteristic, the file to be detected is a malicious file, and if all the file characteristics of the file to be detected are different from the corresponding abnormal file characteristic, the file to be detected is not a malicious file, and further, the subsequent steps are needed to detect the file to be detected, and whether the file to be detected is a malicious file is further judged.
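The comparison in steps S0021 and S0022 can be sketched as a lookup of each file feature against preset abnormal values. Storing the abnormal features as a set per feature type is an assumption of this sketch, as are the feature names and sample payloads; only `hashlib.md5` from the Python standard library is used to compute the MD5 value.

```python
import hashlib
from typing import Dict, Set


def md5_of(data: bytes) -> str:
    """Hex MD5 digest of the file content."""
    return hashlib.md5(data).hexdigest()


def is_flagged(file_features: Dict[str, str],
               abnormal_features: Dict[str, Set[str]]) -> bool:
    """True if any file feature equals a preset abnormal feature of its type."""
    return any(value in abnormal_features.get(name, set())
               for name, value in file_features.items())


# Hypothetical preset abnormal features, e.g. derived from malicious samples.
ABNORMAL = {"md5": {md5_of(b"known bad payload")},
            "code_feature": {"packed_section"}}

suspicious = {"md5": md5_of(b"known bad payload"), "code_feature": "plain"}
harmless = {"md5": md5_of(b"harmless content"), "code_feature": "plain"}

suspicious_flagged = is_flagged(suspicious, ABNORMAL)
harmless_flagged = is_flagged(harmless, ABNORMAL)
```

A single matching feature is enough to flag a file, which matches the "any file feature" wording of step S0022; a file with no match proceeds to the behavior-based steps.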
Step S003, in a plurality of detection results, if a detection result indicating that the corresponding file to be detected or the target associated file is not a malicious file exists, acquiring a plurality of file behavior information of the file to be detected and each target associated file in a first preset time period;
if each detection result indicates that the corresponding file to be detected or the target associated file is a malicious file, determining the file to be detected and each target associated file as the malicious file.
Among all the detection results, if any detection result indicates a file that is not a malicious file, the malicious behavior and attack type implied by the detection results in combination may be in error, so further malicious detection needs to be performed on the file to be detected and the target associated files. If all detection results indicate malicious files, the file to be detected and the corresponding target associated files are considered to be malicious files, and no subsequent detection steps are required.
Further, in the third embodiment, if there is a detection result indicating that the corresponding file to be detected or the target associated file is not a malicious file, the step of obtaining a plurality of file behavior information performed by the file to be detected and each target associated file within the first preset time period further includes:
step S004, if any detection result indicates that the corresponding file to be detected or the target associated file is a malicious file, the file to be detected and the target associated file are moved to a preset storage space, and a plurality of file behavior information of the file to be detected and each target associated file in the preset storage space within a first preset time period is obtained.
In addition, in the third embodiment, if a detection result indicates that the corresponding file is a malicious file, the corresponding file may be malicious. To ensure the information security of the server system, the file to be detected and the target associated files are moved to a preset storage space isolated from the server system, and malicious detection is performed on them in the preset storage space; a file that is determined to be safe is moved from the preset storage space back to the server system.
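One possible reading of how steps S003 and S004 combine is sketched below: if every file is flagged by feature detection, all are determined malicious; if only some are flagged, the files are first moved to the preset storage space; in every case where at least one file is not flagged, behavior information is then collected. The action names are hypothetical labels, and this control flow is an interpretation rather than something the specification spells out.

```python
from typing import List, Sequence


def plan_actions(detection_results: Sequence[bool]) -> List[str]:
    """Map feature-detection results (True = flagged malicious) to actions."""
    if not detection_results:
        raise ValueError("at least one detection result is required")
    if all(detection_results):
        # Every file flagged: no behavior-based detection is needed.
        return ["determine_all_malicious"]
    actions: List[str] = []
    if any(detection_results):
        # Third embodiment: isolate the files before observing behavior.
        actions.append("move_to_preset_storage_space")
    # Step S003: at least one file was not flagged, so collect behaviors.
    actions.append("acquire_file_behavior_information")
    return actions


all_flagged = plan_actions([True, True])
none_flagged = plan_actions([False, False])
mixed = plan_actions([True, False])
```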
A malicious detection apparatus 100 based on an associated file, as shown in fig. 4, includes:
the file behavior acquisition module 110 is configured to, in response to receiving a file to be detected, acquire a plurality of file behavior information performed by the file to be detected and each target associated file within a first preset time period if a plurality of target associated files having an association relationship with the file to be detected are detected within the first preset time period;
the first vector determining module 120 is configured to determine, according to the plurality of file behavior information, a target behavior vector corresponding to the file to be detected and an associated behavior vector corresponding to each target associated file;
a second vector determining module 130, configured to determine a fused behavior vector according to the target behavior vector and each associated behavior vector;
the first identifier determining module 140 is configured to input the target behavior vector and the fused behavior vector into the target model, to obtain a corresponding target file identifier; the target model is obtained by training according to file behavior information of a malicious sample file;
the first malicious judgment module 150 is configured to determine a file to be detected as a malicious file when the target file identifier is a malicious file identifier;
the second identifier determining module 160 is configured to input each associated behavior vector and the fused behavior vector into the target model, to obtain a corresponding identifier of each associated file;
The second malicious judgment module 170 is configured to determine, when the associated file identifier is a malicious file identifier, the corresponding target associated file as a malicious file.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention as described in the specification, when said program product is run on the electronic device.
Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a mobile terminal, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms, namely: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein collectively as a "circuit," "module," or "system."
An electronic device according to this embodiment of the invention is described below. The electronic device is merely an example, and should not impose any limitations on the functionality and scope of use of embodiments of the present invention.
The electronic device is in the form of a general purpose computing device. Components of an electronic device may include, but are not limited to: the at least one processor, the at least one memory, and a bus connecting the various system components, including the memory and the processor.
Wherein the memory stores program code that is executable by the processor to cause the processor to perform steps according to various exemplary embodiments of the invention described in the "exemplary methods" section of this specification.
The storage may include readable media in the form of volatile storage, such as Random Access Memory (RAM) and/or cache memory, and may further include Read Only Memory (ROM).
The storage may also include a program/utility having a set (at least one) of program modules including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The bus may be one or more of several types of bus structures including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
The electronic device may also communicate with one or more external devices (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device, and/or with any device (e.g., router, modem, etc.) that enables the electronic device to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface. And, the electronic device may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through a network adapter. As shown, the network adapter communicates with other modules of the electronic device over a bus. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with an electronic device, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium having stored thereon a program product capable of implementing the method described above in the present specification is also provided. In some possible embodiments, the various aspects of the invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the invention as described in the "exemplary methods" section of this specification, when said program product is run on the terminal device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
Furthermore, the above-described drawings are only schematic illustrations of processes included in the method according to the exemplary embodiment of the present invention, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (10)

1. A malicious detection method based on associated files, which is characterized by being applied to a file detection system, the method comprising the following steps:
in response to receiving a file to be detected, if a plurality of target associated files having an association relationship with the file to be detected are detected within a first preset time period, acquiring a plurality of file behavior information performed by the file to be detected and by each target associated file within the first preset time period; the file behavior information corresponds to a file behavior, and the file behavior is one of self-starting, registry generation, scanning, encryption, and information stealing;
Determining a target behavior vector corresponding to the file to be detected and an associated behavior vector corresponding to each target associated file according to the file behavior information;
determining a fusion behavior vector according to the target behavior vector and each associated behavior vector;
inputting the target behavior vector and the fusion behavior vector into a target model to obtain a corresponding target file identifier; the target model is obtained by training according to file behavior information of a malicious sample file;
if the target file identifier is a malicious file identifier, determining the file to be detected as a malicious file;
inputting each associated behavior vector and the fusion behavior vector into a target model respectively to obtain a corresponding associated file identifier;
and if the associated file identifier is a malicious file identifier, determining the corresponding target associated file as a malicious file.
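The flow of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `fuse` and `model` are hypothetical callables standing in for the fusion step and the trained target model, and the function name `detect_group` and the set `malicious_ids` are introduced here only for illustration.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Vector = Sequence[int]


def detect_group(
    target_vec: Vector,
    assoc_vecs: Dict[str, Vector],
    fuse: Callable[[Vector, List[Vector]], Vector],
    model: Callable[[Vector, Vector], str],
    malicious_ids: Set[str],
) -> Tuple[bool, Dict[str, bool]]:
    """Score the file to be detected and every target associated file.

    Returns (target_is_malicious, {associated_file_name: is_malicious}).
    """
    # One fused vector is shared by the whole group of files.
    fused = fuse(target_vec, list(assoc_vecs.values()))

    # The file to be detected is scored with its own vector plus the fused vector.
    target_is_malicious = model(target_vec, fused) in malicious_ids

    # Each associated file is scored the same way, using its own vector.
    assoc_is_malicious = {
        name: model(vec, fused) in malicious_ids
        for name, vec in assoc_vecs.items()
    }
    return target_is_malicious, assoc_is_malicious
```

Because every file is scored against the same fused vector, an associated file whose own behavior looks harmless in isolation is still evaluated in the context of what the whole group did.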
2. The method according to claim 1, wherein the obtaining the file behavior information of the file to be detected and the target associated file performed in the first preset period of time includes:
after the first preset time period $T_1$ ends, acquiring a plurality of file behavior information of the file to be detected to obtain a first file behavior information set $Q = (Q_1, Q_2, \ldots, Q_i, \ldots, Q_n)$; wherein $i = 1, 2, \ldots, n$; $n$ is the number of pieces of file behavior information performed by the file to be detected within $T_1$; $Q_i$ is the $i$-th file behavior information performed by the file to be detected within $T_1$; $T_1 = [t_{11}, t_{12}]$; $t_{11} < t_{12}$; $t_{11}$ is the start time corresponding to $T_1$; $t_{12}$ is the end time corresponding to $T_1$;
if $Q$ comprises association behavior information, determining the associated file corresponding to the association behavior information as a target associated file;
acquiring a plurality of file behavior information performed by the plurality of target associated files within $T_1$ to obtain a second file behavior information set $R = (R_1, R_2, \ldots, R_g, \ldots, R_h)$; $R_g = (R_{g1}, R_{g2}, \ldots, R_{gk}, \ldots, R_{gf(g)})$; wherein $g = 1, 2, \ldots, h$; $k = 1, 2, \ldots, f(g)$; $h$ is the number of target associated files; $f(g)$ is the number of pieces of file behavior information performed by the $g$-th target associated file within $T_1$; $R_g$ is the file behavior information list corresponding to the $g$-th target associated file; $R_{gk}$ is the $k$-th file behavior information performed by the $g$-th target associated file within $T_1$.
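A minimal sketch of how the sets Q and R of claim 2 might be collected from an event log. The record type `BehaviorEvent`, its fields, and the rule that an event with a non-empty `related_file` counts as association behavior information are all assumptions made for illustration; the claim does not define how association behavior is recorded.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class BehaviorEvent:
    file: str                           # file that performed the behavior
    behavior: str                       # behavior name, e.g. "scanning"
    timestamp: float                    # time at which the behavior occurred
    related_file: Optional[str] = None  # set when the behavior involves another file (assumed)


def collect_sets(
    events: List[BehaviorEvent], target: str, t11: float, t12: float
) -> Tuple[List[BehaviorEvent], Dict[str, List[BehaviorEvent]]]:
    """Return (Q, R) for the window T1 = [t11, t12]."""
    in_window = [e for e in events if t11 <= e.timestamp <= t12]

    # Q: behaviors performed by the file to be detected within T1.
    q = [e for e in in_window if e.file == target]

    # Files referenced by association behaviors in Q become target associated files.
    associated: List[str] = []
    for e in q:
        if e.related_file is not None and e.related_file not in associated:
            associated.append(e.related_file)

    # R: for each target associated file, its behaviors within the same window.
    r = {name: [e for e in in_window if e.file == name] for name in associated}
    return q, r
```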
3. The method of claim 2, wherein the target behavior vector and the associated behavior vector are determined by:
obtaining, according to $b$ pieces of target malicious behavior information, a first preset behavior feature vector $E = (E_1, E_2, \ldots, E_a, \ldots, E_b)$ and $h$ third preset behavior feature vectors $M_1, M_2, \ldots, M_g, \ldots, M_h$; $M_g = (M_{g1}, M_{g2}, \ldots, M_{ga}, \ldots, M_{gb})$; wherein $a = 1, 2, \ldots, b$; $E_a$ is the behavior feature corresponding to the $a$-th target malicious behavior information in $E$; $M_g$ is the third preset behavior feature vector corresponding to the $g$-th target associated file; $M_{ga}$ is the behavior feature corresponding to the $a$-th target malicious behavior information of the $g$-th target associated file; the target malicious behavior information corresponding to $M_{ga}$ is the same as the target malicious behavior information corresponding to $E_a$;
traversing $E$: if the target malicious behavior information corresponding to $E_a$ exists in $Q$, determining $E_a$ to be 1; otherwise, determining $E_a$ to be 0;
determining $E$ as the target behavior vector corresponding to the file to be detected;
traversing $M_g$: if the target malicious behavior information corresponding to $M_{ga}$ exists in $R_g$, determining $M_{ga}$ to be 1; otherwise, determining $M_{ga}$ to be 0;
determining $M_g$ as the associated behavior vector corresponding to the $g$-th target associated file.
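The 0/1 encoding described in claim 3 can be sketched as below. The function name `behavior_vector` is hypothetical, and behaviors are represented as plain strings purely for illustration; the same function yields E when given Q and yields M_g when given R_g.

```python
from typing import Iterable, List, Sequence


def behavior_vector(observed: Iterable[str], target_behaviors: Sequence[str]) -> List[int]:
    """Position a is 1 if the a-th target malicious behavior was observed, else 0."""
    seen = set(observed)
    return [1 if behavior in seen else 0 for behavior in target_behaviors]


# The b target malicious behaviors fix the meaning of each vector position.
TARGET_BEHAVIORS = [
    "self-starting",
    "registry generation",
    "scanning",
    "encryption",
    "information stealing",
]

# E for a file to be detected that scanned (twice) and registered itself for self-starting.
E = behavior_vector(["scanning", "self-starting", "scanning"], TARGET_BEHAVIORS)

# M_g for an associated file that only encrypted data.
M_1 = behavior_vector(["encryption"], TARGET_BEHAVIORS)
```

Repeated behaviors do not change the result: the vector records presence, not frequency.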
4. A method according to claim 3, wherein the target malicious behavior information is determined by: obtaining a plurality of file behavior information performed by $m$ malicious sample files within a second preset time period $T_2 = [t_{21}, t_{22}]$ to obtain a sample file behavior information set $F = (F_1, F_2, \ldots, F_j, \ldots, F_m)$; $F_j = (F_{j1}, F_{j2}, \ldots, F_{jd}, \ldots, F_{jf(j)})$; wherein $j = 1, 2, \ldots, m$; $d = 1, 2, \ldots, f(j)$; $f(j)$ is the number of pieces of file behavior information performed by the $j$-th malicious sample file within $T_2$; $F_j$ is the file behavior information list corresponding to the $j$-th malicious sample file; $F_{jd}$ is the $d$-th file behavior information performed by the $j$-th malicious sample file within $T_2$; $t_{21} < t_{22} < t_{11}$; $(t_{22} - t_{21}) = (t_{12} - t_{11})$; $t_{21}$ is the start time corresponding to $T_2$; $t_{22}$ is the end time corresponding to $T_2$;
performing deduplication processing on $F$ to obtain $b$ pieces of target malicious behavior information.
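The deduplication step of claim 4 might look like the sketch below. Keeping the first-seen order is an assumption made here so that vector positions are reproducible; the claim only requires that duplicates be removed. The function name and the example behavior strings are hypothetical.

```python
from typing import List, Sequence


def target_malicious_behaviors(sample_behaviors: Sequence[Sequence[str]]) -> List[str]:
    """Flatten F = (F_1, ..., F_m) and drop duplicates, keeping first-seen order."""
    flattened = (behavior for f_j in sample_behaviors for behavior in f_j)
    # dict preserves insertion order, so this deduplicates without reordering.
    return list(dict.fromkeys(flattened))


# F for three malicious sample files observed over T2.
F = [
    ["encryption", "scanning"],
    ["scanning", "self-starting"],
    ["encryption"],
]
targets = target_malicious_behaviors(F)
b = len(targets)  # number of target malicious behaviors, i.e. the vector length
```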
5. The method of claim 4, wherein the fused behavior vector is determined by:
traversing $E, M_1, M_2, \ldots, M_g, \ldots, M_h$: if $\left(E_a + \sum_{g=1}^{h} M_{ga}\right) \geq 1$, determining $S_a$ to be 1; otherwise, determining $S_a$ to be 0; thereby obtaining a fusion behavior vector $S = (S_1, S_2, \ldots, S_a, \ldots, S_b)$; wherein $S_a$ is the behavior feature corresponding to the $a$-th target malicious behavior information in $S$; the target malicious behavior information corresponding to $S_a$ is the same as the target malicious behavior information corresponding to $E_a$.
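The fusion rule of claim 5 amounts to an element-wise logical OR over the target behavior vector and all associated behavior vectors. A minimal sketch, with the function name `fuse` chosen here for illustration:

```python
from typing import List, Sequence


def fuse(e: Sequence[int], ms: Sequence[Sequence[int]]) -> List[int]:
    """S_a = 1 if E_a plus the sum of M_ga over all g is at least 1, else 0."""
    for m in ms:
        if len(m) != len(e):
            raise ValueError("all behavior vectors must have the same length b")
    return [
        1 if e[a] + sum(m[a] for m in ms) >= 1 else 0
        for a in range(len(e))
    ]


# The file to be detected shows behavior 0; both associated files show behavior 1.
S = fuse([1, 0, 0, 0], [[0, 1, 0, 0], [0, 1, 0, 0]])
```

The fused vector therefore describes the behavior of the group as a whole: a behavior is set if any file in the group performed it.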
6. The method of claim 4, wherein the target model is determined by:
obtaining, according to $F$, $m$ second preset behavior feature vectors $G_1, G_2, \ldots, G_j, \ldots, G_m$; $G_j = (G_{j1}, G_{j2}, \ldots, G_{ja}, \ldots, G_{jb})$; wherein $G_j$ is the second preset behavior feature vector corresponding to the $j$-th malicious sample file; $G_{ja}$ is the behavior feature corresponding to the $a$-th target malicious behavior information of the $j$-th malicious sample file;
traversing $G_j$: if the target malicious behavior information corresponding to $G_{ja}$ exists in $F_j$, determining $G_{ja}$ to be 1; otherwise, determining $G_{ja}$ to be 0;
determining $G_j$ as the malicious behavior vector of the $j$-th malicious sample file;
obtaining the malicious behavior type identifiers corresponding to the $m$ malicious sample files to obtain a malicious behavior type identifier set $H = (H_1, H_2, \ldots, H_j, \ldots, H_m)$; wherein $H_j$ is the malicious behavior type identifier corresponding to the $j$-th malicious sample file;
determining the e malicious behavior type identifiers obtained after performing deduplication processing on $H$ as malicious file identifiers;
according to each malicious file identifier, m malicious sample files are grouped, and D malicious file identifier groups are determined; the malicious behavior type identifiers corresponding to a plurality of malicious sample files in each malicious file identification group are the same;
obtaining a sample fusion behavior vector corresponding to each malicious file identification group according to malicious behavior vectors corresponding to a plurality of malicious sample files in each malicious file identification group;
inputting $G_j$, the sample fusion behavior vector corresponding to the malicious file identification group in which the $j$-th malicious sample file is located, and the malicious file identifier corresponding to the $j$-th malicious sample file into a preset model for training to obtain the target model.
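The construction of training inputs in claim 6 can be sketched as follows. The claim does not state how the sample fusion behavior vector of a group is computed; this sketch assumes an element-wise OR over the group's malicious behavior vectors, by analogy with the fusion rule of claim 5. The function name and the identifier strings in the example are hypothetical, and the preset model itself is left out.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

TrainingRow = Tuple[List[int], List[int], str]


def build_training_rows(g: Sequence[Sequence[int]], h: Sequence[str]) -> List[TrainingRow]:
    """Pair each G_j with its group's fused vector and its identifier H_j."""
    if len(g) != len(h):
        raise ValueError("need exactly one identifier per malicious sample file")

    # Group the malicious behavior vectors by malicious behavior type identifier.
    groups: Dict[str, List[Sequence[int]]] = defaultdict(list)
    for vec, label in zip(g, h):
        groups[label].append(vec)

    # Assumed rule: the sample fusion behavior vector is the element-wise OR of the group.
    group_fused = {
        label: [1 if any(column) else 0 for column in zip(*vecs)]
        for label, vecs in groups.items()
    }

    # One training row per sample: (G_j, fused vector of its group, H_j).
    return [(list(vec), group_fused[label], label) for vec, label in zip(g, h)]


rows = build_training_rows(
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    ["ransomware", "ransomware", "stealer"],
)
```

Each row has the same shape as the inference-time input of claim 1 (a per-file vector plus a group-level fused vector), which is what allows one model to score both the file to be detected and its associated files.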
7. The method of claim 1, wherein the obtaining the file behavior information of the file to be detected and the target associated file performed in the first preset time period further comprises:
acquiring a plurality of file characteristics of the file to be detected and each target associated file;
detecting the characteristics of each file to obtain a corresponding detection result;
and in the detection results, if a detection result indicating that the corresponding file to be detected or the target associated file is not a malicious file exists, acquiring a plurality of file behavior information of the file to be detected and each target associated file in a first preset time period.
8. A malicious detection device based on an associated file, comprising:
the file behavior acquisition module is used for, in response to receiving a file to be detected, if a plurality of target associated files having an association relationship with the file to be detected are detected within a first preset time period, acquiring a plurality of file behavior information performed by the file to be detected and by each target associated file within the first preset time period; the file behavior information corresponds to a file behavior, and the file behavior is one of self-starting, registry generation, scanning, encryption, and information stealing;
The first vector determining module is used for determining a target behavior vector corresponding to the file to be detected and an associated behavior vector corresponding to each target associated file according to the file behavior information;
the second vector determining module is used for determining a fusion behavior vector according to the target behavior vector and each associated behavior vector;
the first identification determining module is used for inputting the target behavior vector and the fusion behavior vector into the target model to obtain a corresponding target file identification; the target model is obtained by training according to file behavior information of a malicious sample file;
the first malicious judgment module is used for determining the file to be detected as a malicious file when the target file identifier is a malicious file identifier;
the second identification determining module is used for inputting each associated behavior vector and the fusion behavior vector into the target model respectively to obtain a corresponding associated file identification;
and the second malicious judgment module is used for determining the corresponding target associated file as a malicious file when the associated file identifier is a malicious file identifier.
9. A non-transitory computer readable storage medium having stored therein at least one instruction or at least one program, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the method of any one of claims 1-7.
10. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 9.
CN202311131104.1A 2023-09-04 2023-09-04 Malicious detection method, device, equipment and medium based on associated files Active CN116861428B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311131104.1A CN116861428B (en) 2023-09-04 2023-09-04 Malicious detection method, device, equipment and medium based on associated files

Publications (2)

Publication Number Publication Date
CN116861428A (en) 2023-10-10
CN116861428B (en) 2023-12-08

Family

ID=88222018

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311131104.1A Active CN116861428B (en) 2023-09-04 2023-09-04 Malicious detection method, device, equipment and medium based on associated files

Country Status (1)

Country Link
CN (1) CN116861428B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021027831A1 (en) * 2019-08-15 2021-02-18 中兴通讯股份有限公司 Malicious file detection method and apparatus, electronic device and storage medium
CN114996698A (en) * 2021-03-02 2022-09-02 三六零数字安全科技集团有限公司 Method, device and equipment for determining virus file and storage medium
CN115758362A (en) * 2022-11-29 2023-03-07 四川大学 Multi-feature-based automatic malicious software detection method
CN116305129A (en) * 2023-05-16 2023-06-23 北京安天网络安全技术有限公司 Document detection method, device, equipment and medium based on VSTO

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant