CN116992448B - Sample determination method, device, equipment and medium based on importance degree of data source - Google Patents

Info

Publication number
CN116992448B
Authority
CN
China
Prior art keywords
target
sample
data source
file
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311254330.9A
Other languages
Chinese (zh)
Other versions
CN116992448A (en)
Inventor
吕经祥
李石磊
肖新光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Antiy Network Technology Co Ltd
Original Assignee
Beijing Antiy Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Antiy Network Technology Co Ltd filed Critical Beijing Antiy Network Technology Co Ltd
Priority to CN202311254330.9A priority Critical patent/CN116992448B/en
Publication of CN116992448A publication Critical patent/CN116992448A/en
Application granted granted Critical
Publication of CN116992448B publication Critical patent/CN116992448B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562Static detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • G06F16/90344Query processing by using string matching techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/561Virus type analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/194Calculation of difference between files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033Test or assess software

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Virology (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a sample determination method, device, equipment and medium based on the importance degree of a data source, relating to the field of data processing. The method comprises: in response to receiving a target malicious file, acquiring the name string set by each target data source for the target malicious file to obtain a target name string list; splitting each name string to obtain a target candidate string list set; determining the importance degree of each target data source according to the target candidate string list set; and determining the target similar sample files corresponding to the target malicious file according to the importance degrees. Because the name string that each target data source sets for the target malicious file is split, the number of strings that each target data source uses for file feature analysis is obtained; the importance degree of each source is determined from this number, and the similar sample files are determined using these importance degrees, so that the similar sample files obtained match the target malicious file more accurately.

Description

Sample determination method, device, equipment and medium based on importance degree of data source
Technical Field
The present application relates to the field of data processing, and in particular, to a method, an apparatus, a device, and a medium for determining a sample based on importance of a data source.
Background
Current methods determine similar sample files by acquiring the file features of every history sample file and computing statistics over them. Because each history sample file has many file features, acquisition and statistics occupy substantial system resources, so when the number of history sample files is large these methods greatly increase the computing power the system must use. In addition, the data sources that transmit history sample files apply different detection rules and emphasize different file features, so the accuracy of the similar sample files determined from different data sources is uneven.
Disclosure of Invention
In view of the above, the application provides a sample determination method, device, equipment and medium based on the importance degree of data sources, which at least partially solve the technical problem in the prior art that the accuracy of similar sample files determined from different data sources varies too widely. The following technical scheme is adopted:
according to one aspect of the present application, there is provided a sample determination method based on importance of a data source, the method comprising the steps of:
in response to receiving the target malicious file, acquiring the name string set by each target data source for the target malicious file to obtain a target name string list Z = (Z_1, Z_2, ..., Z_j, ..., Z_m); where j = 1, 2, ..., m; m is the number of target data sources; Z_j is the name string set by the j-th target data source for the target malicious file;
splitting Z_j according to the preset characters corresponding to the j-th target data source to obtain a target candidate string list set N = (N_1, N_2, ..., N_j, ..., N_m); N_j = (N_{j1}, N_{j2}, ..., N_{jc}, ..., N_{jf(j)}); where c = 1, 2, ..., f(j); f(j) is the number of target candidate strings contained in Z_j; N_j is the target candidate string list corresponding to Z_j; N_{jc} is the c-th target candidate string contained in Z_j;
determining the importance degree of each target data source according to the target candidate string list set N to obtain an importance degree set Q = (Q_1, Q_2, ..., Q_j, ..., Q_m); where Q_j is the importance degree of the j-th target data source; Q_j = f(j) / (∑_{j=1}^{m} f(j));
And determining at least one target similar sample file corresponding to the target malicious file according to the importance degree of each target data source.
In an exemplary embodiment of the present application, determining at least one target similar sample file corresponding to a target malicious file according to an importance degree of each target data source includes:
Determining a plurality of target sample files from a plurality of history sample files according to the target name character string list Z;
acquiring the name string set by each target data source for each target sample file to obtain a sample name string list set P = (P_1, P_2, ..., P_j, ..., P_m); P_j = (P_{j1}, P_{j2}, ..., P_{ja}, ..., P_{jb}); where a = 1, 2, ..., b; b is the number of target sample files; P_j is the sample name string list corresponding to the j-th target data source; P_{ja} is the name string set by the j-th target data source for the a-th target sample file;
splitting P_{ja} according to the preset characters corresponding to the j-th target data source to obtain a sample candidate string list set I_j = (I_{j1}, I_{j2}, ..., I_{ja}, ..., I_{jb}) corresponding to the j-th target data source; I_{ja} = (I_{ja1}, I_{ja2}, ..., I_{jag}, ..., I_{jah(ja)}); where g = 1, 2, ..., h(ja); h(ja) is the number of sample candidate strings contained in P_{ja}; I_{ja} is the sample candidate string list corresponding to P_{ja}; I_{jag} is the g-th sample candidate string contained in P_{ja};
determining, according to I_{ja} and N_j, a sample matching degree H_a between the a-th target sample file and the target malicious file;
determining, according to H_a, at least one target similar sample file corresponding to the target malicious file from the b target sample files.
In an exemplary embodiment of the present application, determining the sample matching degree H_a between the a-th target sample file and the target malicious file according to I_{ja} and N_j comprises:
obtaining, according to N_j, a target string statement C_j of the target malicious file corresponding to the j-th target data source;
obtaining, according to I_{ja}, a sample string statement U_{ja} of the a-th target sample file corresponding to the j-th target data source;
determining a semantic matching degree A_{ja} between C_j and U_{ja};
determining, according to A_{ja} and Q_j, the sample matching degree H_a between the a-th target sample file and the target malicious file.
In an exemplary embodiment of the application, determining the sample matching degree H_a between the a-th target sample file and the target malicious file according to A_{ja} and Q_j comprises:
determining the sample matching degree between the a-th target sample file and the target malicious file according to H_a = (∑_{j=1}^{m} (A_{ja} × Q_j)) / m.
In an exemplary embodiment of the present application, determining, according to H_a, at least one target similar sample file corresponding to the target malicious file from the b target sample files comprises:
if H_a ≥ H_0, determining the a-th target sample file as a target similar sample file corresponding to the target malicious file; where H_0 is a preset sample matching degree threshold.
In an exemplary embodiment of the present application, determining a plurality of target sample files from a plurality of history sample files includes:
acquiring the name strings corresponding to s history sample files to obtain a history name string list D = (D_1, D_2, ..., D_w, ..., D_s); where w = 1, 2, ..., s; D_w is the name string corresponding to the w-th history sample file;
splitting D_w to obtain the number of strings B_w corresponding to D_w;
if MIN(f(1), f(2), ..., f(j), ..., f(m)) ≤ B_w ≤ MAX(f(1), f(2), ..., f(j), ..., f(m)), determining the w-th history sample file as a target sample file; where MIN() is a preset minimum value determination function and MAX() is a preset maximum value determination function.
In an exemplary embodiment of the application, Q_j is also determined by the following steps:
traversing I_j and determining the number of occurrences L_{jag} of I_{jag} in I_j;
if L_{jag} ≥ L_0, determining I_{jag} as a sample target string, thereby obtaining the number M_j of sample target strings corresponding to the j-th target data source; where L_0 is a preset character threshold;
determining the importance degree of the j-th target data source as Q_j = M_j / (∑_{j=1}^{m} M_j).
According to an aspect of the present application, there is provided a sample determination apparatus based on importance of a data source, comprising:
a target name string acquisition module, configured to acquire, when the target malicious file is received, the name string set by each target data source for the target malicious file, so as to obtain a target name string list Z = (Z_1, Z_2, ..., Z_j, ..., Z_m); where j = 1, 2, ..., m; m is the number of target data sources; Z_j is the name string set by the j-th target data source for the target malicious file;
a target candidate string determining module, configured to split Z_j according to the preset characters corresponding to the j-th target data source to obtain a target candidate string list set N = (N_1, N_2, ..., N_j, ..., N_m); N_j = (N_{j1}, N_{j2}, ..., N_{jc}, ..., N_{jf(j)}); where c = 1, 2, ..., f(j); f(j) is the number of target candidate strings contained in Z_j; N_j is the target candidate string list corresponding to Z_j; N_{jc} is the c-th target candidate string contained in Z_j;
an importance degree determining module, configured to determine the importance degree of each target data source according to the target candidate string list set N, so as to obtain an importance degree set Q = (Q_1, Q_2, ..., Q_j, ..., Q_m); where Q_j is the importance degree of the j-th target data source; Q_j = f(j) / (∑_{j=1}^{m} f(j));
And the similar sample determining module is used for determining at least one target similar sample file corresponding to the target malicious file according to the importance degree of each target data source.
According to one aspect of the present application, there is provided a non-transitory computer readable storage medium having stored therein at least one instruction or at least one program loaded and executed by a processor to implement the aforementioned data source importance-based sample determination method.
According to one aspect of the present application, there is provided an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
The application has at least the following beneficial effects:
According to the method, when a target malicious file is received, the name string set by each target data source for the target malicious file is acquired and split according to the preset characters corresponding to that data source, yielding a plurality of target candidate strings for each data source. The importance degree of each target data source is determined from the number of its target candidate strings, and the target similar sample files corresponding to the target malicious file are then determined from a plurality of target sample files according to these importance degrees. Because the number of strings obtained by splitting reflects the number of strings each target data source uses for file feature analysis, determining the similar sample files with importance degrees derived from this number makes the similarity between the similar sample files obtained and the target malicious file more accurate.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for determining a sample based on importance of a data source according to an embodiment of the present invention;
fig. 2 is a block diagram of a sample determining device based on importance of a data source according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
A method for determining a sample based on importance of a data source, as shown in fig. 1, the method comprising the steps of:
Step S100, in response to receiving the target malicious file, acquiring the name string set by each target data source for the target malicious file to obtain a target name string list Z = (Z_1, Z_2, ..., Z_j, ..., Z_m); where j = 1, 2, ..., m; m is the number of target data sources; Z_j is the name string set by the j-th target data source for the target malicious file;
The target malicious file is the malicious file for which similar sample files are sought. After the target malicious file is received, a plurality of target sample files are determined from a plurality of history sample files. A target sample file may be any history sample file, or a history sample file that is selected as required or that meets a preset condition.
The target data sources are the suppliers of the history sample files. Each target data source has a corresponding detection rule, which it uses to perform malicious detection on files to be detected, and each target data source has a number of preset characters, which act as the separator characters in the name strings it produces. A name string is the string of the virus name of the virus in the corresponding target malicious file, and it includes the attack type string, virus family string, application platform string, virus variant string and the like of the virus. Because each target data source extracts name strings differently, the order of the information in the name strings that different target data sources extract for the same file may differ. The name string of the target malicious file is therefore split using the preset characters of the corresponding target data source, giving a plurality of target candidate strings for each target data source; the target candidate strings are the attack type string, virus family string, application platform string, virus variant string and the like.
Step S200, splitting Z_j according to the preset characters corresponding to the j-th target data source to obtain a target candidate string list set N = (N_1, N_2, ..., N_j, ..., N_m); N_j = (N_{j1}, N_{j2}, ..., N_{jc}, ..., N_{jf(j)}); where c = 1, 2, ..., f(j); f(j) is the number of target candidate strings contained in Z_j; N_j is the target candidate string list corresponding to Z_j; N_{jc} is the c-th target candidate string contained in Z_j;
the target candidate character strings are a plurality of character strings obtained by splitting the name character strings of the target malicious files according to preset characters corresponding to the jth target data source.
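The splitting in step S200 can be sketched as follows. This is a minimal illustration only: the source names, the delimiter sets in `PRESET_CHARS`, and the example virus names are hypothetical, since the patent states only that each target data source has its own preset separator characters.

```python
import re

# Hypothetical preset characters per data source (not given in the patent).
PRESET_CHARS = {
    "source_a": "/.",
    "source_b": ":!",
}


def split_name(name, preset_chars):
    """Split a name string on any preset character, dropping empty parts."""
    pattern = "[" + re.escape(preset_chars) + "]"
    return [part for part in re.split(pattern, name) if part]


def build_candidate_lists(names):
    """Map each data source to its target candidate string list N_j."""
    return {src: split_name(z_j, PRESET_CHARS[src]) for src, z_j in names.items()}


# Z: the name string each source assigned to the same target malicious file.
Z = {
    "source_a": "Trojan/Win32.Agent.abc",
    "source_b": "Win32:Agent!Trojan",
}
N = build_candidate_lists(Z)
f = {src: len(n_j) for src, n_j in N.items()}  # f(j): candidate count per source
```

Here the same file yields four candidate strings for one source and three for the other, which is the per-source difference that the later importance calculation relies on.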
Step S300, determining the importance degree of each target data source according to the target candidate string list set N to obtain an importance degree set Q = (Q_1, Q_2, ..., Q_j, ..., Q_m); where Q_j is the importance degree of the j-th target data source; Q_j = f(j) / (∑_{j=1}^{m} f(j));
The importance degree of a target data source is the weight of that data source, that is, the proportion it contributes when the target similar sample files are determined.
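A small sketch of the step S300 formula Q_j = f(j) / ∑ f(j) is given below. The function name `importance_degrees` and the example lists are illustrative assumptions; only the formula itself comes from the text.

```python
def importance_degrees(candidate_lists):
    """Return Q_j = f(j) / sum of all f(j), where f(j) = len(N_j)."""
    counts = [len(n_j) for n_j in candidate_lists]
    total = sum(counts)
    if total == 0:
        # Assumption: with no candidate strings at all, no weighting is defined.
        raise ValueError("no target candidate strings to weight")
    return [count / total for count in counts]


# Hypothetical candidate lists N_1..N_3 for three data sources.
N = [
    ["Trojan", "Win32", "Agent", "abc"],
    ["Win32", "Agent", "Trojan"],
    ["Agent", "Win32", "Gen"],
]
Q = importance_degrees(N)
```

By construction the weights sum to 1, and a source whose naming scheme splits into more fields receives a larger weight.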
Step S400, determining at least one target similar sample file corresponding to the target malicious file according to the importance degree of each target data source;
further, in step S400, determining at least one target similar sample file corresponding to the target malicious file according to the importance degree of each target data source, including:
Step S410, determining a plurality of target sample files from a plurality of history sample files according to the target name character string list Z;
the history sample files are sample files which pass detection, wherein the sample files comprise malicious sample files and non-malicious sample files, and a plurality of target sample files are determined from a plurality of history sample files by comparing file information of the history sample files with file information of target malicious files.
In step S410, a plurality of target sample files are determined from a plurality of history sample files, including:
Step S411, acquiring the name strings corresponding to s history sample files to obtain a history name string list D = (D_1, D_2, ..., D_w, ..., D_s); where w = 1, 2, ..., s; D_w is the name string corresponding to the w-th history sample file;
Step S412, splitting D_w to obtain the number of strings B_w corresponding to D_w;
Step S413, if MIN(f(1), f(2), ..., f(j), ..., f(m)) ≤ B_w ≤ MAX(f(1), f(2), ..., f(j), ..., f(m)), determining the w-th history sample file as a target sample file; where MIN() is a preset minimum value determination function and MAX() is a preset maximum value determination function.
In step S410, the target sample files are determined according to the number of strings of each history sample file: the name string of each history sample file is obtained and split to give the corresponding number of strings. If this number lies between the minimum and the maximum of the numbers of target candidate strings determined for all target data sources, the number of strings obtained by splitting the history sample file's name string conforms to the splitting rules of the target data sources, and the history sample file is determined as a target sample file.
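The range check of steps S411 to S413 might look like the sketch below. The text does not say which preset characters are used to split D_w, so the single combined delimiter set `preset_chars` is an assumption, as are the function names and the example names.

```python
import re


def count_parts(name, preset_chars):
    """B_w: number of non-empty strings obtained by splitting a name string."""
    pattern = "[" + re.escape(preset_chars) + "]"
    return sum(1 for part in re.split(pattern, name) if part)


def select_target_samples(history_names, target_counts, preset_chars="/.:!"):
    """Return indices w with MIN(f(1..m)) <= B_w <= MAX(f(1..m))."""
    low, high = min(target_counts), max(target_counts)
    return [
        w
        for w, d_w in enumerate(history_names)
        if low <= count_parts(d_w, preset_chars) <= high
    ]


# Hypothetical history name strings D and candidate counts f(1..m).
D = [
    "Trojan/Win32.Agent.abc",  # 4 parts
    "Worm.Generic",            # 2 parts
    "Win32:Agent!Trojan",      # 3 parts
    "A/B.C.D.E.F",             # 6 parts
]
selected = select_target_samples(D, target_counts=[4, 3, 3])
```

Only the history samples whose part count falls within [3, 4] survive, so the more expensive name matching runs on a reduced set.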
In addition, the target sample file may be determined by:
traversing each history sample file, and determining a history sample file as a target sample file if its file information is the same as the file information of the target malicious file. The file information of the target malicious file is its file format, file type, coding mode and the like. Determining the history sample files whose file information matches that of the target malicious file as target sample files preliminarily screens the large set of history sample files by file information.
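The file-information screening can be sketched as an equality filter. The `FileInfo` fields mirror the three attributes named in the text (file format, file type, coding mode); the class, the function `prescreen`, and the sample identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    file_format: str
    file_type: str
    encoding: str


def prescreen(history, target):
    """Keep the history samples whose file information equals the target's."""
    return [sample_id for sample_id, info in history.items() if info == target]


# Hypothetical history samples keyed by an illustrative identifier.
history = {
    "h1": FileInfo("PE", "executable", "binary"),
    "h2": FileInfo("ELF", "executable", "binary"),
    "h3": FileInfo("PE", "executable", "binary"),
}
target_info = FileInfo("PE", "executable", "binary")
target_samples = prescreen(history, target_info)
```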
Step S420, acquiring the name string set by each target data source for each target sample file to obtain a sample name string list set P = (P_1, P_2, ..., P_j, ..., P_m); P_j = (P_{j1}, P_{j2}, ..., P_{ja}, ..., P_{jb}); where a = 1, 2, ..., b; b is the number of target sample files; P_j is the sample name string list corresponding to the j-th target data source; P_{ja} is the name string set by the j-th target data source for the a-th target sample file;
Step S430, splitting P_{ja} according to the preset characters corresponding to the j-th target data source to obtain a sample candidate string list set I_j = (I_{j1}, I_{j2}, ..., I_{ja}, ..., I_{jb}) corresponding to the j-th target data source; I_{ja} = (I_{ja1}, I_{ja2}, ..., I_{jag}, ..., I_{jah(ja)}); where g = 1, 2, ..., h(ja); h(ja) is the number of sample candidate strings contained in P_{ja}; I_{ja} is the sample candidate string list corresponding to P_{ja}; I_{jag} is the g-th sample candidate string contained in P_{ja};
Step S440, determining, according to I_{ja} and N_j, a sample matching degree H_a between the a-th target sample file and the target malicious file;
In step S440, determining the sample matching degree H_a between the a-th target sample file and the target malicious file according to I_{ja} and N_j comprises:
Step S441, obtaining, according to N_j, a target string statement C_j of the target malicious file corresponding to the j-th target data source;
Step S442, obtaining, according to I_{ja}, a sample string statement U_{ja} of the a-th target sample file corresponding to the j-th target data source;
Step S443, determining a semantic matching degree A_{ja} between C_j and U_{ja};
Step S444, determining, according to A_{ja} and Q_j, the sample matching degree between the a-th target sample file and the target malicious file as H_a = (∑_{j=1}^{m} (A_{ja} × Q_j)) / m.
Step S450, determining, according to H_a, at least one target similar sample file corresponding to the target malicious file from the b target sample files;
Further, in step S450, determining, according to H_a, at least one target similar sample file corresponding to the target malicious file from the b target sample files comprises:
Step S451, if H_a ≥ H_0, determining the a-th target sample file as a target similar sample file corresponding to the target malicious file; where H_0 is a preset sample matching degree threshold.
Comparing the sample matching degree with a preset sample matching degree threshold, and if the sample matching degree is greater than or equal to the preset sample matching degree threshold, determining the corresponding target sample file as a target similar sample file corresponding to the target malicious file.
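Steps S441 to S451 can be sketched as below. The text does not specify how a string statement is formed or how the semantic matching degree is computed, so two stand-ins are assumed here: the statement is the candidate strings joined by spaces, and `semantic_match` uses the character-level ratio from Python's `difflib`, which is not a true semantic model. For convenience the sample lists are indexed per sample (`I[a][j]`) rather than per source as in the text.

```python
from difflib import SequenceMatcher


def string_statement(candidates):
    """Assumed construction of C_j / U_ja: candidates joined by spaces."""
    return " ".join(candidates)


def semantic_match(c_j, u_ja):
    """Stand-in for A_ja: difflib similarity ratio in [0, 1]."""
    return SequenceMatcher(None, c_j, u_ja).ratio()


def sample_matching_degree(N, I_a, Q):
    """H_a = (sum over j of A_ja * Q_j) / m for one target sample file."""
    m = len(Q)
    total = 0.0
    for n_j, i_ja, q_j in zip(N, I_a, Q):
        a_ja = semantic_match(string_statement(n_j), string_statement(i_ja))
        total += a_ja * q_j
    return total / m


def similar_samples(N, I, Q, h0):
    """Indices a of target sample files with H_a >= H_0."""
    return [a for a, I_a in enumerate(I) if sample_matching_degree(N, I_a, Q) >= h0]


# Hypothetical data: two sources, two target sample files.
N = [["Trojan", "Win32", "Agent"], ["Win32", "Agent", "Trojan"]]
I = [
    [["Trojan", "Win32", "Agent"], ["Win32", "Agent", "Trojan"]],  # same names
    [["Adware", "Linux", "Foo"], ["Linux", "Foo", "Adware"]],      # different
]
Q = [0.5, 0.5]
H_identical = sample_matching_degree(N, I[0], Q)
matches = similar_samples(N, I, Q, h0=0.5)
```

One consequence of the formula is worth noting: when the Q_j sum to 1 and each A_ja is at most 1, H_a cannot exceed 1/m, so a threshold H_0 would have to be chosen on that scale.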
In addition, step S300 is a first embodiment of a method for determining importance levels of target data sources, in which name strings of target malicious files are split according to each target data source to obtain a plurality of target candidate strings corresponding to each target data source, and then importance levels of each target data source are determined according to the number of target candidate strings corresponding to each target data source.
In a second embodiment of the method for determining the importance degree of a target data source, Q_j may also be determined by the following steps:
Step S310, determining, according to I_j, a plurality of sample target strings from the plurality of sample candidate strings corresponding to the j-th target data source;
Further, in step S310, determining, according to I_j, a plurality of sample target strings from the plurality of sample candidate strings corresponding to the j-th target data source comprises:
Step S311, traversing I_j and determining the number of occurrences L_{jag} of I_{jag} in I_j;
Step S312, if L_{jag} ≥ L_0, determining I_{jag} as a sample target string, thereby obtaining the number M_j of sample target strings corresponding to the j-th target data source; where L_0 is a preset character threshold.
Step S320, determining the importance degree of the j-th target data source according to the plurality of sample target strings corresponding to the j-th target data source;
Step S321, obtaining the number M_j of sample target strings corresponding to the j-th target data source;
Step S322, determining the importance degree of the j-th target data source as Q_j = M_j / (∑_{j=1}^{m} M_j).
In the second embodiment of the method for determining the importance degree of a target data source, the importance degree of each target data source is determined from the number of sample target strings corresponding to that data source. Whereas the first embodiment uses the number of target candidate strings obtained from the target malicious file alone, the second embodiment uses the sample candidate strings of the same target data source across the target sample files, which increases the number of samples. Since the target sample files are history sample files that have passed transmission verification by each target data source, the importance degree determined in this way is more accurate.
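The second embodiment (steps S311 to S322) can be sketched with a frequency count. One point is an assumption rather than something the text settles: M_j is counted here as the number of distinct strings whose occurrence count reaches L_0. The function names and example data are illustrative.

```python
from collections import Counter


def sample_target_count(I_j, l0):
    """M_j: number of distinct strings occurring at least l0 times in I_j."""
    occurrences = Counter(s for i_ja in I_j for s in i_ja)
    return sum(1 for count in occurrences.values() if count >= l0)


def importance_from_samples(I, l0):
    """Q_j = M_j / sum of all M_j, computed over every data source."""
    M = [sample_target_count(I_j, l0) for I_j in I]
    total = sum(M)
    if total == 0:
        # Assumption: no string reaches the threshold, so no weighting exists.
        raise ValueError("no sample target strings reach the threshold")
    return [m_j / total for m_j in M]


# Hypothetical I_1 and I_2: per source, one candidate list per target sample.
I = [
    [["Trojan", "Win32", "Agent"], ["Trojan", "Win32", "Zbot"], ["Worm", "Win32", "Agent"]],
    [["Win32", "Agent"], ["Win32", "Zbot"], ["Linux", "Agent"]],
]
Q2 = importance_from_samples(I, l0=2)
```

With L_0 = 2 the first source has three recurring strings and the second has two, giving weights of 0.6 and 0.4.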
Accordingly, the first and second embodiments can be combined into a third embodiment, in which the importance degree is the sum of the importance degree obtained by the first embodiment and the importance degree obtained by the second embodiment, so as to further improve the accuracy of the determined importance degree.
In addition, step S440 is the first embodiment of the method for determining the sample matching degree $H_a$. In the first embodiment, semantic matching is performed between the target string sentence and the sample string sentence to obtain the corresponding semantic matching degree, each semantic matching degree is multiplied by the importance degree of the corresponding target data source, and the products for all target data sources of the same target sample file are averaged to obtain the sample matching degree. This approach suits the case where there are enough sample candidate strings to determine string sentences. When the number of sample candidate strings or target candidate strings is too small to determine a string sentence, or the determined string sentence is too short, the resulting semantic matching degree is inaccurate. To solve this problem, a second embodiment of the method for determining the sample matching degree $H_a$ is provided, as shown in steps S500 to S510.
The second embodiment of the method for determining the sample matching degree $H_a$ is as follows:
Step S500: determine a name matching degree list set $E = (E_1, E_2, \ldots, E_a, \ldots, E_b)$ according to $I_{ja}$ and $N_j$, with $E_a = (E_{a1}, E_{a2}, \ldots, E_{aj}, \ldots, E_{am})$, where $E_a$ is the name matching degree list between the a-th target sample file and the target malicious file, and $E_{aj}$ is the name matching degree between $P_{ja}$ and $Z_j$.
$E_{aj}$ is determined by the following steps:
Step S501: take the intersection of $I_{ja}$ and $N_j$ to obtain the $K_{ja}$ matching candidate strings between $P_{ja}$ and $Z_j$.
Step S502: determine $K_{ja}$ as the name matching degree $E_{aj}$ between $P_{ja}$ and $Z_j$.
Step S510: according to $E_{aj}$ and $Q_j$, determine the sample matching degree between the a-th target sample file and the target malicious file as $H_a = \sum_{j=1}^{m} (E_{aj} \times Q_j)$.
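Steps S500 to S510 can be sketched as follows. This is an illustrative sketch rather than the implementation of the embodiment: the function names and example data are hypothetical, and the intersection is computed on sets, so a candidate string that appears several times in one name is counted once.

```python
def name_matching_degree(sample_candidates, target_candidates):
    """E_aj = K_ja: the number of candidate strings shared by I_ja and N_j."""
    return len(set(sample_candidates) & set(target_candidates))


def sample_matching_degree(sample_lists, target_lists, importance):
    """H_a = sum over j of E_aj * Q_j for one target sample file.

    sample_lists[j] is I_ja, target_lists[j] is N_j, importance[j] is Q_j.
    """
    return sum(
        name_matching_degree(i_ja, n_j) * q_j
        for i_ja, n_j, q_j in zip(sample_lists, target_lists, importance)
    )


# Hypothetical data for m = 2 data sources.
example_N = [["Trojan", "Win32", "Agent"], ["Trojan", "Generic"]]
example_I_a = [["Trojan", "Win32", "Emotet"], ["Worm", "Generic"]]
example_Q = [0.5, 0.5]
example_H_a = sample_matching_degree(example_I_a, example_N, example_Q)
```

Here the first data source shares two candidate strings with the target and the second shares one, so $H_a = 2 \times 0.5 + 1 \times 0.5 = 1.5$.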
Further, the method for determining the malicious detection rule through the target malicious file and the target similar sample file is as follows:
Step S600: sort the target similar sample files in descending order of their sample matching degrees to obtain a sorted similar sample file list $T_1, T_2, \ldots, T_n, \ldots, T_q$, where $n = 1, 2, \ldots, q$; q is the number of target similar sample files; and $T_n$ is the n-th target similar sample file after sorting by sample matching degree.
The target similar sample files are sorted by sample matching degree to obtain the sorted similar sample file list; the later a file appears in the list, the lower its similarity to the target malicious file.
Step S610: let n = 1.
Step S611: if $n \leq q$, obtain a candidate detection rule from the file features included in $T_1, \ldots, T_n$ of the sorted similar sample file list and the file features included in the target malicious file.
Step S612: perform malicious detection on $T_{n+1}, \ldots, T_q$ according to the candidate detection rule to obtain the corresponding q-n malicious detection results.
Step S613: if every malicious detection result indicates that the corresponding target similar sample file is a malicious file, determine the candidate detection rule as the initial detection rule; otherwise, let n = n+1 and return to step S611.
To further reduce the amount of data processed, candidate detection rules are determined by taking the file features of the target similar sample files in descending order of sample matching degree, together with the file features of the target malicious file. Each candidate detection rule is then verified by performing malicious detection on $T_{n+1}, \ldots, T_q$. Because the target similar sample files are similar sample files of the target malicious file, they are themselves malicious files. If every malicious detection result indicates that the corresponding target similar sample file is a malicious file, the candidate detection rule passes verification and is determined as the initial detection rule. Otherwise, the file features of the target similar sample file with the next-highest sample matching degree are added, a new candidate detection rule is determined and verified, and this continues until verification passes or the file features of all target similar sample files have been used.
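The loop of steps S610 to S613 can be sketched in Python as follows. The specification does not say how a rule is derived from file features or how detection is performed, so `derive_rule` and `is_malicious` are hypothetical callables supplied by the caller. The toy example models a file as a set of feature names and a rule as the features common to the chosen files, which is purely an assumption for illustration.

```python
def find_initial_rule(target, ranked_similar, derive_rule, is_malicious):
    """Return the first candidate rule that flags all remaining similar files.

    ranked_similar is T_1..T_q, already sorted by descending matching degree.
    derive_rule(target, files) builds a candidate rule (hypothetical helper).
    is_malicious(rule, file) applies the rule to one file (hypothetical helper).
    """
    q = len(ranked_similar)
    n = 1  # step S610
    while n <= q:  # step S611
        rule = derive_rule(target, ranked_similar[:n])
        # Step S612: test the rule on T_{n+1}..T_q (empty when n == q).
        if all(is_malicious(rule, f) for f in ranked_similar[n:]):
            return rule  # step S613: verification passed
        n += 1
    return None  # no similar files were supplied


# Toy model: files are feature sets, a rule is the set of shared features.
def shared_features(target, files):
    return frozenset(target).intersection(*files)


def contains_rule(rule, file):
    return bool(rule) and rule <= set(file)


example_target = {"a", "b", "c"}
example_ranked = [{"a", "b", "c", "d"}, {"a", "b"}, {"a", "b", "e"}]
example_rule = find_initial_rule(
    example_target, example_ranked, shared_features, contains_rule
)
```

With n = 1 the candidate rule {a, b, c} fails on the second file, so the loop advances to n = 2, where the rule {a, b} is accepted because the remaining file contains both features.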
Step S620, carrying out malicious detection on a plurality of preset verification sample files according to the initial detection rules to obtain detection accuracy corresponding to the initial detection rules;
the verification sample file is a sample file for rule verification.
Step S630: if the detection accuracy is smaller than a preset detection accuracy threshold, acquire a supplementary sample file.
If the detection accuracy is smaller than the preset detection accuracy threshold, the accuracy of the initial detection rule is too low, so the initial detection rule is redetermined after a supplementary sample file is acquired.
Further, in step S630, the method for acquiring the supplementary sample file includes:
Step S631: acquire the determination time t of the initial detection rule.
Step S632: obtain, in order of receipt, the files to be detected that were received from t to the current time, yielding a set of files to be detected $Y = (Y_1, Y_2, \ldots, Y_k, \ldots, Y_u)$, where $k = 1, 2, \ldots, u$; u is the number of files to be detected received from t to the current time; and $Y_k$ is the k-th file to be detected received from t to the current time.
Step S633: let k = 1.
Step S634: if $k \leq u$, perform similarity detection on the file features included in $Y_k$ according to the initial detection rule to obtain the corresponding similarity detection result.
Step S635: if the similarity detection result indicates that $Y_k$ is a similar file, determine $Y_k$ as the supplementary sample file; otherwise, let k = k+1 and return to step S634.
A detection accuracy below the preset detection accuracy threshold may be caused by historical sample files that were acquired too early and therefore have little reference value. For this reason, the first similar file received after the determination time of the initial detection rule is selected as the supplementary sample file.
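The selection in steps S631 to S635 amounts to a linear scan for the first similar file received at or after time t. The sketch below assumes that each received file is stored with a comparable timestamp and that similarity detection is available as a callable; both `received` and `is_similar` are hypothetical names.

```python
def find_supplementary_sample(received, rule_time, is_similar):
    """Return the first file received at or after rule_time that is similar.

    received is an iterable of (timestamp, file) pairs in any order.
    is_similar(file) stands in for similarity detection under the
    initial detection rule (hypothetical helper).
    """
    for received_at, file in sorted(received, key=lambda item: item[0]):
        if received_at >= rule_time and is_similar(file):
            return file  # step S635: Y_k becomes the supplementary sample
    return None  # no similar file has been received since rule_time


# Hypothetical data: integer timestamps and file names.
example_received = [(5, "old.bin"), (12, "clean.bin"), (15, "variant.bin"), (20, "late.bin")]
example_similar = {"old.bin", "variant.bin", "late.bin"}
example_supplement = find_supplementary_sample(
    example_received, rule_time=10, is_similar=lambda f: f in example_similar
)
```

In this example `old.bin` is skipped because it predates the rule, `clean.bin` is not similar, and `variant.bin` is returned as the supplementary sample file.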
Step S640, redetermining an initial detection rule according to the supplementary sample file, the target malicious file and the plurality of target similar sample files, and determining the initial detection rule as a malicious detection rule if the detection accuracy corresponding to the initial detection rule is greater than or equal to a preset detection accuracy threshold.
The initial detection rule is redetermined from the supplementary sample file, the target malicious file and the plurality of target similar sample files, and is then verified against the verification sample files. If the corresponding detection accuracy is still smaller than the preset detection accuracy threshold, a new supplementary sample file is acquired and the initial detection rule is redetermined again. This repeats until the detection accuracy corresponding to the initial detection rule is greater than or equal to the preset detection accuracy threshold, which indicates that the initial detection rule meets the verification requirement, and the initial detection rule is then determined as the malicious detection rule.
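The repeat-until-accurate behaviour of steps S620 to S640 can be sketched as a bounded loop. All four callables are hypothetical placeholders for operations the specification leaves abstract, and the `max_rounds` limit is an added safeguard that is not part of the described method.

```python
def refine_rule(initial_rule, accuracy_of, next_supplement, rebuild_rule,
                threshold, max_rounds=10):
    """Redetermine the rule until its accuracy reaches the threshold.

    accuracy_of(rule)        -> accuracy on the verification sample files
    next_supplement(rule)    -> a supplementary sample file, or None
    rebuild_rule(supplements) -> a new initial rule using the supplements
    max_rounds is an assumption added here to guarantee termination.
    """
    rule = initial_rule
    supplements = []
    for _ in range(max_rounds):
        if accuracy_of(rule) >= threshold:
            return rule  # accepted as the malicious detection rule
        extra = next_supplement(rule)
        if extra is None:
            return None  # nothing left to supplement with
        supplements.append(extra)
        rule = rebuild_rule(supplements)
    return rule if accuracy_of(rule) >= threshold else None


# Toy model: a rule is an integer whose accuracy is rule / 10.
example_final = refine_rule(
    initial_rule=5,
    accuracy_of=lambda r: r / 10,
    next_supplement=lambda r: "supplement",
    rebuild_rule=lambda sup: 5 + len(sup),
    threshold=0.8,
)
```

In the toy model each supplementary file raises the accuracy by 0.1, so three rounds of supplementation are needed before the rule is accepted.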
In the present invention, upon receiving a target malicious file, the name string set by each target data source for the target malicious file is obtained. Each name string is split according to the preset characters corresponding to its target data source to obtain a plurality of target candidate strings, and the importance degree of each target data source is determined from the number of its target candidate strings. The target similar sample files corresponding to the target malicious file are then determined from the plurality of target sample files according to the importance degree of each target data source. In other words, the number of strings obtained by splitting each name string indicates how many strings the target data source uses for file feature analysis; this number determines the importance degree, and the importance degrees in turn determine the similar sample files.
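The splitting and counting described above can be sketched as follows, using the ratio $Q_j = f(j) / \sum_{j=1}^{m} f(j)$ defined for the importance degree. The detection names and delimiter sets in the example are hypothetical; the specification says only that each target data source has its own preset characters.

```python
import re


def split_name(name, delimiters):
    """Split one name string Z_j on any of the source's preset characters."""
    if not delimiters:
        return [name] if name else []
    pattern = "[" + re.escape(delimiters) + "]"
    # Empty pieces (for example from adjacent delimiters) are discarded.
    return [part for part in re.split(pattern, name) if part]


def importance_from_target(names, delimiters_per_source):
    """Return [Q_1, ..., Q_m] with Q_j = f(j) / (f(1) + ... + f(m))."""
    counts = [len(split_name(n, d)) for n, d in zip(names, delimiters_per_source)]
    total = sum(counts)
    if total == 0:
        # Case not covered by the text; zeros avoid a division by zero.
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Hypothetical names assigned to one file by m = 2 data sources.
example_names = ["Trojan/Win32.Agent.abc", "Trojan:Generic"]
example_delims = ["/.", ":"]
example_Q = importance_from_target(example_names, example_delims)
```

The first name splits into four candidate strings and the second into two, so the first data source receives two thirds of the total importance.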
A sample determination apparatus 100 based on the importance of a data source, as shown in fig. 2, includes:
The target name string obtaining module 110 is configured to obtain, in response to receiving a target malicious file, the name string set by each target data source for the target malicious file, so as to obtain a target name string list $Z = (Z_1, Z_2, \ldots, Z_j, \ldots, Z_m)$, where $j = 1, 2, \ldots, m$; m is the number of target data sources; and $Z_j$ is the name string set by the j-th target data source for the target malicious file;
The target candidate string determining module 120 is configured to split $Z_j$ according to the preset character corresponding to the j-th target data source, so as to obtain a target candidate string list set $N = (N_1, N_2, \ldots, N_j, \ldots, N_m)$, with $N_j = (N_{j1}, N_{j2}, \ldots, N_{jc}, \ldots, N_{jf(j)})$, where $c = 1, 2, \ldots, f(j)$; f(j) is the number of target candidate strings contained in $Z_j$; $N_j$ is the target candidate string list corresponding to $Z_j$; and $N_{jc}$ is the c-th target candidate string contained in $Z_j$;
The importance determining module 130 is configured to determine the importance degree of each target data source according to the target candidate string list set N, so as to obtain an importance degree set $Q = (Q_1, Q_2, \ldots, Q_j, \ldots, Q_m)$, where $Q_j$ is the importance degree of the j-th target data source and $Q_j = f(j) / \sum_{j=1}^{m} f(j)$;
The similarity sample determining module 140 is configured to determine at least one target similarity sample file corresponding to the target malicious file according to the importance level of each target data source.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention as described in the specification, when said program product is run on the electronic device.
Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a mobile terminal, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, any of which may be referred to herein as a "circuit," "module," or "system."
An electronic device according to this embodiment of the invention is described below. The electronic device is merely an example, and should not impose any limitations on the functionality and scope of use of embodiments of the present invention.
The electronic device is in the form of a general purpose computing device. Components of the electronic device may include, but are not limited to: at least one processor, at least one memory, and a bus connecting the various system components, including the memory and the processor.
Wherein the memory stores program code that is executable by the processor to cause the processor to perform steps according to various exemplary embodiments of the invention described in the "exemplary methods" section of this specification.
The storage may include readable media in the form of volatile storage, such as Random Access Memory (RAM) and/or cache memory, and may further include Read Only Memory (ROM).
The storage may also include a program/utility having a set (at least one) of program modules including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The bus may be one or more of several types of bus structures including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
The electronic device may also communicate with one or more external devices (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device, and/or with any device (e.g., router, modem, etc.) that enables the electronic device to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface. And, the electronic device may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through a network adapter. As shown, the network adapter communicates with other modules of the electronic device over a bus. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with an electronic device, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium having stored thereon a program product capable of implementing the method described above in the present specification is also provided. In some possible embodiments, the various aspects of the invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the invention as described in the "exemplary methods" section of this specification, when said program product is run on the terminal device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
Furthermore, the above-described drawings are only schematic illustrations of processes included in the method according to the exemplary embodiment of the present invention, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (8)

1. A method for determining a sample based on importance of a data source, the method comprising the steps of:
in response to receiving a target malicious file, acquiring the name string set by each target data source for the target malicious file to obtain a target name string list $Z = (Z_1, Z_2, \ldots, Z_j, \ldots, Z_m)$; where $j = 1, 2, \ldots, m$; m is the number of target data sources; $Z_j$ is the name string set by the j-th target data source for the target malicious file;
splitting $Z_j$ according to the preset character corresponding to the j-th target data source to obtain a target candidate string list set $N = (N_1, N_2, \ldots, N_j, \ldots, N_m)$; $N_j = (N_{j1}, N_{j2}, \ldots, N_{jc}, \ldots, N_{jf(j)})$; where $c = 1, 2, \ldots, f(j)$; f(j) is the number of target candidate strings contained in $Z_j$; $N_j$ is the target candidate string list corresponding to $Z_j$; $N_{jc}$ is the c-th target candidate string contained in $Z_j$;
determining the importance degree of each target data source according to the target candidate string list set N to obtain an importance degree set $Q = (Q_1, Q_2, \ldots, Q_j, \ldots, Q_m)$; where $Q_j$ is the importance degree of the j-th target data source; $Q_j = f(j) / \sum_{j=1}^{m} f(j)$;
Determining at least one target similar sample file corresponding to the target malicious file according to the importance degree of each target data source;
the determining at least one target similar sample file corresponding to the target malicious file according to the importance degree of each target data source includes:
determining a plurality of target sample files from a plurality of history sample files according to the target name character string list Z;
acquiring the name string set by each target data source for each target sample file to obtain a sample name string list set $P = (P_1, P_2, \ldots, P_j, \ldots, P_m)$; $P_j = (P_{j1}, P_{j2}, \ldots, P_{ja}, \ldots, P_{jb})$; where $a = 1, 2, \ldots, b$; b is the number of target sample files; $P_j$ is the sample name string list corresponding to the j-th target data source; $P_{ja}$ is the name string set by the j-th target data source for the a-th target sample file;
splitting $P_{ja}$ according to the preset character corresponding to the j-th target data source to obtain a sample candidate string list set $I_j = (I_{j1}, I_{j2}, \ldots, I_{ja}, \ldots, I_{jb})$ corresponding to the j-th target data source; $I_{ja} = (I_{ja1}, I_{ja2}, \ldots, I_{jag}, \ldots, I_{jah(ja)})$; where $g = 1, 2, \ldots, h(ja)$; h(ja) is the number of sample candidate strings contained in $P_{ja}$; $I_{ja}$ is the sample candidate string list corresponding to $P_{ja}$; $I_{jag}$ is the g-th sample candidate string contained in $P_{ja}$;
determining, according to $I_{ja}$ and $N_j$, a sample matching degree $H_a$ between the a-th target sample file and the target malicious file;
determining, according to $H_a$, at least one target similar sample file corresponding to the target malicious file from the b target sample files;
wherein, determining a plurality of target sample files from a plurality of history sample files includes:
acquiring the name strings corresponding to s history sample files to obtain a history name string list $D = (D_1, D_2, \ldots, D_w, \ldots, D_s)$; where $w = 1, 2, \ldots, s$; $D_w$ is the name string corresponding to the w-th history sample file;
splitting $D_w$ to obtain the number of strings $B_w$ corresponding to $D_w$;
if $\mathrm{MIN}(f(1), f(2), \ldots, f(j), \ldots, f(m)) \leq B_w \leq \mathrm{MAX}(f(1), f(2), \ldots, f(j), \ldots, f(m))$, determining the w-th history sample file as a target sample file; where MIN() is a preset minimum value determination function, and MAX() is a preset maximum value determination function.
2. The method according to claim 1, wherein determining, according to $I_{ja}$ and $N_j$, the sample matching degree $H_a$ between the a-th target sample file and the target malicious file comprises:
obtaining, according to $N_j$, a target string sentence $C_j$ of the target malicious file corresponding to the j-th target data source;
obtaining, according to $I_{ja}$, a sample string sentence $U_{ja}$ of the a-th target sample file corresponding to the j-th target data source;
determining the semantic matching degree $A_{ja}$ between $C_j$ and $U_{ja}$;
determining, according to $A_{ja}$ and $Q_j$, the sample matching degree $H_a$ between the a-th target sample file and the target malicious file.
3. The method according to claim 2, wherein determining, according to $A_{ja}$ and $Q_j$, the sample matching degree $H_a$ between the a-th target sample file and the target malicious file comprises:
determining the sample matching degree between the a-th target sample file and the target malicious file according to $H_a = \left(\sum_{j=1}^{m} (A_{ja} \times Q_j)\right) / m$.
4. The method according to claim 1, wherein determining, according to $H_a$, at least one target similar sample file corresponding to the target malicious file from the b target sample files comprises:
if $H_a \geq H_0$, determining the a-th target sample file as a target similar sample file corresponding to the target malicious file; where $H_0$ is a preset sample matching degree threshold.
5. The method of claim 1, wherein $Q_j$ is also determined by the following steps:
traversing $I_j$ and determining the number of occurrences $L_{jag}$ of $I_{jag}$ in $I_j$;
if $L_{jag} \geq L_0$, determining $I_{jag}$ as a sample target string to obtain the number $M_j$ of sample target strings corresponding to the j-th target data source; where $L_0$ is a preset character threshold;
determining the importance degree of the j-th target data source as $Q_j = M_j / \sum_{j=1}^{m} M_j$.
6. A sample determination device based on importance of a data source, comprising:
a target name string acquisition module, configured to acquire, in response to receiving a target malicious file, the name string set by each target data source for the target malicious file, so as to obtain a target name string list $Z = (Z_1, Z_2, \ldots, Z_j, \ldots, Z_m)$; where $j = 1, 2, \ldots, m$; m is the number of target data sources; $Z_j$ is the name string set by the j-th target data source for the target malicious file;
a target candidate string determining module, configured to split $Z_j$ according to the preset character corresponding to the j-th target data source, so as to obtain a target candidate string list set $N = (N_1, N_2, \ldots, N_j, \ldots, N_m)$; $N_j = (N_{j1}, N_{j2}, \ldots, N_{jc}, \ldots, N_{jf(j)})$; where $c = 1, 2, \ldots, f(j)$; f(j) is the number of target candidate strings contained in $Z_j$; $N_j$ is the target candidate string list corresponding to $Z_j$; $N_{jc}$ is the c-th target candidate string contained in $Z_j$;
an importance degree determining module, configured to determine the importance degree of each target data source according to the target candidate string list set N, so as to obtain an importance degree set $Q = (Q_1, Q_2, \ldots, Q_j, \ldots, Q_m)$; where $Q_j$ is the importance degree of the j-th target data source; $Q_j = f(j) / \sum_{j=1}^{m} f(j)$;
The similarity sample determining module is used for determining at least one target similarity sample file corresponding to the target malicious file according to the importance degree of each target data source;
wherein determining at least one target similar sample file corresponding to the target malicious file according to the importance degree of each target data source comprises:
determining a plurality of target sample files from a plurality of history sample files according to the target name string list Z;
acquiring the name string set by each target data source for each target sample file to obtain a sample name string list set $P = (P_1, P_2, \ldots, P_j, \ldots, P_m)$; $P_j = (P_{j1}, P_{j2}, \ldots, P_{ja}, \ldots, P_{jb})$; where $a = 1, 2, \ldots, b$; b is the number of target sample files; $P_j$ is the sample name string list corresponding to the j-th target data source; $P_{ja}$ is the name string set by the j-th target data source for the a-th target sample file;
splitting $P_{ja}$ according to the preset character corresponding to the j-th target data source to obtain a sample candidate string list set $I_j = (I_{j1}, I_{j2}, \ldots, I_{ja}, \ldots, I_{jb})$ corresponding to the j-th target data source; $I_{ja} = (I_{ja1}, I_{ja2}, \ldots, I_{jag}, \ldots, I_{jah(ja)})$; where $g = 1, 2, \ldots, h(ja)$; h(ja) is the number of sample candidate strings contained in $P_{ja}$; $I_{ja}$ is the sample candidate string list corresponding to $P_{ja}$; $I_{jag}$ is the g-th sample candidate string contained in $P_{ja}$;
determining, according to $I_{ja}$ and $N_j$, a sample matching degree $H_a$ between the a-th target sample file and the target malicious file;
determining, according to $H_a$, at least one target similar sample file corresponding to the target malicious file from the b target sample files;
wherein, confirm a plurality of target sample files from a plurality of history sample files, include:
acquiring name strings corresponding to the s history sample files to obtain a history name string list D= (D) 1 ,D 2 ,...,D w ,...,D s ) The method comprises the steps of carrying out a first treatment on the surface of the Wherein w=1, 2,..s; d (D) w The name character string corresponding to the w-th history sample file;
splitting $D_w$ to obtain the number $B_w$ of character strings corresponding to $D_w$;
if $\mathrm{MIN}(f(1), f(2), \ldots, f(j), \ldots, f(m)) \le B_w \le \mathrm{MAX}(f(1), f(2), \ldots, f(j), \ldots, f(m))$, determining the w-th history sample file as a target sample file; wherein MIN() is a preset minimum value determination function, and MAX() is a preset maximum value determination function.
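The splitting and filtering steps of this claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the delimiter strings, and the example names are hypothetical. In particular, the sketch assumes that f(j) (defined earlier in the claims, outside this passage) is a count of candidate strings for the j-th target data source, and that history names are split on a caller-supplied set of delimiter characters, which the claim does not specify.

```python
from typing import List, Sequence


def split_name(name: str, delimiters: str) -> List[str]:
    """Split a name string on each delimiter character, dropping empty pieces."""
    pieces = [name]
    for delimiter in delimiters:
        pieces = [part for piece in pieces for part in piece.split(delimiter)]
    return [piece for piece in pieces if piece]


def split_by_source(
    sample_names: Sequence[Sequence[str]],
    source_delimiters: Sequence[str],
) -> List[List[List[str]]]:
    """Build I[j][a]: the candidate strings of P[j][a], split with the
    preset character(s) of the j-th target data source."""
    return [
        [split_name(name, delimiters) for name in names]
        for names, delimiters in zip(sample_names, source_delimiters)
    ]


def select_target_samples(
    history_names: Sequence[str],
    target_counts: Sequence[int],
    history_delimiters: str,
) -> List[int]:
    """Return indices w whose piece count B_w satisfies
    MIN(f(1..m)) <= B_w <= MAX(f(1..m)).

    target_counts stands in for f(1), ..., f(m) (an assumption, see lead-in).
    """
    lower, upper = min(target_counts), max(target_counts)
    selected = []
    for w, name in enumerate(history_names):
        b_w = len(split_name(name, history_delimiters))
        if lower <= b_w <= upper:
            selected.append(w)
    return selected


if __name__ == "__main__":
    # Hypothetical history names and counts, for illustration only.
    history = ["Trojan/Win32.Agent.abc", "Malware", "Worm:Win32/Conficker.B!gen.x.y"]
    print(select_target_samples(history, target_counts=[3, 4], history_delimiters="/.:!"))
```

Filtering on the piece count before computing any matching degree keeps only history samples whose names have a comparable number of components to the target names, which reduces the number of candidates that the later matching step has to score.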
7. A non-transitory computer readable storage medium having stored therein at least one instruction or at least one program, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the method of any one of claims 1-5.
8. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 7.
CN202311254330.9A 2023-09-27 2023-09-27 Sample determination method, device, equipment and medium based on importance degree of data source Active CN116992448B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311254330.9A CN116992448B (en) 2023-09-27 2023-09-27 Sample determination method, device, equipment and medium based on importance degree of data source

Publications (2)

Publication Number Publication Date
CN116992448A (en) 2023-11-03
CN116992448B (en) 2023-12-15

Family

ID=88530597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311254330.9A Active CN116992448B (en) 2023-09-27 2023-09-27 Sample determination method, device, equipment and medium based on importance degree of data source

Country Status (1)

Country Link
CN (1) CN116992448B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110610084A (en) * 2018-06-15 2019-12-24 武汉安天信息技术有限责任公司 Dex file-based sample maliciousness determination method and related device
CN114637990A (en) * 2020-12-15 2022-06-17 网神信息技术(北京)股份有限公司 File malice degree evaluation method and device, electronic equipment and medium
CN114925757A (en) * 2022-05-09 2022-08-19 中国电信股份有限公司 Multi-source threat intelligence fusion method, device, equipment and storage medium
CN116303290A (en) * 2023-05-16 2023-06-23 北京安天网络安全技术有限公司 Office document detection method, device, equipment and medium
CN116522338A (en) * 2023-04-18 2023-08-01 深圳市深信服信息安全有限公司 File processing method, equipment and computer readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12013941B2 (en) * 2018-06-28 2024-06-18 Crowdstrike, Inc. Analysis of malware
US20230195896A1 (en) * 2021-12-21 2023-06-22 Palo Alto Networks, Inc. Identification of .net malware with "unmanaged imphash"

Also Published As

Publication number Publication date
CN116992448A (en) 2023-11-03

Similar Documents

Publication Publication Date Title
US8631498B1 (en) Techniques for identifying potential malware domain names
US20230326466A1 (en) Text processing method and apparatus, electronic device, and medium
CN110363220B (en) Behavior class detection method and device, electronic equipment and computer readable medium
CN111079408B (en) Language identification method, device, equipment and storage medium
CN109858045B (en) Machine translation method and device
CN115221516B (en) Malicious application program identification method and device, storage medium and electronic equipment
CN108595412B (en) Error correction processing method and device, computer equipment and readable medium
US12020697B2 (en) Systems and methods for fast filtering of audio keyword search
CN116992448B (en) Sample determination method, device, equipment and medium based on importance degree of data source
US11557288B2 (en) Hindrance speech portion detection using time stamps
US10909316B2 (en) Technique for automatically splitting words
CN115858776B (en) Variant text classification recognition method, system, storage medium and electronic equipment
CN116992450B (en) File detection rule determining method and device, electronic equipment and storage medium
CN116992449B (en) Method and device for determining similar sample files, electronic equipment and storage medium
CN112307183B (en) Search data identification method, apparatus, electronic device and computer storage medium
CN113778902A (en) Method and device for detecting coverage of test case
CN117034275B (en) Malicious file detection method, device and medium based on Yara engine
CN117009961B (en) Method, device, equipment and medium for determining behavior detection rule
US11769487B2 (en) Systems and methods for voice topic spotting
CN116305172B (en) OneNote document detection method, oneNote document detection device, oneNote document detection medium and OneNote document detection equipment
CN114238976B (en) File detection method and device, readable medium and electronic equipment
CN116910756A (en) Detection method for malicious PE (polyethylene) files
CN109918293B (en) System test method and device, electronic equipment and computer readable storage medium
CN109524026B (en) Method and device for determining prompt tone, storage medium and electronic device
CN110647519A (en) Method and device for predicting missing attribute value in test sample

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant