FR3035984A1

FR3035984A1 - METHOD FOR DETECTING MALICIOUS SOFTWARE

Info

Publication number: FR3035984A1
Application number: FR1553992A
Authority: FR
Inventors: Patrick Ragaru
Original assignee: Lexsi
Current assignee: Lexsi
Priority date: 2015-05-04
Filing date: 2015-05-04
Publication date: 2016-11-11
Anticipated expiration: 2035-05-04
Also published as: FR3035984B1

Abstract

The invention relates to a method for analyzing a program, characterized in that it comprises the following steps: capturing (E0) a first state of the RAM of a virtual machine in which an operating system is installed; launching (E2) the execution of said program in said virtual machine; after a predetermined duration (E3), capturing (E4) a second state of the RAM of said virtual machine; comparative analysis (E6) of the first and second captured states; and identifying (E8) at least one compromise marker of said program, based on the result of the comparative analysis.

Description

METHOD FOR DETECTING MALICIOUS SOFTWARE

FIELD OF THE INVENTION

The invention lies in the field of computer security. It applies, in particular, to analyzing the behavior of computer programs in order to detect possible malicious behavior.

BACKGROUND OF THE INVENTION

Various tools currently exist for analyzing malicious programs (malware), notably those present on external storage media.

Moreover, when a given program runs on a machine, the RAM (or volatile memory) used by the operating system installed on that machine is modified. In this context, analyzing that RAM can help determine whether the executed program is malicious.

Such an analysis is generally carried out in a specific test environment called a virtual machine (or sandbox), in which the operating system has been installed beforehand. This environment emulates the various components of a real computer (processor, RAM, hard disk, network card, etc.) in order to reproduce as closely as possible the context in which the program runs. It allows the program to be run in a controlled setting, without endangering the security of the host computer.

Determining whether a program is malicious generally requires instrumenting the virtual machine in which it runs, to allow in situ analysis of the contents of the virtual RAM. However, such instrumentation is increasingly well detected by malicious programs, which adapt their behavior accordingly.

Indeed, malware is increasingly stealthy, and some of it even manages to behave in a way that is completely transparent to the operating system, for example by using undocumented functions or by directly modifying the RAM used by the operating system in order to alter certain data structures that could reveal its presence. Certain hidden processes (rootkits) can thus act maliciously on the operating system, without its knowledge and also without the knowledge of the programs intended to analyze their behavior.

Given the growing number of malicious programs and the refinement of their detection techniques, there is a need for a suitable analysis method that makes it possible, in particular, to identify stealthy malicious programs (rootkits).

SUMMARY OF THE INVENTION

The object of the present invention is thus to overcome at least one of these drawbacks.

In this context, a first aspect of the invention relates to a method for analyzing a program, characterized in that it comprises the following steps:
- capturing a first state of the RAM of a virtual machine in which an operating system is installed;
- launching the execution of said program in said virtual machine;
- after a predetermined duration, capturing a second state of the RAM of said virtual machine;
- comparative analysis of the first and second captured states; and
- identifying at least one compromise marker of said program, based on the result of the comparative analysis.

The invention thus makes it possible to analyze a program in order to determine whether it is malicious, without endangering the security of the host computer on which it runs and without the program noticing that it is being analyzed.

Indeed, the program is executed in a virtual machine that emulates the usual attributes of a computer while isolating the program from the components essential to the operation of the host.

In addition, the state captures (or images) of the virtual machine's RAM provide information on the state of that RAM at a given instant, without any in situ intervention. The behavior of the program and the effects of its execution are therefore not altered compared with an ordinary execution context (for example an unsecured one), since the program perceives no particular intervention when the state of the RAM is captured.

Comparing these captures thus reveals the actual behavior and effects of the program, so that characteristic compromise markers can be spotted and a conclusion drawn as to whether the program is malicious.

Other features of the method according to embodiments of the invention are described in the dependent claims.
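The five steps of the method can be sketched as a small driver function. This is a minimal illustration under stated assumptions, not the applicant's implementation: `capture_ram`, `launch` and `compare` are hypothetical callables supplied by the caller (for example, wrappers around a hypervisor's snapshot feature), and the 600-second default merely mirrors the 10-minute example duration mentioned in the description.

```python
import time
from typing import Callable, List


def analyze_program(
    capture_ram: Callable[[], bytes],              # hypothetical: dumps the VM's RAM from outside
    launch: Callable[[], None],                    # hypothetical: starts the suspect program in the VM
    compare: Callable[[bytes, bytes], List[str]],  # hypothetical: differential analysis
    wait_seconds: float = 600.0,                   # assumption: 10 minutes, as in the example
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run steps E0 to E8 and return the compromise markers that were found."""
    s1 = capture_ram()       # E0: first RAM state, before execution
    launch()                 # E2: launch the suspect program
    sleep(wait_seconds)      # E3: predetermined waiting period
    s2 = capture_ram()       # E4: second RAM state
    return compare(s1, s2)   # E6 and E8: comparison yields the markers


# Usage with stand-in callables (no real virtual machine involved).
states = iter([b"clean", b"clean+hook"])
markers = analyze_program(
    capture_ram=lambda: next(states),
    launch=lambda: None,
    compare=lambda a, b: ["memory changed"] if a != b else [],
    wait_seconds=0,
)
print(markers)
```

Passing the three operations in as callables keeps the sketch independent of any particular virtualization product.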

In particular embodiments, the comparative analysis of the first and second captured states comprises the following steps:
- traversing, within said captures, data structures from a list of predetermined data structures; and
- evaluating at least one compromise criterion based on a comparison of said data structures between the first and second captured states.

Advantageously, the comparative analysis does not cover all of the system's data structures but focuses on structures known to be sensitive, which limits the duration of the analysis.
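The idea of restricting the comparison to a predetermined list of structure kinds, each with its own criterion, can be sketched as follows. Everything here is illustrative: the snapshot representation (a dictionary of identifier sets), the `appeared` and `vanished` criteria, and the contents of `CRITERIA` are assumptions made for the example, not criteria disclosed in the source.

```python
from typing import Callable, Dict, List, Set

# Assumed representation: structure kind -> identifiers found in one capture.
Snapshot = Dict[str, Set[str]]
Criterion = Callable[[Set[str], Set[str]], List[str]]


def appeared(label: str) -> Criterion:
    """Criterion flagging identifiers present in S2 but absent from S1."""
    def check(before: Set[str], after: Set[str]) -> List[str]:
        return [f"{label} appeared: {name}" for name in sorted(after - before)]
    return check


def vanished(label: str) -> Criterion:
    """Criterion flagging identifiers present in S1 but absent from S2."""
    def check(before: Set[str], after: Set[str]) -> List[str]:
        return [f"{label} vanished: {name}" for name in sorted(before - after)]
    return check


# Predetermined list: only these kinds are examined, each with its own criteria.
CRITERIA: Dict[str, List[Criterion]] = {
    "processes": [appeared("process"), vanished("process")],
    "drivers": [appeared("driver")],
}


def evaluate(s1: Snapshot, s2: Snapshot) -> List[str]:
    """Evaluate every criterion on the listed kinds and ignore all other kinds."""
    markers: List[str] = []
    for kind, checks in CRITERIA.items():
        before, after = s1.get(kind, set()), s2.get(kind, set())
        for check in checks:
            markers.extend(check(before, after))
    return markers
```

Kinds that are absent from `CRITERIA` are never visited, which is what keeps the analysis time bounded in this sketch.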

In particular embodiments, the method also comprises displaying, depending on said at least one identified compromise marker, a graph linking the traversed data structures that satisfy the evaluated compromise criterion.

Displaying such a graph advantageously extracts the essential information about the malicious aspect of the analyzed program.

For example, the predetermined data structures in the list relate to processes, drivers, network connections, object managers, or interrupt descriptor tables.

The compromise criterion differs according to the nature of the data structures traversed.

In particular embodiments, the predetermined duration is equal to or greater than the duration of a decompression phase of the code of said program.

The captured image thus properly reflects the state of the RAM after the program has started executing.

For example, the predetermined duration is equal to 10 minutes.

In particular embodiments, the various steps of the above method are determined by computer program instructions.

Accordingly, the invention also relates to a computer program on an information medium, this program being capable of being executed by a microprocessor and comprising instructions suited to implementing the steps of the method as mentioned above.

This program can use any programming language and can be in the form of source code, object code, or code intermediate between source code and object code, such as a partially compiled form, or in any other desirable form.

The invention also relates to an information medium readable by a microprocessor and comprising instructions of a computer program as mentioned above.

The information medium can be any entity or device capable of storing the program. For example, the medium may comprise a storage means such as a ROM, for example a microcircuit ROM, or a magnetic recording means, for example a hard disk, or a flash memory.

Furthermore, the information medium may be a transmissible medium such as an electrical or optical signal, which can be conveyed via an electrical or optical cable, by radio or by other means. The program according to the invention can in particular be downloaded via a storage platform of an Internet-type network.

Alternatively, the information medium may be an integrated circuit in which the program is incorporated, the circuit being adapted to execute the method in question or to be used in its execution.

The information medium and the computer program mentioned above have features and advantages similar to those of the method they implement.

BRIEF DESCRIPTION OF THE FIGURES

Other features and advantages of the invention will become apparent from the following description, illustrated by the accompanying figures, which show non-limiting example embodiments. In the figures:
- Figure 1 shows, as a flowchart, the general steps of an analysis method according to embodiments of the invention;
- Figures 2 and 3 show, as flowcharts, examples of comparative analysis steps according to particular embodiments of the invention;
- Figure 4 consists of Figures 4a, 4b and 4c, where Figures 4a and 4b illustrate an example of data used to build the graph of Figure 4c, which can be displayed when carrying out an analysis method according to embodiments of the invention.

DETAILED DESCRIPTION OF THE INVENTION

One of the aims of the invention is to obtain relevant information about the behavior of a suspect program in a way that the program cannot detect. To this end, a so-called post-mortem (forensic) analysis of the RAM is carried out outside the virtual machine (sandbox) in which the suspect program runs. In other words, the state of the system's RAM (that is, an image of it) is captured before the suspect program is executed and is compared with a later state of that system's RAM, captured after execution of the suspect program has been launched.

In this document, a virtual machine is a computer program which, when it runs on a host computer, emulates the hardware resources of a virtual computer and their operation. Like a conventional computer, this virtual computer can be given an operating system, identical to or different from that of the host computer, and can run any computer program compatible with that operating system. The operation of the virtual computer is independent of the operation of the host system, which it cannot corrupt. These virtualization techniques are now well known and widely used in various areas of computing. One example is the virtualization products supplied by VMware (registered trademark).

In the following examples, the operating system installed in the virtual machine is Windows (registered trademark). However, the invention is not limited to this operating system and can be applied to other operating systems such as Linux (registered trademark) or Android (registered trademark).

By way of example, embodiments of the present invention can make it possible to detect, in particular by analyzing the system memory, certain Linux rootkits whose purpose is to hijack system calls by removing the routines that handle those system calls.

Figure 1 shows the general steps of an analysis method according to the invention.

The Windows system manages its memory through certain data structures, for example EPROCESS structures for processes. These data structures are generally chained together by pointers.

In the example shown, an initial state S1 of the RAM of the system running in the virtual machine is captured during a step E0, typically using a capture function of the virtual machine. This capture is external to the virtual machine and has no detectable impact within it. In general, such a capture, also called an image or dump, gives access to data structures so that they can, for example, be traversed step by step from one to the next. They can also be traversed by detecting their signature, that is, for example, by locating one or more bit sequences characteristic of these structures.

In practice, starting from the captured image S1, one looks for an internal data structure of the system that contains information about the processor currently in use by the system. This structure also contains pointers to other structures of interest such as active processes, active network connections, loaded drivers, etc. The aim is to map a set of well-chosen operating-system structures. These particular structures are chosen because they contain elements likely to be modified by a program seeking to conceal its activity. Certain modifications of these structures by a program therefore raise a suspicion of malicious behavior; they can characterize what are referred to below as compromise markers. The structures to be studied are chosen beforehand, on the basis of expert knowledge of how the operating system works. Note that these structures are easy to identify because they are documented elements of the Windows system. Of course, the invention is not limited to these elements, as will be seen below.

As an example for the Windows system, a tool called Windows debugger gives access to the KPCR (Kernel Processor Control Region) structure from the captured image S1. Other means can of course provide access to this structure, such as the Volatility software developed by the Volatility Foundation. In practice, the definition of the KPCR structure is found, for example, at virtual address 0xFFDFF000 of the captured image S1 for Windows XP operating systems.
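Locating structures "by their signature" amounts to scanning the raw dump for a characteristic byte sequence. The sketch below illustrates that scan on a synthetic image. The signature value `SIG` is invented for the example and does not correspond to any real Windows structure; real tools typically combine such a scan with further sanity checks on the candidate structure.

```python
from typing import Iterator


def find_signature(image: bytes, signature: bytes) -> Iterator[int]:
    """Yield every offset in `image` at which `signature` occurs (overlaps included)."""
    start = 0
    while True:
        offset = image.find(signature, start)
        if offset == -1:
            return
        yield offset
        start = offset + 1  # resume one byte later so overlapping matches are kept


# Synthetic 40-byte "dump" with the hypothetical signature planted twice.
SIG = b"\xde\xad\xbe\xef"
image = bytes(16) + SIG + bytes(12) + SIG + bytes(4)
print(list(find_signature(image, SIG)))  # [16, 32]
```

A signature scan does not depend on the pointers that chain structures together, so it can still find a structure that is no longer reachable by following those chains.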

The KPCR structure contains a number of entries, some of which hold pointers to other structures.

In particular, the entries called IDT or PrcbData.ProcessorState.SpecialRegisters.Idtr point to a structure representing the system's interrupt descriptor table, each entry of which points to a structure called KIDTENTRY that provides information about a particular interrupt.

Likewise, the KdVersionBlock entry of the KPCR structure leads to the DBGKD_GET_VERSION64 structure, which itself contains a DebuggerDataList entry holding a pointer to the LIST_ENTRY structure. The first element of that structure points to the KDDEBUGGER_DATA64 structure, which contains many pointers that are useful for analyzing a suspect program according to the invention, for example:
- the PsActiveProcessHead entry points to an EPROCESS structure representing the first active process in a linked list of active processes connected by pointers;
- the PsLoadedModuleList entry points to an LDR_DATA_TABLE_ENTRY structure representing the first loaded driver in a linked list of loaded drivers connected by pointers;
- the ObpRootDirectoryObject entry points to an OBJECT_DIRECTORY structure representing the root of the Windows object manager.

Returning to the method of Figure 1, during a step E2 the execution of a suspect program to be analyzed is launched in the virtual machine. In practice, this launch is followed by a waiting period E3. The purpose of this wait is to guarantee that the program under study has started executing.

Indeed, some malicious programs use encoding and/or encryption tools, called packers, to obfuscate all or part of their code and thereby prevent or slow down static and/or dynamic analysis of it. Before anything else, these malicious programs must therefore perform the reverse decoding operation (unpacking) before they can run on the target machine (here, the virtual machine). To be relevant, the analysis must begin after this decoding phase, because it is generally only after this phase that the "useful" execution of the suspect program begins, that is, the execution that produces effects on the system.
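The linked lists reached through PsActiveProcessHead and PsLoadedModuleList are traversed by following forward pointers until the walk returns to the list head. The sketch below shows that traversal on a synthetic image. It is a deliberate simplification: addresses are treated as plain offsets into the image and pointers as 32-bit little-endian values, whereas a real dump requires virtual-to-physical address translation and the structure layouts of the specific Windows build.

```python
import struct
from typing import List


def walk_list(image: bytes, head: int, max_nodes: int = 4096) -> List[int]:
    """Follow forward links from `head` and return the offsets of the nodes visited."""
    nodes: List[int] = []
    (current,) = struct.unpack_from("<I", image, head)  # forward link of the head
    # `max_nodes` bounds the walk in case a corrupted list never returns to the head.
    while current != head and len(nodes) < max_nodes:
        nodes.append(current)
        (current,) = struct.unpack_from("<I", image, current)
    return nodes


# Synthetic image: head at offset 0, three nodes at 16, 32 and 48.
# Each entry stores (forward, backward) links, forming a circular list.
buf = bytearray(64)
struct.pack_into("<II", buf, 0, 16, 48)
struct.pack_into("<II", buf, 16, 32, 0)
struct.pack_into("<II", buf, 32, 48, 16)
struct.pack_into("<II", buf, 48, 0, 32)
print(walk_list(bytes(buf), head=0))  # [16, 32, 48]
```

Running the same walk on both captures gives two node lists that can then be compared entry by entry.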

The duration of step E3 is therefore at least equal to the duration of the decompression (unpacking) phase of the program's code, and varies with the type of packing used. In particular, the type of cryptographic operations used for obfuscation affects the decoding time. In addition, some malicious programs use timers to delay their execution. The inventors have determined that the decoding time, which here covers the decoding proper and any initial delays, often exceeds 6 minutes. Advantageously, extending the waiting period E3 to 10 minutes gives even more satisfactory results for a large proportion of existing malicious programs. This waiting time can of course be modified and adapted as malicious programs evolve.

During a step E4, a second state capture S2 (that is, an image) of the RAM is taken, again from outside the virtual machine. This step is similar to step E0, in which state S1 was captured. The two captured images S1 and S2 are thus comparable.

During a step E6, a differential comparative analysis of the captured images S1 and S2 is carried out outside the virtual machine. Its purpose is, in particular, to extract the elements of the data structures that have been modified, added or deleted, either through the normal operation of the system or through the action of the suspicious, potentially malicious program launched in step E2.

In practice, given the large number of structures accessible in an image, only certain predetermined data structures are traversed. These structures, some examples of which were given above, represent for example active processes, loaded drivers, interrupt descriptors, active connections or connection requests (sockets). Other types of data structures may be examined, for example those relating to system calls, registry keys or other files.
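By way of illustration only, the differential analysis of step E6 might be sketched as follows. The sketch assumes that each capture has already been parsed into a dictionary mapping a structure identifier to its field values; that representation, the function name and the sample data are hypothetical and do not come from the described method.

```python
def diff_structures(before, after):
    """Return (added, removed, modified) keys between two parsed captures.

    `before` and `after` map a structure identifier (hypothetical: here a
    PID) to a dict of field values extracted from the memory image.
    """
    added = sorted(k for k in after if k not in before)
    removed = sorted(k for k in before if k not in after)
    modified = sorted(k for k in after if k in before and after[k] != before[k])
    return added, removed, modified


# Made-up sample data standing in for the first and second captures.
s1 = {4: {"name": "System", "threads": 60},
      612: {"name": "svchost.exe", "threads": 12}}
s2 = {4: {"name": "System", "threads": 60},
      612: {"name": "svchost.exe", "threads": 13},
      1888: {"name": "sample.exe", "threads": 2}}

added, removed, modified = diff_structures(s1, s2)
```

Keeping the three categories separate matters because, as explained below, an addition or a modification is not in itself a sign of compromise.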

The next step is to evaluate at least one compromise criterion based on a comparison of said data structures between the first and second RAM states captured in steps E0 and E4. Various compromise criteria may be evaluated, depending on the type of structure examined. For example, the compromise criterion may be:
- the disappearance of a process from the list of active processes although that process is not marked as terminated, indicating that the process has been unlinked from the process list;
- the unexplained appearance of an additional thread in a process;
- the loading of a new driver that is nevertheless absent from the list of newly loaded drivers;
- the disappearance of a driver from the list of loaded drivers although that driver has not been unloaded, indicating that the driver has been unlinked from the driver list;
- the unexplained unloading of a driver;
- the unexpected opening of a port by a suspicious process;
- the unexplained modification of the object manager, for example the creation of a semaphore granting exclusive use of a resource (a mutex);
- the modification of an interrupt descriptor whose integrity is no longer intact.

In practice, concluding that a program is malicious requires a body of converging evidence that makes the program particularly suspect. In other words, meeting certain compromise criteria may not be enough to conclude with reasonable certainty that a program is malicious. This is the case, for example, for the creation of a network port or of a mutex. The creation of a new network port (or mutex) is not, in itself, harmful to the system. Such an event becomes critical, however, when it is caused, for example, by a process that has disappeared from the list of active processes although it is not marked as terminated.

Thus, during a step E8, at least one compromise marker is identified, depending on the result of step E6. For example, the disappearance of a process from the list of active processes, although that process is not marked as terminated, identifies the unlinking of a process from the process list, that is, a hidden process. As another example, the loading of a new driver absent from the list of newly loaded drivers, or the disappearance of a driver from the list of loaded drivers, identifies a hidden driver.

The analysis of successive captures thus makes it easier to detect hidden processes (rootkits), and to determine more precisely the behavior of complex malicious programs, even stealthy ones.

During an optional step E10, certain traversed structures of particular interest, for example because they meet the compromise criterion evaluated in step E8, can be displayed as a graph. This mapping has the advantage of presenting essentially the system information relevant to the malicious behavior of the analyzed program.

Several examples of comparative analysis, corresponding to step E6 described above, will now be described with reference to Figures 2 and 3.
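The idea that a weak indicator only counts when it converges with a stronger one can be illustrated by the following minimal sketch. The criterion labels, the WEAK/STRONG split and the tuple format are assumptions made for the example; the description does not prescribe any particular scoring scheme.

```python
# Hypothetical labels: indicators that are harmless on their own ...
WEAK = {"new_port", "new_mutex"}
# ... and indicators treated here as markers in their own right.
STRONG = {"hidden_process", "hidden_driver"}


def compromise_markers(findings):
    """Keep strong findings, and weak ones only if tied to a hidden process.

    `findings` is a list of (criterion, pid) tuples produced by the
    comparative analysis (format assumed for this sketch).
    """
    hidden = {pid for crit, pid in findings if crit == "hidden_process"}
    markers = []
    for crit, pid in findings:
        if crit in STRONG or (crit in WEAK and pid in hidden):
            markers.append((crit, pid))
    return markers


findings = [("new_port", 100), ("new_port", 200), ("hidden_process", 200)]
markers = compromise_markers(findings)
```

Here the port opened by process 100 is discarded, whereas the port opened by process 200 is retained because that process is also hidden.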

Figure 2 shows a detailed example of comparative analysis, according to a first particular embodiment, in which each traversed structure represents a process. This example thus seeks to determine the behavior of the program launched in step E2 of Figure 1 with respect to processes. In this example, the structures are EPROCESS structures. Each EPROCESS structure describes the main parameters of a process, namely the process identifier (PID), its name, the number of associated running threads, its creation date and time, etc.

As mentioned above, an EPROCESS structure can be reached via the PsActiveProcessHead entry of the KDDEBUGGER_DATA64 structure, which holds a pointer to the first process (i.e. the first EPROCESS structure) of a linked list of active processes (i.e. of EPROCESS structures). In each EPROCESS structure, the ActiveProcessLinks entry contains a pointer to the next EPROCESS structure. A list of processes is thus obtained by following these pointers one after another. The invention is not limited to the pointer in the PsActiveProcessHead entry: on Windows, other linked lists of active processes can be reached by other means.
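The pointer-following traversal described above might be sketched as follows, on a toy model in which memory is a dictionary from addresses to already-decoded records. The addresses, the record layout and the function name are hypothetical; real EPROCESS parsing depends on the Windows version and is not shown.

```python
def walk_process_list(memory, first):
    """Follow ActiveProcessLinks from `first` until the list loops back."""
    processes, seen, addr = [], set(), first
    while addr in memory and addr not in seen:
        seen.add(addr)                      # guards against corrupted loops
        processes.append(memory[addr])
        addr = memory[addr]["ActiveProcessLinks"]
    return processes


# Toy memory: three records forming a circular list (made-up addresses).
memory = {
    0x1000: {"pid": 4, "name": "System", "ActiveProcessLinks": 0x2000},
    0x2000: {"pid": 612, "name": "svchost.exe", "ActiveProcessLinks": 0x3000},
    0x3000: {"pid": 1888, "name": "sample.exe", "ActiveProcessLinks": 0x1000},
}
pids = [p["pid"] for p in walk_process_list(memory, 0x1000)]
```

A record that has been unlinked from the chain is never visited by this walk, which is precisely the limitation addressed by the signature-based traversal of step E22.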

During a step E20, the process list obtained from the first captured image S1 and the process list obtained from the second captured image S2 are traversed. It may be useful to compare them at this stage, for example to evaluate a first compromise criterion.

For example, this compromise criterion corresponds to a difference, between the two captured images S1 and S2, in the number of threads associated with a given process. This change may or may not be legitimate: an additional thread may mean that the malicious program has injected itself into that process in order to be stealthier.

Moreover, as mentioned above, some malicious programs are able to remove the pointers to the processes modified by their execution. Such modifications then go unnoticed when the linked list of processes is traversed. However, whether or not a process appears in the EPROCESS chain has no bearing on whether the operating system schedules it. In other words, removing a pointer does not remove the EPROCESS structure that the pointer targets: although the structure disappears from the linked list of processes when that list is walked from one element to the next, the corresponding process remains active.

To work around this difficulty, during a step E22, the EPROCESS structures can be traversed independently of the pointers between these processes, in order to spot any inconsistencies between the traversals of the captured images S1 and S2. In practice, the EPROCESS structures are recovered by signature: one or more bit sequences characteristic of EPROCESS structures are located in the captured images S1 and S2. An inconsistency between the two images can then be determined, for example a structure present in one image but not in the other. Displaying and traversing the structures in several different ways and then comparing the results thus makes it possible to catch out a malicious program that tries to hide certain indicators of its malicious nature.

During a step E24, for each EPROCESS structure showing an inconsistency between the two captured images S1 and S2, it is determined whether the structure corresponds to a process that is still running. In practice, the value of a return code is retrieved (for example from the ExitStatus entry). If the return code indicates that the process is no longer running, the process has simply terminated.
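Recovery by signature, as in step E22, can be illustrated with the sketch below, which scans raw bytes for a marker instead of following pointers. The 4-byte marker b"EPRC" and the record layout (marker followed by a little-endian 32-bit PID) are invented for the example and do not reflect the real Windows structures.

```python
import struct

SIGNATURE = b"EPRC"  # hypothetical marker, NOT the real EPROCESS signature


def scan_processes(image):
    """Return the PID of every record found by signature in a raw image."""
    pids = []
    pos = image.find(SIGNATURE)
    while pos != -1:
        # Assumed layout: the PID follows the marker as a 32-bit integer.
        (pid,) = struct.unpack_from("<I", image, pos + len(SIGNATURE))
        pids.append(pid)
        pos = image.find(SIGNATURE, pos + 1)
    return pids


def record(pid):
    """Build a toy record for the test images below."""
    return SIGNATURE + struct.pack("<I", pid)


# Two made-up images; the second holds one extra record.
s1 = b"\x00" * 16 + record(4) + b"\x00" * 8 + record(612)
s2 = s1 + b"\xff" * 5 + record(1888)
only_in_s2 = sorted(set(scan_processes(s2)) - set(scan_processes(s1)))
```

Because the scan ignores the links between records, it also finds records that a pointer-based walk would miss.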

If, on the other hand, the return code indicates that the process is running, a check is made as to whether the process in question appears in the linked list corresponding to the initial capture S1 (before execution of the program to be analyzed). If this active process (i.e. identified by its PID) is absent from the linked list before execution of the program, this indicates that the pointer to the corresponding EPROCESS structure has been removed.
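The decision made in step E24 might be sketched as follows. The sketch assumes that a running process carries the value 0x103 (STILL_ACTIVE) in ExitStatus, and that the set of PIDs reachable through the linked list has been computed beforehand; the record format and the three labels are hypothetical.

```python
STILL_ACTIVE = 0x103  # assumption: ExitStatus value of a running process


def classify(proc, linked_pids):
    """Classify a record recovered by signature.

    `proc` is a dict with "pid" and "ExitStatus"; `linked_pids` is the set
    of PIDs found by walking the linked list of active processes.
    """
    if proc["ExitStatus"] != STILL_ACTIVE:
        return "terminated"          # the process simply ended
    if proc["pid"] not in linked_pids:
        return "hidden"              # running, but unlinked from the list
    return "listed"


linked = {4, 612}
verdicts = [
    classify({"pid": 612, "ExitStatus": STILL_ACTIVE}, linked),
    classify({"pid": 900, "ExitStatus": 0}, linked),
    classify({"pid": 1888, "ExitStatus": STILL_ACTIVE}, linked),
]
```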

Thus, during this step E24, another compromise criterion is evaluated, namely the disappearance of an active process from one state capture to the next. When all the compromise criteria are met (here, the first and second criteria mentioned above), a compromise marker, namely a hidden process, is identified in step E26 of the method.

If, on the other hand, only a change in the number of threads is observed, for example, and no hidden process is detected (step E24), the comparative analysis of the images will not identify a compromise marker relating to a hidden process for this program.

In a first variant of this first embodiment, drivers are considered instead of processes. This variant thus seeks to determine the behavior of the program launched in step E2 (see Figure 1) with respect to the drivers loaded in the virtual machine. These drivers are represented by LDR_DATA_TABLE_ENTRY structures. The linked list of loaded drivers is reached via the PsLoadedModuleList entry of the KDDEBUGGER_DATA64 structure. In each LDR_DATA_TABLE_ENTRY structure, the InLoadOrderLinks entry contains a pointer to the next driver. The comparative analysis step based on structures representing drivers is therefore very similar to the one described for EPROCESS structures. This first variant also detects stealthy malicious programs, by searching by signature (independently of the pointers) for newly loaded drivers that do not appear as such in the linked list of LDR_DATA_TABLE_ENTRY structures. As described above for processes, this involves locating the bit sequences characteristic of LDR_DATA_TABLE_ENTRY structures within the captured images.

In a second possible variant of the first embodiment described above, operating at a higher level in the system, the structures traversed during the comparative analysis belong to the system's object manager. As a reminder, the object manager is a tree of structures used to store the state of certain objects needed for the operating system to function properly. Studying the new objects in this manager (or the modification of existing objects) can give relevant indications of the potentially malicious nature of the program launched in step E2 (see Figure 1), such as the appearance of suspicious objects pointing to a new device or a new driver.

As mentioned above, the object manager can be reached via the ObpRootDirectoryObject entry of the KDDEBUGGER_DATA64 structure, which holds a pointer to the OBJECT_DIRECTORY structure representing the root of the Windows object manager. This root structure includes an entry pointing to a HashBuckets array listing the elements of each hierarchical level, each element being a linked list of OBJECT_DIRECTORY_ENTRY structures. Each such entry is either a leaf of the tree or of type Directory, in which case it can be traversed recursively in the same way as the root, so as to ultimately retrieve every element of the manager.

In this second variant, several compromise criteria can be evaluated, corresponding to the appearance of suspicious objects (i.e. OBJECT_DIRECTORY_ENTRY structures) pointing, for example, to a new device, a new driver or a new mutex. Meeting these criteria may make it possible to identify compromise markers characteristic of the malicious nature of the executed program.

By way of non-limiting example, a new OBJECT_DIRECTORY_ENTRY structure pointing to an object representing the device \Device\TCPZ-X85D is spotted during the comparative analysis of the captured images S1 and S2 (see step E6 of Figure 1). Here, the appearance of this new device, which constitutes a compromise marker, is observed following the execution (step E2 of Figure 1) of a malicious program called Neeris, which modifies the system's TCP/IP parameters to increase the number of packets that can be sent per second, thereby raising the infection rate when it propagates by exploiting a remote vulnerability.

The appearance of a new object of type Driver (see first variant) named \Driver\sysdrv32 is also observed, together with a new object of type Mutex named LxLXsithwarlordXLxL. Note that the names of new objects can sometimes make a malicious program easier to spot.

According to a third variant of the first embodiment, the structures of interest are interrupt descriptors, represented by KIDTENTRY structures reachable via the system's interrupt descriptor table, which is itself reachable via pointers held in the IDT or PrcbData.ProcessorState.SpecialRegisters.Idtr entries of the KPCR structure.
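For the second variant above (object manager), the recursive traversal and the search for newly appeared objects could be sketched as follows. The nested-dictionary model, in which a directory is a dict with a "HashBuckets" list and each bucket is a list of entries, is a simplification assumed for the example; only the object names \Device\TCPZ-X85D and \Driver\sysdrv32 come from the description, the other names are made up.

```python
def list_objects(directory, prefix=""):
    """Recursively collect the full names of all leaf objects."""
    names = set()
    for bucket in directory["HashBuckets"]:
        for entry in bucket:                 # a bucket is a chain of entries
            path = prefix + "\\" + entry["name"]
            if entry.get("HashBuckets") is not None:   # Directory type
                names |= list_objects(entry, path)
            else:                                      # leaf of the tree
                names.add(path)
    return names


def directory(name, *buckets):
    """Helper building a toy directory entry (hypothetical)."""
    return {"name": name, "HashBuckets": [list(b) for b in buckets]}


root_s1 = directory("", [directory("Device", [{"name": "Tcp"}])],
                        [directory("Driver", [{"name": "Tcpip"}])])
root_s2 = directory("", [directory("Device", [{"name": "Tcp"},
                                              {"name": "TCPZ-X85D"}])],
                        [directory("Driver", [{"name": "Tcpip"}],
                                             [{"name": "sysdrv32"}])])

new_objects = sorted(list_objects(root_s2) - list_objects(root_s1))
```

Whether a new object is actually suspicious remains a separate judgment; the sketch only lists what appeared between the two captures.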

As a reminder, interrupt descriptors specify the code that will be executed when the corresponding interrupt occurs. These structures are therefore particularly vulnerable and are often targeted by malicious programs, which can use them to have malicious code executed. Moreover, since Windows uses a template to instantiate these structures, modifying the template modifies all of its instances. A compromise criterion to be evaluated may thus be, for example, the modification of this template and/or the integrity of certain interrupt descriptors.

Modification of the template constitutes a compromise marker when it is initiated by an unidentified process. Likewise, depending on its nature and origin, the modification of certain interrupt descriptors can be identified as a compromise marker.

All or some of these variants may advantageously be combined in certain embodiments, thereby increasing the reliability of malware detection.
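For the interrupt-descriptor variant, an integrity check between the two captures might look like the sketch below. It assumes each table has been reduced to a mapping from interrupt vector to handler address; that representation and the addresses shown are made up for illustration.

```python
def modified_interrupts(idt_before, idt_after):
    """Return the vectors whose handler address differs between captures."""
    return sorted(v for v in idt_before
                  if idt_after.get(v) != idt_before[v])


# Made-up handler addresses for two vectors.
idt_s1 = {0x03: 0x8053D12C, 0x2E: 0x8053D541}
idt_s2 = {0x03: 0x8053D12C, 0x2E: 0xF8A01000}

changed = modified_interrupts(idt_s1, idt_s2)
```

As stated above, a changed descriptor is only treated as a compromise marker depending on the nature and origin of the change, which this sketch does not assess.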

Figure 3 shows a detailed example of comparative analysis (see step E6 of Figure 1), according to a second particular embodiment, in which the traversed structures relate to network activity. This example thus seeks to determine the behavior of the program launched in step E2 (see Figure 1) with respect to network activity. For example, it is determined whether the analyzed program has put a communication port into the listening state.

On Windows XP, two main data structures store network-related information: ADDRESS_OBJECT structures, which represent sockets as soon as network activity is requested, and TCPT_OBJECT structures, which exist when connections are actually established with a remote party.

As described with reference to step E22 of Figure 2, this detection uses the specific signature of these data structures, for example bit sequences characteristic of them. Thus, during a step E30, the socket structures and the connection structures are traversed in the captured images S1 and S2. Then, during a step E32, the local port associated with each ADDRESS_OBJECT structure (corresponding to a socket) recovered in step E30 is identified.

During a step E34, the following compromise criterion is evaluated: the local port retrieved in step E32 is absent from the connection structures traversed in step E30. This criterion is satisfied in particular when the port is listening, which makes it possible to identify the port as a compromise marker (step E36).
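
Steps E30 to E36 can be sketched as a set difference, assuming that the local ports of the socket structures and of the connection structures have already been extracted from each capture. The dictionary layout, the function names and the port numbers below are hypothetical. Restricting the result to ports that are listening in S2 but not in S1 is also an assumption about how the two captures are compared.

```python
def listening_ports(socket_ports, connection_ports):
    """Local ports bound by a socket but absent from every connection."""
    return sorted(set(socket_ports) - set(connection_ports))


def new_listening_ports(s1, s2):
    """Listening ports present in the second capture but not in the first."""
    before = set(listening_ports(s1["sockets"], s1["connections"]))
    after = listening_ports(s2["sockets"], s2["connections"])
    return [port for port in after if port not in before]


# Hypothetical local ports extracted from the two captures.
s1 = {"sockets": [135, 445, 1025], "connections": [1025]}
s2 = {"sockets": [135, 445, 1025, 4444], "connections": [1025]}

markers = new_listening_ports(s1, s2)
print(markers)
```

With these sample values, ports 135 and 445 are already listening before the program runs, so only the port opened between the two captures is reported.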

Figure 4, made up of Figures 4a, 4b and 4c, illustrates an example of the data used to build the graph displayed at the end of the comparative analysis, as described with reference to step E10 of Figure 1, together with the graph itself. The state of a virtual machine is stored as a sequence of Python structures containing the details (name, value) of the various fields and the links between these structures, in serialized form (Python cPickle module). Since one of the objectives of the invention is to produce a visual map of the RAM, or at least of the traversed data structures that meet the evaluated compromise criteria described above, each pointer between relevant structures is saved in a character string that follows the DOT graph description language.

Figure 4a shows the ordered list 40 of the links retrieved, starting from the reading of the KPCR structure 41, continuing through the complete traversal 42 of the linked list of processes (EPROCESS structures), and ending with the final loop back to the PsActiveProcessHead entry.
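
The storage and graph-description steps above can be sketched as follows. This is a minimal illustration rather than the implementation described here: it uses the Python 3 `pickle` module in place of cPickle, and the link list, the node labels and the function `links_to_dot` are hypothetical.

```python
import pickle


def links_to_dot(links, highlighted=()):
    """Render (source, target) pointer pairs as a DOT digraph string."""
    lines = ["digraph memory {"]
    for name in highlighted:
        # Shade nodes that were flagged by the comparative analysis.
        lines.append(f'    "{name}" [style=filled, fillcolor=gray];')
    for source, target in links:
        lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical, simplified pointer chain ending with the loop back to the list head.
links = [
    ("KPCR", "PsActiveProcessHead"),
    ("PsActiveProcessHead", "EPROCESS System"),
    ("EPROCESS System", "EPROCESS smss.exe"),
    ("EPROCESS smss.exe", "PsActiveProcessHead"),
]

# Round trip through the serialized form before building the graph text.
restored = pickle.loads(pickle.dumps(links))
dot_text = links_to_dot(restored, highlighted=["EPROCESS smss.exe"])
print(dot_text)
```

The resulting string can be passed to any tool that reads the DOT language in order to draw the graph.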

Figure 4b shows an example of a structure 45 storing information about a process. The structure 45 comprises in particular the name 46 of the process, the corresponding structure 47, and a comment 48 indicating that it is a hidden process. Figure 4c shows the graph corresponding to the data of Figures 4a and 4b, in particular the linked list 42 of processes (EPROCESS structures) of Figure 4a.

In this figure, the linked active processes are referenced 400-1 to 400-15. The shaded lines referenced 415-1 to 415-8 correspond to new or modified elements detected during the comparative analysis of the RAM captures. Here, the comparative analysis made it possible to spot two particular processes, 420 and 425, which do not appear in the linked list of processes 400-1 to 400-15. Process 420 is absent from the linked list of active processes, but in this example the process has terminated. It is therefore not a hidden process. This information is displayed as a comment in the shaded line of process 420. By contrast, process 425, which is also absent from the list of active processes, is not marked as terminated in the relevant field. It is therefore a compromise marker of the program, representing a hidden process. This information is displayed as a comment in the shaded line of process 425. In view of the displayed markers, it is now possible to reach a conclusion as to the malicious nature of the executed program.
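
The distinction drawn between processes 420 and 425 can be sketched as follows. The sketch assumes that process structures found by scanning the capture are available as dictionaries, and that a boolean `exited` flag stands in for the field indicating termination, which the description does not name. The process names, identifiers and the function `classify_unlinked` are hypothetical.

```python
def classify_unlinked(scanned, linked_pids):
    """Label processes found by scanning but missing from the linked list."""
    labels = {}
    for proc in scanned:
        if proc["pid"] in linked_pids:
            continue  # reachable from the active-process list: nothing to report
        labels[proc["pid"]] = "terminated" if proc["exited"] else "hidden"
    return labels


# Hypothetical data for the second capture.
linked_pids = {4, 368, 592}
scanned = [
    {"pid": 4, "name": "System", "exited": False},
    {"pid": 368, "name": "smss.exe", "exited": False},
    {"pid": 592, "name": "csrss.exe", "exited": False},
    {"pid": 1204, "name": "setup.exe", "exited": True},
    {"pid": 1337, "name": "svch0st.exe", "exited": False},
]

labels = classify_unlinked(scanned, linked_pids)
print(labels)
```

In this sample, the process with identifier 1204 plays the role of process 420 (unlinked but terminated), while 1337 plays the role of process 425 (unlinked and still active, hence hidden).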

In addition, the invention makes it possible to obtain relevant information about the behavior of the program and in particular about its effects, such as the creation of a particular mutex or port, notably thanks to the body of evidence revealed by the comparative analysis. Advantageously, this information is made available to the user in an intelligent manner, since it is not drowned in the mass of information relating to the normal operation of the system (ordinary creation of ports, mutexes and processes, closing of processes, etc.). For example, process 425, displayed next to the graph in Figure 4c, clearly indicates characteristics of the corresponding malware, namely that its execution creates a hidden process: the process is still in the active state although it is not present in the linked list of processes.

The foregoing examples are only embodiments of the invention, which is not limited to them. All these embodiments can be freely combined with one another.

Claims

1. A method of analyzing a program, characterized in that it comprises the following steps: capturing (E0) a first state of the RAM of a virtual machine in which an operating system is installed; launching (E2) the execution of said program in said virtual machine; after a predetermined duration (E3), capturing (E4) a second state of the RAM of said virtual machine; comparatively analyzing (E6) the first and second captured states; and identifying (E8) at least one compromise marker of said program, based on the result of the comparative analysis.

2. The analysis method according to claim 1, characterized in that said comparative analysis (E6) of the first and second captured states comprises the following steps: traversing, within said captures, data structures from a list of predetermined data structures; and evaluating at least one compromise criterion based on a comparison of said data structures between the first and second captured states.

3. The analysis method according to claim 2, characterized in that it comprises displaying (E10), as a function of said at least one identified compromise marker, a graph connecting said traversed data structures that meet said evaluated compromise criterion.

4. The analysis method according to any one of claims 2 to 3, characterized in that said list of predetermined data structures comprises data structures relating to certain elements of the following list: processes, drivers, network connections, object managers, interrupt descriptor tables.

5. The analysis method according to any one of claims 1 to 4, characterized in that the predetermined duration corresponds to the duration of a decompression phase of the code of said program.

6. The analysis method according to claim 5, characterized in that the predetermined duration is equal to 10 minutes.

7. The analysis method according to any one of claims 1 to 6, characterized in that said at least one compromise marker is a hidden process or a hidden driver.

8. A computer program comprising instructions for implementing a method according to any one of claims 1 to 7 when it is loaded and executed by a microprocessor.

9. A microprocessor-readable information medium comprising the instructions of a computer program for implementing a method according to any one of claims 1 to 7.