CN111769933A - Method and device for monitoring file change, electronic equipment and storage medium - Google Patents


Info

Publication number
CN111769933A
Authority
CN
China
Prior art keywords
file
directory
hash value
basic information
traversed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010615999.6A
Other languages
Chinese (zh)
Inventor
温卓然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Original Assignee
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Application filed by Beijing Topsec Technology Co Ltd, Beijing Topsec Network Security Technology Co Ltd, Beijing Topsec Software Co Ltd
Priority to CN202010615999.6A
Publication of CN111769933A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0643 Hash functions, e.g. MD5, SHA, HMAC or f9 MAC
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Power Engineering (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method and a device for monitoring file changes, an electronic device, and a storage medium, and belongs to the technical field of network communication. The method comprises: obtaining a current directory to be traversed from a directory queue and scanning it to obtain file information for each file directly under the current directory, the file information comprising a storage path and basic information; for each file, calculating a hash value of the file from its storage path; and comparing the hash value and basic information of the file with the hash values and basic information in the hash table stored in the previous period to determine the change type of the file. Because the hash value is calculated from the storage path obtained during traversal, and the comparison uses only the hash value and the basic information, the change type of a file can be determined quickly and with low CPU usage.

Description

Method and device for monitoring file change, electronic equipment and storage medium
Technical Field
The application belongs to the technical field of network communication, and particularly relates to a method and device for monitoring file change, electronic equipment and a storage medium.
Background
With the development of networks, computers have entered countless households and play an irreplaceable role in government agencies, enterprises, and public institutions, and exchanging files over a network has become a common way of exchanging information. In network environments with strict security requirements, such as a government agency or a bank, the network is often divided into areas of different security levels, and files cannot be transmitted directly between areas of different levels. A file synchronization function is therefore required that automatically finds file changes (additions, deletions, and modifications) and then automatically transmits the changed files from a source network with a low security level to a destination network with a high security level. Because the source and destination networks are not directly connected, security is higher.
There are two main methods for finding file changes. In the first, a program automatically scans the file directories that need to be synchronized using a conventional recursive algorithm; after all subdirectories under the target directory have been scanned recursively, the file information is saved and then compared with the file information saved in the previous period to find file changes. In the second, a program automatically scans the file directories that need to be synchronized, calculates a digest of every file using an algorithm such as MD5 (Message-Digest Algorithm 5), and saves the digests; it then rescans the directories at intervals, recalculates the file digests, and compares the new and old digests to find file changes.
The drawback of the first method is that directory scanning and file comparison are performed serially, and the directories themselves are scanned one after another, so the process takes a long time. The drawback of the second method is that an MD5 digest must be calculated for every file, which consumes considerable CPU (Central Processing Unit) resources and time; if the scanned directory contains many files, calculating the file digests can take several minutes.
Disclosure of Invention
In view of this, an object of the present application is to provide a method, a device, an electronic device, and a storage medium for monitoring file changes, so as to solve the problem that existing methods cannot find file changes quickly and in a timely manner.
The embodiment of the application is realized as follows:
In a first aspect, an embodiment of the present application provides a method for monitoring file changes, including: obtaining a current directory to be traversed from a directory queue and scanning it to obtain file information for each file directly under the current directory, the file information including a storage path and basic information; for each file, calculating a hash value of the file from its storage path; and comparing the hash value and basic information of the file with the hash values and basic information in the hash table stored in the previous period to determine the change type of the file. In this embodiment, the hash value is calculated from the storage path obtained during traversal, and the hash value and basic information of the file are compared with those in the hash table stored in the previous period, so the change type of the file can be determined quickly and with low CPU usage.
With reference to a possible implementation of the first aspect, after comparing the hash value and basic information of the file with those in the hash table stored in the previous period, the method further includes: storing the hash value and basic information of the file in a hash table of the current period, or storing the file information of the file in a linked list of the current period. In this embodiment, storing this data for the current period after each comparison allows the process to be repeated period after period, so that file changes can always be located quickly from the latest hash table or linked list.
With reference to a possible implementation of the first aspect, comparing the hash value and basic information of the file with those in the hash table stored in the previous period to determine the change type of the file includes: if no corresponding hash value is found in the hash table stored in the previous period, determining that the change type of the file is newly added; if a corresponding hash value is found, comparing the basic information of the file with the basic information recorded for it in the hash table stored in the previous period, determining that the change type is modified if the two differ, and determining that the file is unchanged if they are the same, where the basic information includes the file size and/or the last modification time. In this embodiment, whether a file is newly added can be determined quickly by a hash table lookup, and whether it has been modified can be determined quickly from the basic information (such as the file size and/or last modification time), so CPU consumption is low and efficiency is high.
With reference to a possible implementation manner of the embodiment of the first aspect, if a corresponding hash value is found in the hash table stored in the previous cycle according to the hash value of the file, the method further includes: deleting the hash value and the basic information corresponding to the file in the hash table stored in the previous period; after all the directories to be traversed are traversed, the remaining files in the hash table stored in the previous period are the deleted files in the current period; or deleting the file information of the file in the linked list stored in the previous period; after all the directories to be traversed are traversed, the remaining files in the linked list stored in the previous period are the deleted files in the current period. In the embodiment of the application, the deleted file can be quickly determined by maintaining the two linked lists (or hash tables), so that the file discovery efficiency is further improved.
With reference to a possible implementation of the first aspect, before obtaining and scanning the current directory to be traversed from the directory queue, the method further includes: obtaining and scanning a root directory to be traversed, and adding all direct subdirectories found by scanning the root directory to the directory queue as directories to be traversed. In this embodiment, adding all direct subdirectories of the root directory to the directory queue facilitates multi-thread control: during subsequent traversal, multiple members can be taken from the directory queue and traversed by multiple threads, which greatly reduces the time needed to find file changes.
With reference to a possible implementation of the first aspect, after obtaining and scanning the current directory to be traversed from the directory queue, the method further includes: adding all direct subdirectories found by scanning the current directory to the directory queue as directories to be traversed. In this embodiment, adding all direct subdirectories of the current directory to the directory queue likewise facilitates multi-thread control, since multiple members can be taken from the directory queue and traversed by multiple threads, which greatly reduces the time needed to find file changes.
With reference to a possible implementation of the first aspect, if the directory queue contains multiple directories to be traversed, obtaining and scanning the current directory to be traversed from the directory queue includes: obtaining and scanning current directories to be traversed from the directory queue using multiple threads, where each thread corresponds to exactly one current directory. In this embodiment, multiple threads take multiple members from the directory queue for traversal, which greatly reduces the time needed to find file changes.
In a second aspect, an embodiment of the present application further provides a device for monitoring file changes, including a scanning module, a calculating module, and a determining module. The scanning module is configured to obtain a current directory to be traversed from a directory queue and scan it to obtain file information for each file directly under the current directory, the file information including a storage path and basic information. The calculating module is configured to calculate, for each file, a hash value of the file from its storage path. The determining module is configured to compare the hash value and basic information of the file with the hash values and basic information in the hash table stored in the previous period to determine the change type of the file.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a memory and a processor, the processor coupled to the memory; the memory is used for storing programs; the processor is configured to invoke a program stored in the memory to perform the method according to the first aspect embodiment and/or any possible implementation manner of the first aspect embodiment.
In a fourth aspect, embodiments of the present application further provide a storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the method provided in the foregoing first aspect and/or any one of the possible implementation manners of the first aspect.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort. The foregoing and other objects, features, and advantages of the application will be apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the drawings. The drawings are not deliberately drawn to scale; the emphasis is on illustrating the subject matter of the present application.
Fig. 1 shows a flowchart of a method for monitoring file changes according to an embodiment of the present application.
Fig. 2 is a schematic diagram illustrating a principle of thread pool control according to an embodiment of the present application.
Fig. 3 is a schematic flowchart illustrating a directory traversal process provided in an embodiment of the present application.
Fig. 4 shows a block diagram of a device for monitoring file changes according to an embodiment of the present application.
Fig. 5 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, relational terms such as "first," "second," and the like may be used solely in the description herein to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Further, the term "and/or" in the present application is only one kind of association relationship describing the associated object, and means that three kinds of relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone.
In view of the defects of the existing methods for finding file changes, an embodiment of the present application provides a method for monitoring file changes, as shown in Fig. 1. The method is described below with reference to Fig. 1.
Step S101: Obtain a current directory to be traversed from the directory queue and scan it to obtain file information for each file directly under the current directory.
The file information for each file directly under the current directory includes a storage path and basic information. The basic information may include the file size and/or the last modification time (save time), and may also include other information such as the file name.
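As an illustration of the file information gathered in step S101, the following is a minimal Python sketch, not the implementation described in the application. The names `FileInfo` and `scan_directory` are hypothetical, and the choice of size plus modification time as the basic information is one of the options the text allows.

```python
import os
from typing import NamedTuple


class FileInfo(NamedTuple):
    """File information for one file (hypothetical layout)."""
    path: str     # storage path, including the file name
    size: int     # basic information: file size in bytes
    mtime: float  # basic information: last modification time, seconds since the epoch


def scan_directory(directory):
    """Scan a single directory level.

    Returns (files, subdirs): the FileInfo of every file directly under
    `directory`, and the paths of its direct subdirectories.
    """
    files, subdirs = [], []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
            elif entry.is_file(follow_symlinks=False):
                st = entry.stat(follow_symlinks=False)
                files.append(FileInfo(entry.path, st.st_size, st.st_mtime))
    return files, subdirs
```

Only directory metadata is read here; no file content is opened, which is what keeps this step cheap compared with digesting every file.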
If the directory queue contains multiple directories to be traversed, current directories can be obtained from the queue and scanned by multiple threads, with each thread corresponding to exactly one current directory. The number of threads cannot exceed the maximum number of threads allowed by the system. For example, if the maximum number of threads is 5 and the directory queue contains 10 directories to be traversed, only 5 directories can be taken from the queue for traversal at one time. The number of threads can be controlled, and the threads scheduled, by a simple thread pool mechanism. When scanning a directory yields subdirectories, the subdirectories are not scanned immediately by new child threads; instead, their paths are added to the directory queue. A global variable records the number of running threads: it is incremented by 1 when a child thread starts and decremented by 1 when a child thread ends. When the main thread begins scanning, it starts a child thread, called the control thread, that is responsible for starting the other child threads and controlling their number. If the control thread determines that the number of running threads is less than the maximum number of threads and the directory queue is not empty, it takes a member (that is, a subdirectory) from the directory queue and starts a thread to scan that directory.
This continues until the directory queue is empty or the number of threads equals the maximum number of threads. The control thread then waits, for example for 1 s, and checks again whether a thread can be started. The process ends when the directory queue has no members and the number of running threads is 0, which indicates that all directories have been scanned. The principle is shown schematically in Fig. 2. Of course, the current directory to be traversed may also be obtained from the directory queue and scanned in a single-thread manner.
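The thread control described above can be sketched as follows in Python. This is an illustrative sketch under several assumptions: the function name `scan_tree` and its parameters are hypothetical, the calling thread plays the role of the control thread rather than starting a separate one, and the wait interval defaults to a short poll instead of the 1 s used in the example.

```python
import os
import threading
import time
from collections import deque


def scan_tree(root, max_threads=5, poll_interval=0.01):
    """Return {directory: [paths of files directly under it]} for root and all subdirectories."""
    queue = deque([root])    # directory queue: paths still to be traversed
    lock = threading.Lock()  # protects queue, running and results
    running = 0              # number of scanning threads currently running
    results = {}

    def worker(directory):
        nonlocal running
        files, subdirs = [], []
        try:
            with os.scandir(directory) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        subdirs.append(entry.path)  # queued, not scanned here
                    elif entry.is_file(follow_symlinks=False):
                        files.append(entry.path)
        except OSError:
            pass  # unreadable directory: keep whatever was collected
        finally:
            with lock:
                results[directory] = files
                queue.extend(subdirs)
                running -= 1  # thread ends: count goes down by 1

    while True:
        to_start = []
        with lock:
            # Start threads while there is work and the limit is not reached.
            while queue and running < max_threads:
                to_start.append(queue.popleft())
                running += 1  # thread starts: count goes up by 1
            finished = not queue and running == 0
        for directory in to_start:
            threading.Thread(target=worker, args=(directory,)).start()
        if finished:
            return results
        if not to_start:
            time.sleep(poll_interval)  # wait, then check again
```

Updating the queue and decrementing the counter under the same lock means the control loop can never observe an empty queue with zero running threads while subdirectories are still waiting to be queued.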
After the current directory to be traversed has been obtained from the directory queue and scanned, the method further includes: if the current directory contains subdirectories, adding all direct subdirectories found by the scan to the directory queue as directories to be traversed. For example, if the current directory is a first-level subdirectory that contains 2 direct subdirectories (second-level subdirectories), those 2 subdirectories are added to the directory queue so that they can later be obtained from the queue and scanned in turn.
As an embodiment, before obtaining and scanning the current directory to be traversed from the directory queue, the method further includes: obtaining and scanning a root directory (the top-level directory) to be traversed, and adding all direct subdirectories found by scanning the root directory to the directory queue as directories to be traversed. In this embodiment, the root directory itself need not be obtained from the directory queue: after the user specifies the root directory to be traversed, it is scanned directly, its direct subdirectories are added to the directory queue, and all subsequent traversal obtains the directories to be traversed from the directory queue.
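The queue-based traversal, in its single-thread form, might look like the following Python sketch. The names `direct_children` and `traverse` are hypothetical, and for brevity every entry that is not a directory is treated as a file.

```python
import os
from collections import deque


def direct_children(directory):
    """Return (file paths, subdirectory paths) found directly under `directory`."""
    files, subdirs = [], []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
            else:
                files.append(entry.path)
    return files, subdirs


def traverse(root):
    """Yield the path of every file under `root`, one directory level at a time."""
    # The root is scanned directly; only its subdirectories enter the queue.
    files, subdirs = direct_children(root)
    queue = deque(subdirs)
    yield from files
    while queue:
        current = queue.popleft()        # current directory to be traversed
        files, subdirs = direct_children(current)
        queue.extend(subdirs)            # queue the direct subdirectories
        yield from files
```

Because subdirectories are queued rather than visited recursively, the same loop can later be split across several threads without changing how directories are discovered.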
Step S102: For each file, calculate the hash value of the file from its storage path.
For each file, the hash value is computed from the storage path of the file, which includes the file name (e.g., C:/test/1.txt). Any commonly used hash algorithm may be used to calculate the hash value from the storage path.
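A minimal Python sketch of hashing a storage path follows. The text does not name a specific algorithm, so the use of CRC-32 from the standard `zlib` module, the function name `path_hash`, and the bucket count of 4096 are all assumptions made for illustration.

```python
import zlib


def path_hash(path, buckets=4096):
    """Map a storage path to a bucket index in [0, buckets).

    Only the path string is hashed; the file content is never read.
    """
    return zlib.crc32(path.encode("utf-8")) % buckets
```

Since distinct paths can map to the same index, a practical table would presumably also keep the path in each entry so that colliding files can be told apart.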
Step S103: Compare the hash value and basic information of the file with the hash values and basic information in the hash table stored in the previous period to determine the change type of the file.
After the hash value of the file has been calculated from its storage path, the hash value and basic information of the file are compared with those in the hash table stored in the previous period, and the change type of the file (newly added, modified, or unchanged) can then be determined.
If no corresponding hash value is found in the hash table stored in the previous period, the change type of the file is determined to be newly added. If a corresponding hash value is found, the basic information of the file is compared with the basic information recorded for it in the hash table stored in the previous period: if the two differ, the change type is determined to be modified; if they are the same, the file is determined to be unchanged. The basic information includes the file size and/or the last modification time, since modifying a file changes its modification time or its size.
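The decision in step S103 can be sketched in Python as below. This is an illustrative sketch: a built-in `dict` keyed by storage path stands in for the hash table (it hashes the key internally), and the function name `classify` and the result strings are hypothetical.

```python
def classify(path, size, mtime, previous):
    """Classify one file against the table stored in the previous period.

    `previous` maps storage path -> (size, mtime).
    """
    old = previous.get(path)
    if old is None:
        return "added"        # no record in the previous table
    if old != (size, mtime):
        return "modified"     # record exists but basic information differs
    return "unchanged"


previous = {"C:/test/1.txt": (120, 1000.0)}
print(classify("C:/test/1.txt", 120, 1000.0, previous))
print(classify("C:/test/1.txt", 150, 1200.0, previous))
print(classify("C:/test/2.txt", 10, 1300.0, previous))
```

Each call costs one lookup and one tuple comparison, regardless of how large the file is.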
To make it easy to identify which files in the hash table stored in the previous period have been deleted in the current period, in one embodiment, if a corresponding hash value is found in the hash table stored in the previous period, the method further includes: deleting the hash value and basic information corresponding to the file from the hash table stored in the previous period. After all directories to be traversed have been traversed, the files remaining in the hash table stored in the previous period are the files deleted in the current period. In addition, to support traversal in the next period, after comparing the hash value and basic information of the file with those in the hash table stored in the previous period, the method further includes: storing the hash value and basic information of the file in a hash table of the current period. When the root directory is traversed in the next period, the hash values and basic information obtained by that traversal are compared with those in the hash table stored in the current period to determine the file changes in the next period.
In this embodiment, two hash tables (e.g., hash table A and hash table B) are maintained at the same time. If the hash value of a file is found in the hash table of the previous period (hash table A), the file is deleted from hash table A and its hash value and basic information are added to hash table B of the current period; once the complete directory has been scanned, the files remaining in hash table A are the deleted files. At the next scan, the roles are reversed: if the hash value of a file is found in hash table B, the file is deleted from hash table B and its hash value and basic information are added to hash table A. The process repeats in this alternating fashion.
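The two-table scheme can be sketched in Python as follows, under the assumption that a `dict` keyed by storage path stands in for each hash table. The function name `run_cycle` and the shape of its return value are hypothetical.

```python
def run_cycle(previous, scanned):
    """Process one scan.

    previous: table of the previous period, {path: (size, mtime)}; it is
              emptied of every file that is seen again.
    scanned:  iterable of (path, size, mtime) from the current traversal.
    Returns (changes, current), where `current` is the table for this period.
    """
    current = {}
    changes = {"added": [], "modified": [], "unchanged": [], "deleted": []}
    for path, size, mtime in scanned:
        old = previous.pop(path, None)  # delete the entry from the previous table
        if old is None:
            changes["added"].append(path)
        elif old != (size, mtime):
            changes["modified"].append(path)
        else:
            changes["unchanged"].append(path)
        current[path] = (size, mtime)   # store in the table of the current period
    # Entries never matched during the scan belong to files that no longer exist.
    changes["deleted"] = list(previous)
    return changes, current
```

A caller would pass the returned `current` table as `previous` on the next scan, which is the alternation between table A and table B described above.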
In another embodiment, if a corresponding hash value is found in the hash table stored in the previous period, the method further includes: deleting the file information of the file from the linked list stored in the previous period. After all directories to be traversed have been traversed, the files remaining in the linked list stored in the previous period are the files deleted in the current period. This embodiment takes into account that the hash table is usually large. For example, the hash table may have thousands of headers; if only a few dozen directories are to be traversed, hundreds of those headers are empty, and determining the deleted files by walking the entries remaining in the hash table would also walk all the empty headers. To obtain the files deleted in the current period more quickly, this embodiment therefore locates them through a linked list. In addition, to determine the deleted files in the next period, after comparing the hash value and basic information of the file with those in the hash table stored in the previous period, the method further includes: storing the file information of the file in a linked list of the current period.
That is, in this embodiment two linked lists (e.g., linked list A and linked list B) are maintained at the same time. If the hash value of a file is found in the hash table, the file is deleted from the current linked list A and its file information is added to linked list B; once the complete directory has been scanned, the files remaining in linked list A are the deleted files. At the next scan, if the hash value of a file is found in the hash table, the file is deleted from linked list B and its file information is added to linked list A, and so on in alternation.
In this embodiment, the hash values and basic information in the hash table stored in the previous period are also updated in real time. If the change type of a file is determined to be newly added, the hash value and basic information of the file are added to the hash table. If the change type is determined to be modified, the basic information recorded for the file in the hash table is replaced with the latest basic information. After the files deleted in the current period have been determined by traversing the linked list, their hash values and basic information are deleted from the hash table, which yields the updated hash table.
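The combination of one continuously updated hash table with two alternating lists might be sketched in Python as below. This is not the described implementation: the class name `Monitor` is hypothetical, a `dict` stands in for the hash table, and Python sets stand in for the linked lists because they offer cheap removal by path.

```python
class Monitor:
    """One persistent table plus two alternating 'lists' of known paths."""

    def __init__(self):
        self.table = {}        # hash table: path -> (size, mtime), updated in place
        self.previous = set()  # stand-in for linked list A (files of the last period)

    def cycle(self, scanned):
        """Process one scan of (path, size, mtime) tuples.

        Returns (added, modified, deleted) lists of paths.
        """
        current = set()        # stand-in for linked list B (files of this period)
        added, modified = [], []
        for path, size, mtime in scanned:
            if path in self.table:
                self.previous.discard(path)       # found: remove from list A
                if self.table[path] != (size, mtime):
                    modified.append(path)
            else:
                added.append(path)
            self.table[path] = (size, mtime)      # add or refresh the table entry
            current.add(path)                     # record in list B
        deleted = sorted(self.previous)           # what is left in list A
        for path in deleted:
            del self.table[path]                  # drop deleted files from the table
        self.previous = current                   # B becomes A for the next period
        return added, modified, deleted
```

Finding the deleted files only requires walking what is left in the previous list, so empty slots in the table are never visited.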
The period may be set according to user requirements, and the time spent traversing the directories is not counted in the period. For example, if the traversal period is set to 10 s and one traversal of the directories takes 5 s, the next traversal starts 10 s after the current traversal has finished.
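The timing rule can be sketched in Python as follows. The function name `monitor` and its parameters are hypothetical; the `sleep` parameter exists only so the waiting can be replaced in a test, and the fixed number of cycles is an assumption to keep the sketch finite.

```python
import time


def monitor(scan_once, period=10.0, cycles=3, sleep=time.sleep):
    """Call `scan_once` repeatedly, waiting `period` seconds after each scan ends.

    The wait starts when a traversal finishes, so the time spent
    traversing is not counted in the period.
    """
    for i in range(cycles):
        scan_once()
        if i < cycles - 1:
            sleep(period)
```

With a 10 s period and a 5 s traversal, successive traversals therefore begin 15 s apart.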
In summary, the method traverses the file directory at regular intervals to obtain the file information of all files in the file directory, calculates a hash value based on the storage path in the file information, and checks the hash table. If no corresponding record exists in the hash table, the file is determined to be a newly added file; if a corresponding record exists, the size and/or last modification time of the file is compared with the size and/or last modification time recorded for the file in the hash table to determine whether the file has been modified. Meanwhile, two linked lists (or hash tables) are maintained: if the hash value corresponding to a file is found in the hash table, the file is deleted from the current linked list A and its file information is added to linked list B; once the complete directory has been scanned, the files remaining in linked list A are the deleted files. In addition, multiple threads are used to traverse the directories and obtain the file information, which further reduces the scan time.
For ease of understanding, the above traversal process is described in conjunction with the flowchart shown in fig. 3.
(1) Scan and traverse the root directory to obtain the file information of all directly subordinate files under the root directory, and calculate a hash value according to the storage path in the file information. Then check the hash table: if the hash table has no corresponding record, the file is determined to be a newly added file; if the hash table has a corresponding record, the size and/or last modification time of the file is further compared with the size and/or last modification time recorded for the file in the hash table to determine whether the file has been modified. Meanwhile, two linked lists are maintained: if the hash value corresponding to the file is found in the hash table, the file is deleted from the current linked list A and its file information is added to linked list B. If there are directly subordinate subdirectories, all of them are added to the directory queue, and a child thread is started to traverse the subdirectories in the directory queue.
(2) The child thread traverses a subdirectory to obtain the file information of all directly subordinate files in the current directory and handles each file as in step (1): it calculates a hash value according to the storage path in the file information, checks the hash table to determine whether the file is newly added or modified, and, if the hash value corresponding to the file is found in the hash table, deletes the file from the current linked list A and adds its file information to linked list B. If there are directly subordinate subdirectories, all of them are added to the directory queue, and a child thread is started to traverse the subdirectories in the directory queue. Step (2) is repeated by starting child threads until the files under all subdirectories have been scanned.
(3) Finally, linked list A is traversed to obtain the information of the deleted files.
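The queue-driven, multi-threaded traversal of steps (1) and (2) might look like the Python sketch below. It is an illustration under assumptions, not the patented code: the function name `traverse`, the fixed pool of `num_threads` worker threads (the text speaks only of starting child threads), and the `(size, mtime)` result format are all choices made for the example. The comparison against the hash table and the linked-list bookkeeping of step (3) are left out.

```python
import os
import queue
import threading


def traverse(root, num_threads=4):
    """Return {path: (size, mtime)} for every regular file under root."""
    dir_queue = queue.Queue()
    results = {}
    lock = threading.Lock()

    def scan(directory):
        try:
            entries = list(os.scandir(directory))
        except OSError:
            return                                  # unreadable or vanished directory
        for entry in entries:
            try:
                if entry.is_dir(follow_symlinks=False):
                    dir_queue.put(entry.path)       # queue each direct subdirectory
                elif entry.is_file(follow_symlinks=False):
                    st = entry.stat(follow_symlinks=False)
                    with lock:
                        results[entry.path] = (st.st_size, st.st_mtime)
            except OSError:
                continue                            # entry vanished mid-scan

    def worker():
        while True:
            directory = dir_queue.get()
            try:
                if directory is None:               # sentinel: stop this thread
                    return
                scan(directory)
            finally:
                dir_queue.task_done()

    scan(root)                                      # step (1): main thread scans the root
    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    dir_queue.join()                                # step (2): wait for all queued subdirectories
    for _ in threads:
        dir_queue.put(None)                         # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Because each worker queues the subdirectories it finds before marking its own directory as done, `dir_queue.join()` returns only after the whole tree has been scanned.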
The embodiment of the present application further provides a device 100 for monitoring file changes, as shown in fig. 4. The monitoring file change apparatus 100 includes: a scanning module 110, a calculation module 120, and a determination module 130.
The scanning module 110 is configured to obtain and scan a current directory to be traversed from a directory queue, to obtain file information corresponding to each directly subordinate file in the current directory, where the file information includes a storage path and basic information. If the directory queue includes a plurality of directories to be traversed, the scanning module 110 is configured to acquire and scan the current directories to be traversed from the directory queue in a multi-threaded manner, where each thread corresponds to one current directory. Optionally, the scanning module 110 is further configured to, before obtaining and scanning the current directory to be traversed from the directory queue, obtain and scan a root directory to be traversed, and add all the directly subordinate subdirectories obtained by scanning the root directory into the directory queue as directories to be traversed. The scanning module 110 is also configured to, after obtaining and scanning the current directory to be traversed from the directory queue, add all the directly subordinate subdirectories obtained by scanning the current directory into the directory queue as directories to be traversed.
The calculating module 120 is configured to calculate, for each file, a hash value of the file according to the storage path of the file.
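The text does not name a hash function or a table size, so the following Python sketch simply assumes SHA-1 over the UTF-8 path and 4096 buckets to show how a storage path could be mapped to a header of the hash table. The names `path_hash`, `bucket_index`, and `NUM_BUCKETS` are hypothetical.

```python
import hashlib

NUM_BUCKETS = 4096  # assumed table size; the text only mentions "thousands of headers"


def path_hash(path):
    """Return a stable integer hash of a file's storage path."""
    digest = hashlib.sha1(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def bucket_index(path):
    """Map a storage path to a bucket (header) of the hash table."""
    return path_hash(path) % NUM_BUCKETS
```

Hashing the path rather than the file content keeps the per-file cost independent of file size, which is consistent with the text's use of size and last modification time to detect modifications.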
The determining module 130 is configured to compare the hash value and the basic information of the file with the hash value and the basic information of the file in the hash table stored in the previous cycle, and determine a change type of the file. Optionally, the determining module 130 is configured to determine that the change type of the file is new if the corresponding hash value cannot be found in the hash table stored in the previous cycle according to the hash value of the file;
if the corresponding hash value is found in the hash table stored in the previous period according to the hash value of the file, comparing the basic information of the file with the basic information of the file in the hash table stored in the previous period, if the basic information of the file is different from the basic information of the file in the previous period, determining that the change type of the file is modified, and if the basic information of the file is the same as the basic information of the file in the previous period, determining that the change type of the file is unchanged.
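The decision made by the determining module can be sketched as a small Python function. For simplicity the storage path itself stands in for its hash value as the lookup key, and the basic information is assumed to be a `(size, last_modified)` tuple; the name `change_type` and the returned strings are illustrative only.

```python
def change_type(path, info, previous):
    """Classify a file against the table stored in the previous cycle.

    previous maps a path (standing in for its hash value) to basic info,
    here a (size, last_modified) tuple.
    """
    if path not in previous:
        return "added"        # no matching hash value: new file
    if previous[path] != info:
        return "modified"     # size and/or last modification time differ
    return "unchanged"
```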
The apparatus 100 for monitoring file changes further includes a storage module. The storage module is configured to, after the determining module 130 compares the hash value and the basic information of the file with the hash value and the basic information of the file in the hash table stored in the previous cycle, store the hash value and the basic information of the file in the hash table of the current cycle, or store the file information of the file in the linked list of the current cycle.
The apparatus 100 for monitoring file changes further includes a deleting module. If the corresponding hash value is found in the hash table stored in the previous cycle according to the hash value of the file, the deleting module is configured to: delete the hash value and the basic information corresponding to the file from the hash table stored in the previous cycle, so that after all the directories to be traversed have been traversed, the files remaining in the hash table stored in the previous cycle are the files deleted in the current cycle; or delete the file information of the file from the linked list stored in the previous cycle, so that after all the directories to be traversed have been traversed, the files remaining in the linked list stored in the previous cycle are the files deleted in the current cycle.
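The hash-table variant of the storage and deleting modules can be combined into one cycle, as in the hedged Python sketch below. The function `run_cycle` is a hypothetical name, both tables are plain `dict` objects keyed by path, and the previous table is consumed as matching entries are removed from it.

```python
def run_cycle(scanned, previous):
    """Compare one scan against the previous cycle's table.

    scanned and previous both map path -> basic info.
    Returns (added, modified, deleted, current), where current is the
    table to keep for the next cycle. `previous` is consumed.
    """
    added, modified, current = [], [], {}
    for path, info in scanned.items():
        if path not in previous:
            added.append(path)
        elif previous.pop(path) != info:   # found: remove it from the old table
            modified.append(path)
        current[path] = info               # store in the current cycle's table
    deleted = list(previous)               # whatever is left was not seen again
    return added, modified, deleted, current
```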
The implementation principle and the resulting technical effect of the apparatus 100 for monitoring file changes provided in the embodiment of the present application are the same as those of the foregoing method embodiment. For brevity, for any part not mentioned in the apparatus embodiment, reference may be made to the corresponding content in the foregoing method embodiment.
As shown in fig. 5, fig. 5 is a block diagram illustrating a structure of an electronic device 200 according to an embodiment of the present disclosure. The electronic device 200 includes: a transceiver 210, a memory 220, a communication bus 230, and a processor 240.
The transceiver 210, the memory 220, and the processor 240 are electrically connected to each other, directly or indirectly, to enable data transmission or interaction. For example, these components may be electrically coupled to each other via one or more communication buses 230 or signal lines. The transceiver 210 is used for transmitting and receiving data. The memory 220 is used for storing a computer program, such as the software functional modules shown in fig. 4, i.e., the apparatus 100 for monitoring file changes. The apparatus 100 for monitoring file changes includes at least one software functional module, which may be stored in the memory 220 in the form of software or firmware, or fixed in an Operating System (OS) of the electronic device 200. The processor 240 is configured to execute the executable modules stored in the memory 220, such as the software functional modules or computer programs included in the apparatus 100 for monitoring file changes. For example, the processor 240 is configured to obtain and scan a current directory to be traversed from a directory queue, and obtain file information corresponding to each directly subordinate file in the current directory, where the file information includes a storage path and basic information; the processor 240 is also configured to calculate, for each file, the hash value of the file according to the storage path of the file, and to compare the hash value and the basic information of the file with the hash value and the basic information of the file in the hash table stored in the previous cycle to determine the change type of the file.
The memory 220 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 240 may be an integrated circuit chip having signal processing capabilities. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor 240 may be any conventional processor or the like.
The electronic device 200 includes, but is not limited to, a computer, a server, and the like.
The embodiment of the present application further provides a non-volatile computer-readable storage medium (hereinafter referred to as a storage medium), where the storage medium stores a computer program, and when the computer program is executed by a computer, such as the electronic device 200, the above method for monitoring file changes is performed.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that in essence contributes to the prior art, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a notebook computer, a server, or an electronic device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for monitoring file changes, comprising:
obtaining and scanning a current directory to be traversed from a directory queue to obtain file information corresponding to each directly subordinate file under the current directory, wherein the file information comprises: a storage path and basic information;
for each file, calculating a hash value of the file according to the storage path of the file;
and comparing the hash value and the basic information of the file with the hash value and the basic information of the file in the hash table stored in the previous period to determine the change type of the file.
2. The method according to claim 1, wherein after comparing the hash value and the basic information of the file with the hash value and the basic information of the file in the hash table stored in the previous cycle, the method further comprises:
and storing the hash value and the basic information of the file in a hash table of the current period, or storing the file information of the file in a linked list of the current period.
3. The method according to claim 1, wherein comparing the hash value and the basic information of the file with those of the file in the hash table stored in the previous cycle to determine the change type of the file comprises:
if the corresponding hash value cannot be found in the hash table stored in the previous period according to the hash value of the file, determining that the change type of the file is newly added;
if the corresponding hash value is found in the hash table stored in the previous period according to the hash value of the file, comparing the basic information of the file with the basic information of the file in the hash table stored in the previous period, if the basic information of the file is different from the basic information of the file in the previous period, determining that the change type of the file is modified, and if the basic information of the file is the same as the basic information of the file in the previous period, determining that the change type of the file is unchanged, wherein the basic information comprises the size of the file and/or the last modification time.
4. A method according to claim 3, wherein if the corresponding hash value is found in the hash table stored in the previous cycle according to the hash value of the file, the method further comprises:
deleting the hash value and the basic information corresponding to the file in the hash table stored in the previous period;
after all the directories to be traversed are traversed, the remaining files in the hash table stored in the previous period are the deleted files in the current period; or,
deleting the file information of the file in the linked list stored in the previous period;
after all the directories to be traversed are traversed, the remaining files in the linked list stored in the previous period are the deleted files in the current period.
5. The method of claim 1, wherein prior to obtaining and scanning a current directory to be traversed from a directory queue, the method further comprises:
and acquiring and scanning a root directory to be traversed, and adding all directly subordinate subdirectories obtained by scanning the root directory into the directory queue as directories to be traversed.
6. The method of claim 1, wherein after obtaining and scanning a current directory to be traversed from a directory queue, the method further comprises:
and adding all the directly subordinate subdirectories obtained by scanning the current directory into a directory queue as the directory to be traversed.
7. The method according to any one of claims 1-6, wherein if there are a plurality of directories to be traversed in the directory queue, acquiring and scanning a current directory to be traversed from the directory queue, comprising:
and acquiring and scanning the current directories to be traversed from the directory queue in a multithreaded manner, wherein the threads correspond one-to-one to the current directories.
8. An apparatus for monitoring file changes, comprising:
the scanning module is configured to acquire and scan a current directory to be traversed from a directory queue to obtain file information corresponding to each directly subordinate file in the current directory, where the file information includes: a storage path and basic information;
the computing module is used for computing the hash value of each file according to the storage path of the file;
and the determining module is used for comparing the hash value and the basic information of the file with the hash value and the basic information of the file in the hash table stored in the previous period to determine the change type of the file.
9. An electronic device, comprising:
a memory and a processor, the processor coupled to the memory;
the memory is used for storing programs;
the processor to invoke a program stored in the memory to perform the method of any of claims 1-7.
10. A storage medium having stored thereon a computer program which, when executed by a processor, performs the method according to any one of claims 1-7.
CN202010615999.6A 2020-06-29 2020-06-29 Method and device for monitoring file change, electronic equipment and storage medium Pending CN111769933A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010615999.6A CN111769933A (en) 2020-06-29 2020-06-29 Method and device for monitoring file change, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010615999.6A CN111769933A (en) 2020-06-29 2020-06-29 Method and device for monitoring file change, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111769933A true CN111769933A (en) 2020-10-13

Family

ID=72724336

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010615999.6A Pending CN111769933A (en) 2020-06-29 2020-06-29 Method and device for monitoring file change, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111769933A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102790790A (en) * 2011-10-21 2012-11-21 北京安天电子设备有限公司 Checking system and method for rapidly acquiring integrity of web server file
US20160055168A1 (en) * 2014-08-25 2016-02-25 Baidu Online Network Technology (Beijing) Co., Ltd Method and apparatus for scanning files
CN108874999A (en) * 2018-06-14 2018-11-23 成都傲梅科技有限公司 A method of the real-time synchronization based on Windows monitoring
CN109446160A (en) * 2018-11-06 2019-03-08 郑州云海信息技术有限公司 A kind of file reading, system, device and computer readable storage medium

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112306959A (en) * 2020-10-30 2021-02-02 广州朗国电子科技有限公司 File scanning method of mobile storage device, storage medium and device terminal
CN112422819A (en) * 2020-10-30 2021-02-26 西安万像电子科技有限公司 Image processing method, device, server and storage medium
CN112306959B (en) * 2020-10-30 2023-10-17 广州朗国电子科技股份有限公司 File scanning method of mobile storage device, storage medium and device terminal
CN113704176A (en) * 2021-07-09 2021-11-26 奇安信科技集团股份有限公司 File scanning method, file scanning device, electronic equipment, program product and storage medium
CN113704176B (en) * 2021-07-09 2023-10-31 奇安信科技集团股份有限公司 File scanning method, device, electronic equipment and storage medium
CN114327950A (en) * 2021-12-29 2022-04-12 北京诺禾致源科技股份有限公司 File system disk scanning method and device and file management system
CN114995913A (en) * 2022-06-10 2022-09-02 北京宇信科技集团股份有限公司 Method, device, medium and equipment for reading rule running file by decision engine
WO2024000497A1 (en) * 2022-06-30 2024-01-04 西门子(中国)有限公司 Security detection method and apparatus for memory, and computer device
CN115292257A (en) * 2022-10-09 2022-11-04 广州鲁邦通物联网科技股份有限公司 Method and system for detecting illegal deletion of file
CN116089364A (en) * 2023-04-11 2023-05-09 山东英信计算机技术有限公司 Storage file management method and device, AI platform and storage medium
CN116089364B (en) * 2023-04-11 2023-07-14 山东英信计算机技术有限公司 Storage file management method and device, AI platform and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201013