CN113704176A - File scanning method, file scanning device, electronic equipment, program product and storage medium - Google Patents

Info

Publication number
CN113704176A
CN113704176A CN202110778728.7A CN202110778728A CN113704176A CN 113704176 A CN113704176 A CN 113704176A CN 202110778728 A CN202110778728 A CN 202110778728A CN 113704176 A CN113704176 A CN 113704176A
Authority
CN
China
Prior art keywords
file
scanning
file block
block
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110778728.7A
Other languages
Chinese (zh)
Other versions
CN113704176B (en)
Inventor
刘锦锋
师庆志
周飘龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qianxin Technology Group Co Ltd
Secworld Information Technology Beijing Co Ltd
Original Assignee
Qianxin Technology Group Co Ltd
Secworld Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qianxin Technology Group Co Ltd and Secworld Information Technology Beijing Co Ltd
Priority to CN202110778728.7A
Publication of CN113704176A
Application granted
Publication of CN113704176B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/122 File system administration, e.g. details of archiving or snapshots using management policies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/137 Hash-based
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a file scanning method, a file scanning device, electronic equipment, a program product and a storage medium. The method comprises: after a trigger message for file scanning is received, performing a first scan on the file blocks corresponding to a file to be scanned to determine which file blocks have changed; and performing a second scan on the changed file blocks to determine which files have changed. The method reduces the number of scans, accurately determines the changed file or directory, and improves file detection efficiency.

Description

File scanning method, file scanning device, electronic equipment, program product and storage medium
Technical Field
The present invention relates to the field of information technology, and more particularly, to a file scanning method, apparatus, electronic device, program product, and storage medium.
Background
In service transmission based on a gatekeeper platform, ensuring accurate file transmission requires real-time monitoring of changes to file directories and files, such as creation, deletion or modification, so that operations such as reading, writing, deletion and renaming can be performed in response to those changes.
In the prior art, changes to file directories and files are monitored by timed polling. Typically, the directory and file information is recorded in a list; after each polling scan, the scan result is compared entry by entry with the list to identify attribute changes of directories or files, which are then processed accordingly.
Disclosure of Invention
The invention provides a file scanning method, a file scanning device, electronic equipment, a program product and a storage medium. They address the technical problem in the prior art that timed polling, combined with entry-by-entry comparison of scan results against list information, yields low detection efficiency, and they improve detection efficiency while ensuring the accuracy of file transmission.
In a first aspect, the present invention provides a file scanning method, including:
after a trigger message of file scanning is received, performing first scanning on a file block corresponding to a file to be scanned, and determining the file block with change;
and scanning the file blocks with the changes for the second time to determine the files with the changes.
According to a file scanning method provided by the present invention, the first scanning is performed on a file block corresponding to a file to be scanned, and a changed file block is determined, including:
acquiring directory information of the first file block and/or information of each contained file; the first file block is any one of file blocks corresponding to the file to be scanned;
calculating first summary data of the first file block according to the directory information of the first file block and/or the information of each contained file;
determining whether the first file block is a changed file block according to the first summary data of the first file block and the second summary data of the first file block; the second summary data of the first file block is obtained by scanning or is pre-stored under the condition that the first file block is not changed.
According to a file scanning method provided by the present invention, the calculating first summary data of the first file block according to directory information of the first file block and/or information of each included file includes:
and calculating first abstract data of the first file block by a fuzzy hash algorithm or cyclic redundancy check according to the directory information of the first file block and/or the name, the size and the modification time of each contained file.
According to a file scanning method provided by the present invention, the second scanning of the file block with the change to determine the file with the change includes:
receiving identification information of a second file block; wherein the second file block is the file block with the change determined by the first scanning;
acquiring directory information of the second file block and/or information of each contained file according to the identification information of the second file block;
determining a file with a change in the second file block according to the directory information of the second file block and/or the information of each included file, and the first directory information of the second file block and/or the first information of each included file; the first directory information of the second file block and/or the first information of each contained file is stored before the current file scanning operation.
According to the file scanning method provided by the invention, the first scanning of the file block corresponding to the file to be scanned comprises the following steps: carrying out first scanning on a plurality of file blocks corresponding to a file to be scanned in parallel;
correspondingly, the receiving the identification information of the second file block includes:
and respectively receiving the identification information of the second file block from the results of the plurality of parallel first scans in a message queue mode.
According to the file scanning method provided by the invention, the first directory information of the second file block and/or the first information of each contained file is stored in a red-black tree mode.
According to the file scanning method provided by the invention, the trigger message of the file scanning is generated under the condition that the file to be scanned is newly added or modified or deleted, or under the condition that the directory information of the file block corresponding to the file to be scanned is newly added or modified or deleted.
In a second aspect, the present invention further provides a file scanning apparatus, including:
the first scanning module is used for scanning a file block corresponding to a file to be scanned for the first time after receiving a trigger message of file scanning, and determining the file block with change;
and the second scanning module is used for scanning the changed file blocks for the second time and determining the changed files.
In a third aspect, the present invention provides an electronic device comprising:
a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform a method as described in any of the above.
In a fourth aspect, the present invention also provides a computer program product comprising computer-executable instructions, characterized in that the instructions, when executed, implement the steps of the file scanning method according to any of the above.
In a fifth aspect, the invention also provides a non-transitory computer readable storage medium storing computer instructions which cause the computer to perform the method as described in any one of the above.
The invention provides a file scanning method, a file scanning device, electronic equipment, a program product and a storage medium, wherein the method comprises the following steps: after a trigger message of file scanning is received, performing first scanning on a file block corresponding to a file to be scanned, and determining the file block with change; and scanning the file blocks with the changes for the second time to determine the files with the changes. According to the file scanning method, the first scanning and the second scanning of the file are achieved through the message triggering mode, the changed directories and files can be accurately determined, the file detection efficiency is improved, and the user experience is improved.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a file scanning method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of file directory scanning according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a second scan according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of data communication during multi-directory scanning according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of an application deployment architecture provided by the present invention;
Fig. 6 is a schematic structural diagram of a file scanning apparatus provided by the present invention;
Fig. 7 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
Fig. 1 is a schematic flowchart of a file scanning method according to the present invention. As shown in Fig. 1, the file scanning method provided by the present invention includes the following steps:
step 101: after a trigger message of file scanning is received, performing first scanning on a file block corresponding to a file to be scanned, and determining the file block with change;
step 102: and scanning the file blocks with the changes for the second time to determine the files with the changes.
In particular, the trigger message is a signal message issued upon occurrence of a trigger event for notifying the scanning component to perform a scanning operation.
In the embodiment of the invention, the scanning component is triggered in an event triggering mode, the corresponding file block in the file to be scanned is scanned for the first time, the changed file block is determined, then the scanning component is used for scanning the file block determined to be changed for the second time, and the changed file is accurately screened out. It should be noted that, in the embodiment, an event triggering manner is adopted, the file to be scanned is scanned only when the file information or the file block information changes, and if the file or the file block does not change, the scanning component is not triggered to perform the scanning operation.
In step 101, when a trigger message is received, the file blocks corresponding to the file to be scanned are scanned for the first time to determine which file blocks have changed. A file block may be a folder or any other container that stores a plurality of files. A change to a file block may be the addition, modification or deletion of a directory of the file block, or the addition, modification or deletion of a file within the file block.
In step 102, the second scanning only needs to scan the file blocks determined to be changed in the first scanning, so as to determine the changed file, where the changed file may be an addition, a modification, or a deletion of the file. The setting may be specifically performed according to the needs of the user, and is not specifically limited herein.
In the embodiment of the invention, an event trigger causes the scanning component to perform the first scan on the file blocks corresponding to the file to be scanned and determine the changed file blocks; only those changed file blocks are then scanned a second time to determine the changed files. The method reduces the number of scans, determines changed file blocks or files more quickly and accurately, and improves file detection efficiency.
In another embodiment of the present invention, the scanning the file block corresponding to the file to be scanned for the first time, and determining the file block with the change includes:
acquiring directory information of the first file block and/or information of each contained file; the first file block is any one of file blocks corresponding to the file to be scanned;
calculating first summary data of the first file block according to the directory information of the first file block and/or the information of each contained file;
determining whether the first file block is a changed file block according to the first summary data of the first file block and the second summary data of the first file block; the second summary data of the first file block is obtained by scanning or is pre-stored under the condition that the first file block is not changed.
Specifically, the file to be scanned may include a plurality of file blocks, and the first file block may be any one of the file blocks. The information of each file includes information such as a file name, a size, and a modification time of each file.
In the embodiment of the present invention, the first summary data of the first file block is calculated from the obtained directory information of the first file block and/or the information of each file it contains, and is then compared with the second summary data. If the two are not equal, the first file block is determined to be a changed file block. The first summary data and the second summary data may be calculated hash values.
It should be noted that the second summary data is obtained by scanning while the first file block is unchanged. For a first file block that has never changed, the second summary data may be obtained from the most recent scan of that file block or from any earlier scan. For a file block that is created or changed partway through operation, the second summary data may be obtained during the first scan after the file block is created and then used as the reference data against which the first summary data is verified. In other embodiments, the second summary data may be pre-stored reference data, for example data calculated in advance with a hash algorithm.
For example, as shown in Fig. 2, if folder A is designated as the first file block during service operation, the scanning component xscanner starts the first scan, traverses the directory information of folder A, stores the directory information and calculates the hash value of folder A. If folder A contains no files, the hash value is set to the initial value.
The directories A/B, A/B/C and A/D are then created, and the content is partitioned by folder to form file blocks. Folder A is scanned once, and its hash value is calculated from the directory information of folder A and the information of each file under the directory. The hash value is recorded in a list as the second summary data, which serves as the basis of comparison for each subsequent scan result.
When the directory information or the files inside folder A change, a trigger message is generated. In response, folder A is scanned for the first time to obtain its directory information and/or the information of each file under the directory, and the hash value of folder A is recalculated to obtain the first summary data. The first summary data is compared with the second summary data; if the two are not equal, folder A is determined to be a changed file block, a corresponding event is generated for the change, and the event is pushed through a callback function to other processing modules for handling.
In the embodiment of the invention, the first summary data of the first file block is calculated from the directory information of the first file block and/or the information of each file in the directory, and is compared with the second summary data to determine whether the first file block has changed. This reduces memory overhead, improves scanning efficiency and narrows the range within which changes must be located.
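The comparison of first and second summary data described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `folder_digest` and `is_changed`, the choice of SHA-256 as the hash, and the record layout fed to the hash are all hypothetical. The usage part rebuilds the A/B, A/B/C and A/D layout from the example in a temporary directory.

```python
import hashlib
import os
import tempfile


def folder_digest(path: str) -> str:
    """Hash a directory tree using directory names, file names, sizes and mtimes.

    File contents are never read, so the cost grows with the number of
    entries rather than with file sizes. SHA-256 is an assumption of this
    sketch; the source only speaks of a "hash value".
    """
    h = hashlib.sha256()
    for root, dirs, files in os.walk(path):
        dirs.sort()  # sorting in place makes the walk order deterministic
        rel = os.path.relpath(root, path)
        h.update(f"D|{rel}\n".encode("utf-8"))
        for name in sorted(files):
            st = os.stat(os.path.join(root, name))
            record = f"F|{rel}|{name}|{st.st_size}|{st.st_mtime_ns}\n"
            h.update(record.encode("utf-8"))
    return h.hexdigest()


def is_changed(path: str, second_summary: str) -> bool:
    """Compare the freshly computed (first) summary with the stored (second) one."""
    return folder_digest(path) != second_summary


# Usage: build A/B/C and A/D, store a baseline, then add a file under A/B.
with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "A")
    os.makedirs(os.path.join(a, "B", "C"))
    os.makedirs(os.path.join(a, "D"))
    baseline = folder_digest(a)            # plays the role of the second summary data
    unchanged = is_changed(a, baseline)    # nothing has happened yet
    with open(os.path.join(a, "B", "new.txt"), "w") as f:
        f.write("hello")
    changed = is_changed(a, baseline)      # a file was added inside the block
```

Because only one digest per file block is kept for this phase, the per-file records do not need to stay in memory between scans, which is consistent with the memory saving the embodiment claims.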
In another embodiment of the present invention, the calculating the first summary data of the first file block according to the directory information of the first file block and/or the information of each included file includes:
and calculating first abstract data of the first file block by a fuzzy hash algorithm or cyclic redundancy check according to the directory information of the first file block and/or the name, the size and the modification time of each contained file.
Specifically, the fuzzy hash algorithm, also known as context triggered piecewise hashing (CTPH), computes hashes over content-defined segments of the input; the similarity of two fuzzy hash values is obtained through a string similarity comparison algorithm, from which the degree of similarity of the two inputs is determined.
Cyclic redundancy check (CRC) is an error-detecting code that generates a short, fixed-length check value from data such as a network packet or a computer file, and is mainly used to detect errors that may occur when data is transmitted or stored.
In the embodiment of the present invention, the first digest data of the first file block is calculated based on a fuzzy hash algorithm or a cyclic redundancy check according to the directory information of the first file block and/or the name, size, and modification time of each included file. It should be noted that the first digest data may be a hash value, or may be a check code obtained through a cyclic redundancy check.
If the first file block is an A folder, scanning the A folder for the first time to obtain directory information of the A folder and each contained file information, storing information such as file name, size and modification time of each file in the A folder into a buffer, and calculating a hash value of the A folder according to the stored file information, namely obtaining first abstract data of the A folder.
In the embodiment of the invention, the first abstract data of the first file block is calculated based on a fuzzy hash algorithm or a cyclic redundancy check mode, so that the blocking processing of the file to be scanned is realized, a large amount of memory is released, the changed folder can be positioned more quickly and accurately, and the detection efficiency is improved.
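As a rough sketch of the cyclic redundancy check variant, the following computes a CRC-32 check value over the directory name and the name, size and modification time of each file, using `zlib.crc32` from the Python standard library. The `FileInfo` type, the `crc_summary` function and the record separator are assumptions made for the example. The fuzzy hash variant is not shown because the standard library contains no CTPH implementation.

```python
import zlib
from typing import Iterable, NamedTuple


class FileInfo(NamedTuple):
    """Hypothetical per-file record: name, size in bytes, modification time."""
    name: str
    size: int
    mtime: float


def crc_summary(directory: str, files: Iterable[FileInfo]) -> int:
    """CRC-32 check value over the directory name and the per-file records."""
    crc = zlib.crc32(directory.encode("utf-8"))
    for info in sorted(files):  # sort so that listing order does not matter
        record = f"{info.name}\0{info.size}\0{info.mtime}\n"
        crc = zlib.crc32(record.encode("utf-8"), crc)  # continue the running CRC
    return crc


before = [FileInfo("a.txt", 10, 1000.0), FileInfo("b.txt", 20, 1000.0)]
after = [FileInfo("b.txt", 20, 1000.0), FileInfo("a.txt", 11, 1005.0)]

# The same entries in a different order give the same check value.
same = crc_summary("A", before) == crc_summary("A", list(reversed(before)))
# A change in size and modification time of a.txt changes the check value.
differs = crc_summary("A", before) != crc_summary("A", after)
```

A 32-bit check value is small and cheap to store per file block, at the price of a small collision probability; whether that trade-off is acceptable depends on the deployment.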
In another embodiment of the present invention, the second scanning the file blocks with changes to determine the file with changes includes:
receiving identification information of a second file block; wherein the second file block is the file block with the change determined by the first scanning;
acquiring directory information of the second file block and/or information of each contained file according to the identification information of the second file block;
determining a file with a change in the second file block according to the directory information of the second file block and/or the information of each included file, and the first directory information of the second file block and/or the first information of each included file; the first directory information of the second file block and/or the first information of each contained file is stored before the current file scanning operation.
Specifically, the identification information refers to information such as the name of the file block that changes after the first scanning is finished, and the file block that needs to be scanned for the second time can be accurately determined according to the identification information.
In the embodiment of the present invention, a second file block that needs to be scanned for the second time is determined according to the identification information, the second file block is scanned for the second time, directory information of the second file block and/or file information included in the second file block are obtained, and compared with first directory information of the second file block and/or file information included in the second file block obtained by last scanning, and a changed file is screened out. The file information is detailed information of all files in the directory, such as file name, file size, modification time, and the like.
For example, as shown in Fig. 3, after the first scan of folder A finishes and folder A is determined to be a changed file block, its identification information is sent to the scanning component that performs the second scan. On receiving the instruction and the identification information, that component performs the second scan on folder A to obtain the directory information of folder A and the information of each file, and compares the result with the previous scan result for folder A, that is, with the stored directory information and the detailed information of each file, to determine the changed files. A corresponding event is then generated for each change and provided to other processing modules for handling.
In the embodiment of the invention, the directory information of the second file block and/or the information of each contained file are obtained by scanning the second file block, and the result obtained by the current scanning is compared with the record information stored in the last scanning, so that the changed files are screened out. The invention can accurately determine the changed file by only scanning the changed file blocks for the second time, thereby improving the scanning efficiency.
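The second scan amounts to comparing the current listing of a changed file block with the listing stored by the previous scan. A minimal sketch follows, assuming each listing is held as a mapping from file name to size and modification time; the names `FileInfo`, `Snapshot` and `diff_block` and the event labels are hypothetical.

```python
from typing import Dict, List, NamedTuple


class FileInfo(NamedTuple):
    """Hypothetical per-file record kept between scans."""
    size: int
    mtime: float


Snapshot = Dict[str, FileInfo]  # file name -> info, for one file block


def diff_block(stored: Snapshot, current: Snapshot) -> Dict[str, List[str]]:
    """Return the added, modified and deleted file names of one file block."""
    added = sorted(current.keys() - stored.keys())
    deleted = sorted(stored.keys() - current.keys())
    modified = sorted(
        name for name in current.keys() & stored.keys()
        if current[name] != stored[name]
    )
    return {"added": added, "modified": modified, "deleted": deleted}


stored = {
    "a.txt": FileInfo(10, 1000.0),
    "b.txt": FileInfo(20, 1000.0),
    "c.txt": FileInfo(30, 1000.0),
}
current = {
    "a.txt": FileInfo(10, 1000.0),   # untouched
    "b.txt": FileInfo(25, 1010.0),   # size and mtime differ
    "d.txt": FileInfo(5, 1020.0),    # new file; c.txt is gone
}
events = diff_block(stored, current)
```

After the comparison, `current` would replace `stored` so that the next second scan of this block starts from the updated record.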
In another embodiment of the present invention, the scanning the file block corresponding to the file to be scanned for the first time includes: carrying out first scanning on a plurality of file blocks corresponding to a file to be scanned in parallel;
correspondingly, the receiving the identification information of the second file block includes:
and respectively receiving the identification information of the second file block from the results of the plurality of parallel first scans in a message queue mode.
In particular, a message queue is a container that holds messages during their transmission.
In the embodiment of the present invention, the scanning component performs the first scan on a plurality of file blocks corresponding to the file to be scanned in parallel, and pushes the results of the parallel scans through a message queue to the scanning component that performs the second scan. It should be noted that the scanning component uses XFPI as its file monitoring mode and can support the File Transfer Protocol (FTP), software implementing the SMB protocol, the Network File System (NFS), VSFTP services and local directories; monitored directories are registered in the form of a uniform resource locator (URL). A specific example follows.
For example, as shown in Fig. 4, when a plurality of file blocks are scanned in parallel in the first scan, the identification information of each file block found to have changed is pushed through a message queue to the scanning component that performs the second scan; the message queue acts as the hub. On the left side of the message queue is the first scan, whose file block information comprises the directory information and directory hash values obtained by monitoring the file blocks to be scanned in real time. On the right side is the second scan: the changed file blocks are delivered through the message queue, the changed file block is identified first, and the directory and file information obtained by the second scan is then compared with the file information stored by the previous second scan to determine the changed files. The monitored directory of each file block has its own storage list, identified by a directory number (dirid), so that 63 directories can be monitored concurrently.
In this embodiment, the identification information of the file blocks found to have changed in the first scan is received through the message queue, which accommodates both multi-threaded and multi-process operation and gives higher transmission efficiency.
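The hand-off between parallel first scans and the second scan can be sketched with a thread pool and an in-process `queue.Queue`. This is only an illustration: the source does not specify the queue technology, and the function names, the `"name|size|mtime"` record strings and the use of SHA-256 are assumptions. Only the `dirid` identifier is taken from the description.

```python
import hashlib
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def listing_digest(listing: List[str]) -> str:
    """Digest over the sorted 'name|size|mtime' records of one directory."""
    joined = "\n".join(sorted(listing))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def first_scan(dirid: int, listing: List[str],
               stored: Dict[int, str], out: queue.Queue) -> None:
    """First scan of one file block; enqueue its dirid if the digest changed."""
    if listing_digest(listing) != stored.get(dirid):
        out.put(dirid)


def changed_blocks(blocks: Dict[int, List[str]],
                   stored: Dict[int, str], workers: int = 4) -> List[int]:
    """Run the first scans in parallel and collect the changed dirids."""
    out: queue.Queue = queue.Queue()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for dirid, listing in blocks.items():
            pool.submit(first_scan, dirid, listing, stored, out)
    # Leaving the with-block waits for every first scan to finish.
    changed = []
    while not out.empty():
        changed.append(out.get())
    return sorted(changed)


blocks = {1: ["a.txt|10|1000"], 2: ["b.txt|20|1000"], 3: ["c.txt|30|1000"]}
stored = {dirid: listing_digest(listing) for dirid, listing in blocks.items()}
blocks[2] = ["b.txt|21|1005"]                    # file modified in block 2
blocks[3] = ["c.txt|30|1000", "d.txt|5|1006"]    # file added in block 3
to_rescan = changed_blocks(blocks, stored)       # only these need a second scan
```

In a multi-process deployment the in-process queue would be replaced by an inter-process queue or a message broker; the producer and consumer roles stay the same.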
In another embodiment of the present invention, the first directory information of the second file block and/or the first information of each included file is stored in a red-black tree manner.
Specifically, a red-black tree is a self-balancing binary search tree, a data structure widely used in computer science.
In the embodiment of the invention, the first directory information obtained by each scan and the first information of each file are stored in a red-black tree. The tree holds the detailed information of all files under the monitored directory of each file block, such as file names, file sizes and modification times, and so provides the data for the next file scan.
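To make the storage structure concrete, here is a compact sketch of a red-black tree keyed by file name. It uses the left-leaning variant (a simplification chosen for brevity, not something the source specifies) and supports only insertion, lookup and in-order traversal; deletion is omitted. The class name `FileIndex` and the stored `(size, mtime)` tuples are hypothetical.

```python
import math
from typing import Any, Iterator, Optional, Tuple

RED, BLACK = True, False


class _Node:
    def __init__(self, key: str, value: Any) -> None:
        self.key, self.value = key, value
        self.left: Optional["_Node"] = None
        self.right: Optional["_Node"] = None
        self.color = RED  # new nodes always start red


def _is_red(n: Optional[_Node]) -> bool:
    return n is not None and n.color == RED


def _rotate_left(h: _Node) -> _Node:
    x = h.right
    h.right, x.left = x.left, h
    x.color, h.color = h.color, RED
    return x


def _rotate_right(h: _Node) -> _Node:
    x = h.left
    h.left, x.right = x.right, h
    x.color, h.color = h.color, RED
    return x


def _insert(h: Optional[_Node], key: str, value: Any) -> _Node:
    if h is None:
        return _Node(key, value)
    if key < h.key:
        h.left = _insert(h.left, key, value)
    elif key > h.key:
        h.right = _insert(h.right, key, value)
    else:
        h.value = value  # existing file: overwrite its stored info
    # Restore the left-leaning red-black invariants on the way back up.
    if _is_red(h.right) and not _is_red(h.left):
        h = _rotate_left(h)
    if _is_red(h.left) and _is_red(h.left.left):
        h = _rotate_right(h)
    if _is_red(h.left) and _is_red(h.right):
        h.color, h.left.color, h.right.color = RED, BLACK, BLACK
    return h


class FileIndex:
    """File name -> file info, kept in a left-leaning red-black tree."""

    def __init__(self) -> None:
        self._root: Optional[_Node] = None

    def put(self, name: str, info: Any) -> None:
        self._root = _insert(self._root, name, info)
        self._root.color = BLACK  # the root is always black

    def get(self, name: str) -> Any:
        n = self._root
        while n is not None:
            if name == n.key:
                return n.value
            n = n.left if name < n.key else n.right
        return None

    def items(self) -> Iterator[Tuple[str, Any]]:
        stack, n = [], self._root
        while stack or n is not None:
            while n is not None:
                stack.append(n)
                n = n.left
            n = stack.pop()
            yield n.key, n.value
            n = n.right

    def height(self) -> int:
        def h(n: Optional[_Node]) -> int:
            return 0 if n is None else 1 + max(h(n.left), h(n.right))
        return h(self._root)


index = FileIndex()
for i in range(100):  # sorted insertion order, the worst case for a plain BST
    index.put(f"file{i:03d}.txt", (i, 1000.0 + i))
names = [name for name, _ in index.items()]
```

Keeping the records ordered by name gives logarithmic lookup of any file during the second scan and an ordered traversal for comparing a whole listing.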
In another embodiment of the present invention, the trigger message of file scanning is generated when the file to be scanned is newly added or modified or deleted, or when the directory information of the file block corresponding to the file to be scanned is newly added or modified or deleted.
In the embodiment of the present invention, the trigger message that triggers the file scanning operation is generated when the file to be scanned, or the directory information of the file block corresponding to it, changes. The change may be the addition, deletion or modification of directory information, or the addition, deletion or modification of a file; for example, when a file or directory is added, the scanning component is triggered to perform the first scan. It should be noted that the changes are not limited to addition, modification or deletion and may be of other kinds, which are not specifically limited herein.
In the embodiment of the invention, the first scanning operation is triggered by setting the trigger message generated under the condition that the file or directory information to be scanned is newly added or modified or deleted, so that the scanning period is reduced and the working efficiency is improved.
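The difference from timed polling can be illustrated with a scanner that does nothing until a trigger message arrives. The source does not say how the underlying change is detected or what a trigger message contains, so the `TriggerMessage` fields, the `Change` enumeration and the `Scanner` class below are purely hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Change(Enum):
    ADDED = "added"
    MODIFIED = "modified"
    DELETED = "deleted"


@dataclass(frozen=True)
class TriggerMessage:
    """Hypothetical trigger message: which block, which path, what happened."""
    block: str
    path: str
    change: Change


class Scanner:
    """Runs the first scan only when a trigger message is received."""

    def __init__(self, first_scan: Callable[[str], None]) -> None:
        self._first_scan = first_scan
        self.scans = 0

    def on_trigger(self, message: TriggerMessage) -> None:
        self.scans += 1
        self._first_scan(message.block)  # scan the affected file block


scanned: List[str] = []
scanner = Scanner(scanned.append)  # stand-in first scan: record the block name

idle_scans = scanner.scans  # no trigger yet, so no scan has run
scanner.on_trigger(TriggerMessage("A", "A/B/report.txt", Change.ADDED))
scanner.on_trigger(TriggerMessage("A", "A/D", Change.DELETED))
```

With polling, the scan count grows with elapsed time; here it grows only with the number of change events.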
In another embodiment of the present invention, an application deployment architecture is provided. As shown in Fig. 5, it comprises a business logic layer, a scanning component layer, a distribution component layer, a transmission component layer and a file service layer, wherein:
Business logic layer: serves embedded file transmission services, client file transmission services, server file transmission services and the like. It configures the services of the scanning, distribution and transmission components, registers callback functions and integrates processing data. It pushes the file processing events obtained by the scanning component to the distribution component, which distributes them to the transmission component to implement the file synchronization logic.
Distribution component layer: may be composed of message queues and provides multi-path transmission, making full use of network bandwidth and improving file transmission efficiency.
Transmission component layer: implements the file transmission function.
Scanning component layer: performs the first scan of the file blocks corresponding to the file to be scanned and the second scan of the changed file blocks. It monitors and captures events such as the addition and deletion of file directories on a local directory or a remote NAS server, and reports file changes to the transmission component promptly, efficiently and accurately for service processing.
File service layer: provides a set of function interfaces for reading, writing, adding, deleting and modifying local files and files on a remote NAS service.
In the embodiment of the invention, the file scanning detection work can be better realized by providing the application deployment architecture diagram.
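The flow between the layers described above (scanning events pushed to a message queue and distributed to several transmission workers) can be sketched as follows. This is a minimal sketch under assumptions: `run_pipeline`, the `transmit` callback, and the `workers` parameter are hypothetical names, and an in-process `queue.Queue` stands in for whatever message queue an actual deployment would use.

```python
import queue
import threading
from typing import Callable, List


def run_pipeline(events: List[str], transmit: Callable[[str], None], workers: int = 2) -> None:
    """Push scan events through a message queue to several transmission workers."""
    q: "queue.Queue[object]" = queue.Queue()
    stop = object()  # sentinel telling a worker to exit

    def worker() -> None:
        # Transmission component: take events from the queue and send them.
        while True:
            item = q.get()
            if item is stop:
                break
            transmit(item)  # callback registered by the business logic layer

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    # Business logic layer: forward events from the scanning component
    # to the distribution component (the queue).
    for event in events:
        q.put(event)
    # One sentinel per worker so that every worker terminates.
    for _ in threads:
        q.put(stop)
    for t in threads:
        t.join()
```

Because several workers read from the same queue, events may be transmitted in a different order than they were produced; a design that requires ordering would need one queue per path or sequence numbers.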
In another embodiment of the present invention, the dual-thread working mode is divided into a first scan and a second scan, wherein:
First scan: mainly used to calculate the hash values of the file to be scanned, namely the hash values of the root directory and its subdirectories, to store these hash values in a hash table, and to monitor changes to the directory and its subdirectories.
Second scan: a red-black tree data model is used to store the detailed information of all files in the monitored directory, such as file name, file size, and modification time. This information is used for comparison in subsequent scans and for precisely locating changed files.
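The two scans described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: each directory is treated as one file block, CRC-32 from the standard `zlib` module is used as the digest (cyclic redundancy check is one of the options named in the claims), and a plain `dict` stands in for the red-black tree because the Python standard library does not provide one. The names `block_digest`, `first_scan`, and `second_scan` are hypothetical.

```python
import zlib
from typing import Dict, List, Tuple

FileInfo = Tuple[int, float]   # (size, modification time)
Block = Dict[str, FileInfo]    # file name -> info for one directory ("file block")


def block_digest(directory: str, files: Block) -> int:
    """CRC-32 over the directory name and each file's name, size and mtime."""
    parts = [directory] + [
        f"{name}|{size}|{mtime}" for name, (size, mtime) in sorted(files.items())
    ]
    return zlib.crc32("\n".join(parts).encode("utf-8"))


def first_scan(current: Dict[str, Block], digests: Dict[str, int]) -> List[str]:
    """Return the blocks whose digest differs from the stored one; update the hash table."""
    changed: List[str] = []
    for directory in sorted(current.keys() | digests.keys()):
        new = block_digest(directory, current[directory]) if directory in current else None
        if digests.get(directory) != new:
            changed.append(directory)
        if new is None:
            digests.pop(directory, None)   # block was deleted
        else:
            digests[directory] = new
    return changed


def second_scan(directory: str, current: Block, stored: Dict[str, Block]) -> Dict[str, str]:
    """Compare per-file details of one changed block with the stored details."""
    old = stored.get(directory, {})
    result: Dict[str, str] = {}
    for name in sorted(current.keys() | old.keys()):
        if name not in old:
            result[name] = "added"
        elif name not in current:
            result[name] = "deleted"
        elif old[name] != current[name]:
            result[name] = "modified"
    stored[directory] = dict(current)      # remember details for the next scan
    return result
```

Only the blocks returned by `first_scan` need to be passed to `second_scan`, which is what keeps the detailed per-file comparison limited to the parts of the tree that actually changed.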
Fig. 6 shows a file scanning apparatus provided by the present invention. As shown in Fig. 6, the file scanning apparatus includes:
the first scanning module 601, configured to, after receiving a trigger message for file scanning, perform a first scan on the file blocks corresponding to the file to be scanned and determine the file block with a change;
the second scanning module 602, configured to perform a second scan on the file block with the change and determine the file with a change.
In the file scanning apparatus provided in the embodiment of the present invention, the first scanning module is configured to, after receiving a trigger message for file scanning, perform a first scan on the file blocks corresponding to the file to be scanned to determine the file block with a change, and the second scanning module is configured to perform a second scan on the file block with the change to determine the file with a change. The apparatus provided by the invention can reduce the number of scans, accurately determine the changed files or directories, and improve file detection efficiency.
Since the apparatus according to the embodiment of the present invention works on the same principle as the method of the above embodiments, further details are not repeated here.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 7, the present invention provides an electronic device, including: a processor 701, a memory 702, and a bus 703;
the processor 701 and the memory 702 communicate with each other through the bus 703;
processor 701 is configured to call program instructions in memory 702 to perform the methods provided by the above-described method embodiments, including, for example: after a trigger message of file scanning is received, performing first scanning on a file block corresponding to a file to be scanned, and determining the file block with change; and scanning the file blocks with the changes for the second time to determine the files with the changes.
The present embodiment provides a computer program product, which includes computer executable instructions that, when executed, implement the steps of the file scanning method according to any one of the foregoing embodiments, for example, including: after a trigger message of file scanning is received, performing first scanning on a file block corresponding to a file to be scanned, and determining the file block with change; and scanning the file blocks with the changes for the second time to determine the files with the changes.
The present embodiments provide a non-transitory computer-readable storage medium storing computer instructions that cause the computer to perform the methods provided by the above method embodiments, for example, including: after a trigger message of file scanning is received, performing first scanning on a file block corresponding to a file to be scanned, and determining the file block with change; and scanning the file blocks with the changes for the second time to determine the files with the changes.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium; when executed, the program performs the steps of the method embodiments; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (11)

1. A file scanning method, comprising:
after a trigger message of file scanning is received, performing first scanning on a file block corresponding to a file to be scanned, and determining the file block with change;
and scanning the file blocks with the changes for the second time to determine the files with the changes.
2. The method according to claim 1, wherein the first scanning of the file block corresponding to the file to be scanned to determine the file block with a change comprises:
acquiring directory information of the first file block and/or information of each contained file; the first file block is any one of file blocks corresponding to the file to be scanned;
calculating first summary data of the first file block according to the directory information of the first file block and/or the information of each contained file;
determining whether the first file block is a changed file block according to the first summary data of the first file block and the second summary data of the first file block; the second summary data of the first file block is obtained by scanning or is pre-stored under the condition that the first file block is not changed.
3. The method according to claim 2, wherein said calculating the first summary data of the first file block according to the directory information of the first file block and/or the information of each included file comprises:
and calculating the first summary data of the first file block by a fuzzy hash algorithm or cyclic redundancy check according to the directory information of the first file block and/or the name, the size and the modification time of each contained file.
4. The method of claim 1, wherein the second scanning the changed file blocks to determine the changed file comprises:
receiving identification information of a second file block; wherein the second file block is the file block with the change determined by the first scanning;
acquiring directory information of the second file block and/or information of each contained file according to the identification information of the second file block;
determining a file with a change in the second file block according to the directory information of the second file block and/or the information of each included file, and the first directory information of the second file block and/or the first information of each included file; the first directory information of the second file block and/or the first information of each contained file is stored before the current file scanning operation.
5. The method according to claim 4, wherein the first scanning the file block corresponding to the file to be scanned comprises: carrying out first scanning on a plurality of file blocks corresponding to a file to be scanned in parallel;
correspondingly, the receiving the identification information of the second file block includes:
and respectively receiving the identification information of the second file block from the results of the plurality of parallel first scans in a message queue mode.
6. The method according to claim 4, wherein the first directory information of the second file block and/or the first information of each included file is stored in a red-black tree.
7. The method according to any one of claims 1 to 6, wherein the trigger message for file scanning is generated when the file to be scanned is newly added or modified or deleted, or when the directory information of the file block corresponding to the file to be scanned is newly added or modified or deleted.
8. A file scanning apparatus, comprising:
the first scanning module is used for scanning a file block corresponding to a file to be scanned for the first time after receiving a trigger message of file scanning, and determining the file block with change;
and the second scanning module is used for scanning the changed file blocks for the second time and determining the changed files.
9. An electronic device, comprising:
a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the steps of the file scanning method of any of claims 1 to 7.
10. A computer program product comprising computer executable instructions for performing the steps of the file scanning method of any one of claims 1 to 7 when executed.
11. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the steps of the file scanning method according to any one of claims 1 to 7.
CN202110778728.7A 2021-07-09 2021-07-09 File scanning method, device, electronic equipment and storage medium Active CN113704176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110778728.7A CN113704176B (en) 2021-07-09 2021-07-09 File scanning method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110778728.7A CN113704176B (en) 2021-07-09 2021-07-09 File scanning method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113704176A true CN113704176A (en) 2021-11-26
CN113704176B CN113704176B (en) 2023-10-31

Family

ID=78648382

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110778728.7A Active CN113704176B (en) 2021-07-09 2021-07-09 File scanning method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113704176B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104765740A (en) * 2014-01-03 2015-07-08 腾讯科技(深圳)有限公司 File scanning control method and device
CN107247722A (en) * 2017-04-25 2017-10-13 北京金山安全软件有限公司 File scanning method and device and intelligent terminal
CN108052575A (en) * 2017-12-08 2018-05-18 深圳市创维软件有限公司 File scanning method, equipment and storage medium
CN108446407A (en) * 2018-04-12 2018-08-24 北京百度网讯科技有限公司 Database audit method based on block chain and device
CN108932236A (en) * 2017-05-22 2018-12-04 北京金山云网络技术有限公司 A kind of file management method, scratch file delet method and device
CN111382123A (en) * 2018-12-28 2020-07-07 广州市百果园信息技术有限公司 File storage method, device, equipment and storage medium
CN111769933A (en) * 2020-06-29 2020-10-13 北京天融信网络安全技术有限公司 Method and device for monitoring file change, electronic equipment and storage medium
CN112416787A (en) * 2020-11-27 2021-02-26 平安普惠企业管理有限公司 JAVA-based project source code scanning analysis method, system and storage medium
CN112905539A (en) * 2021-03-25 2021-06-04 芝麻链(北京)科技有限公司 Automatic data storage method and device based on message digest

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104765740A (en) * 2014-01-03 2015-07-08 腾讯科技(深圳)有限公司 File scanning control method and device
CN107247722A (en) * 2017-04-25 2017-10-13 北京金山安全软件有限公司 File scanning method and device and intelligent terminal
US20180307700A1 (en) * 2017-04-25 2018-10-25 Beijing Kingsoft Internet Security Software Co., Ltd. Method and apparatus for scanning files and intelligent terminal
CN108932236A (en) * 2017-05-22 2018-12-04 北京金山云网络技术有限公司 A kind of file management method, scratch file delet method and device
CN108052575A (en) * 2017-12-08 2018-05-18 深圳市创维软件有限公司 File scanning method, equipment and storage medium
CN108446407A (en) * 2018-04-12 2018-08-24 北京百度网讯科技有限公司 Database audit method based on block chain and device
CN111382123A (en) * 2018-12-28 2020-07-07 广州市百果园信息技术有限公司 File storage method, device, equipment and storage medium
CN111769933A (en) * 2020-06-29 2020-10-13 北京天融信网络安全技术有限公司 Method and device for monitoring file change, electronic equipment and storage medium
CN112416787A (en) * 2020-11-27 2021-02-26 平安普惠企业管理有限公司 JAVA-based project source code scanning analysis method, system and storage medium
CN112905539A (en) * 2021-03-25 2021-06-04 芝麻链(北京)科技有限公司 Automatic data storage method and device based on message digest

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhou ?: "Design and Implementation of a Blockchain-Based Reconciliation ***", Software Engineering, no. 02 *

Also Published As

Publication number Publication date
CN113704176B (en) 2023-10-31

Similar Documents

Publication Publication Date Title
CN109034993B (en) Account checking method, account checking equipment, account checking system and computer readable storage medium
CN107506451B (en) Abnormal information monitoring method and device for data interaction
US11429566B2 (en) Approach for a controllable trade-off between cost and availability of indexed data in a cloud log aggregation solution such as splunk or sumo
CN110019873B (en) Face data processing method, device and equipment
CN108228322B (en) Distributed link tracking and analyzing method, server and global scheduler
CN111949611B (en) File processing method, system, device and medium
CN110717130B (en) Dotting method, dotting device, dotting terminal and storage medium
CN114020522A (en) Data backup method and device, electronic equipment and system
US11308212B1 (en) Adjudicating files by classifying directories based on collected telemetry data
CN113704176A (en) File scanning method, file scanning device, electronic equipment, program product and storage medium
CN112688905B (en) Data transmission method, device, client, server and storage medium
CN112199529A (en) Picture processing method and device, electronic equipment and storage medium
CN112395296A (en) Big data archiving method, device, equipment and storage medium
CN111625853B (en) Snapshot processing method, device and equipment and readable storage medium
CN111309689A (en) File duplicate checking method and device
CN114153647B (en) Rapid data verification method, device and system for cloud storage system
CN116521652B (en) Method, system and medium for realizing migration of distributed heterogeneous database based on DataX
CN117579617B (en) Data transmission method and device based on information security
CN112291312B (en) ETL data synchronization method and device, electronic equipment and storage medium
CN111274350B (en) Data processing method, device, computer equipment and storage medium
CN111061888B (en) Image acquisition method and system
CN109547290B (en) Cloud platform garbage data detection processing method, device, equipment and storage medium
CN110109883B (en) File filtering and storing method and device
CN116414772A (en) Data dump method, device, equipment and storage medium
CN118233449A (en) File transfer method, file transfer system, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 332, 3 / F, Building 102, 28 xinjiekouwei street, Xicheng District, Beijing 100088

Applicant after: QAX Technology Group Inc.

Applicant after: Qianxin Wangshen information technology (Beijing) Co.,Ltd.

Address before: Room 332, 3 / F, Building 102, 28 xinjiekouwei street, Xicheng District, Beijing 100088

Applicant before: QAX Technology Group Inc.

Applicant before: LEGENDSEC INFORMATION TECHNOLOGY (BEIJING) Inc.

GR01 Patent grant