CN112463734A - File retrieval method, system and related device - Google Patents

File retrieval method, system and related device

Publication number
CN112463734A
Authority
CN
China
Prior art keywords
file
storage
offset
retrieval
reading
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011314505.7A
Other languages
Chinese (zh)
Inventor
贾伟
赵相如
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202011314505.7A
Publication of CN112463734A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a file retrieval method comprising: receiving a retrieval request that includes a retrieval keyword; determining, in an index file, the offset of the storage file corresponding to the retrieval keyword; and reading the storage location of the corresponding storage file by using the offset, and responding to the retrieval request according to the storage location. An index file is configured in the file system, and the offset of each storage file is recorded in the index file; each offset includes the storage location of the corresponding storage file. The storage file corresponding to an input retrieval keyword can thus be determined, its offset looked up in the index file, and the file to be retrieved located quickly. This improves the efficiency of file operations and the speed of file reads and writes, enabling fast retrieval and modification of files. The application also provides a file retrieval system, a computer-readable storage medium and an electronic device, which have the same beneficial effects.

Description

File retrieval method, system and related device
Technical Field
The present application relates to the field of computers, and in particular, to a method, a system, and a related apparatus for retrieving a file.
Background
When it is necessary to read configuration or storage files, reading them by the conventional method is time-consuming: to read specific data or fields in a file, the whole file must be traversed to retrieve the desired data. As a result, project running time is greatly prolonged and file reading speed is reduced.
Therefore, how to improve the file retrieval efficiency is a technical problem that needs to be solved urgently by those skilled in the art.
Disclosure of Invention
An object of the present application is to provide a file retrieval method, a file retrieval system, a computer-readable storage medium, and an electronic device, which can improve file retrieval efficiency.
In order to solve the above technical problem, the present application provides a file retrieval method, and the specific technical scheme is as follows:
receiving a retrieval request, wherein the retrieval request comprises a retrieval keyword;
determining the offset of the storage file corresponding to the retrieval keyword in the index file;
and reading the storage position of the corresponding storage file by using the offset, and responding to the retrieval request according to the storage position.
Optionally, before determining the offset of the file corresponding to the search keyword in the index file, the method further includes:
iteratively acquiring storage file records and creating an index file, wherein the index file comprises key-value pairs, and each key-value pair comprises the keyword of a storage file record and the offset of that keyword in the storage file.
Optionally, after determining the offset of the storage file corresponding to the search keyword in the index file, the method further includes:
determining an offset representation mode according to the offset size; the representation mode comprises a file start position offset and a file end position offset;
correspondingly, reading the storage location of the corresponding storage file by using the offset includes:
and reading the storage position of the corresponding storage file by using the file starting position offset or the file ending position offset.
Optionally, reading the storage location of the corresponding storage file by using the offset includes:
determining a start variable and an end variable using the offset;
and reading the storage position of the corresponding storage file according to the starting variable and the ending variable.
Optionally, reading the storage location of the corresponding storage file according to the start variable and the end variable includes:
reading the file by using the IO stream, pointing a reading pointer of the storage file to the start variable, and reading data backwards from the start variable until the end variable is read to obtain a storage position corresponding to the storage file.
Optionally, after responding to the retrieval request according to the storage location, the method further includes:
receiving a file modification request;
determining a modification pointer corresponding to the file modification request according to the offset;
and modifying the storage file corresponding to the modification pointer.
Optionally, modifying the storage file corresponding to the modification pointer includes:
deleting the storage file corresponding to the modification pointer;
and storing a modified file, and adding a key value pair corresponding to the modified file to the index file.
The present application further provides a file retrieval system, comprising:
the request receiving module is used for receiving a retrieval request, and the retrieval request comprises a retrieval keyword;
the offset determining module is used for determining the offset of the storage file corresponding to the retrieval keyword in the index file;
and the file confirmation module is used for reading the storage position of the corresponding storage file by using the offset and responding to the retrieval request according to the storage position.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method as set forth above.
The present application further provides an electronic device, comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the method described above when calling the computer program in the memory.
The application provides a file retrieval method comprising: receiving a retrieval request that includes a retrieval keyword; determining, in an index file, the offset of the storage file corresponding to the retrieval keyword; and reading the storage location of the corresponding storage file by using the offset, and responding to the retrieval request according to the storage location.
An index file is configured in the file system, and the offset of each storage file is recorded in the index file; each offset includes the storage location of the corresponding storage file. The storage file corresponding to an input retrieval keyword can thus be determined, its offset looked up in the index file, and the file to be retrieved located quickly. This improves the efficiency of file operations and the speed of file reads and writes, enabling fast retrieval and modification of files.
The application also provides a file retrieval system, a computer-readable storage medium and an electronic device, which have the same beneficial effects; these are not repeated here.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart of a file retrieval method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a file retrieval system according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a flowchart of a file retrieval method according to an embodiment of the present application, where the method includes:
S101: receiving a retrieval request, wherein the retrieval request comprises a retrieval keyword;
This step receives a retrieval request, which includes at least a retrieval keyword. The retrieval keyword is one or more of a file name and a file attribute. The file name need not be the full name of the file; the corresponding storage file can be determined by fuzzy search.
S102: determining the offset of the storage file corresponding to the retrieval keyword in the index file;
This step determines the offset of the storage file corresponding to the retrieval keyword by using the index file. This embodiment assumes by default that, before this step is executed, the storage file records have been iteratively acquired and the index file has been created, where the index file includes key-value pairs, each comprising the keyword of a storage file record and the offset of that keyword in the storage file.
The index file is built from the storage files and is a sequential file with an index. The index itself is very small: index keys and storage file offsets correspond one to one as key-value pairs, the index file can be searched by bisection, and loading the whole index file into memory takes very little space. The index file mainly records the addresses of content in the storage files, so that data content can be retrieved by address and returned to the user. Different keys may map to the same index value in a many-to-one relationship; for example, a server model may be queried by either a resource ID or a serial number, where the server model corresponds to a value in the index file and the resource ID and serial number correspond to keys in the index file.
Therefore, after the retrieval keyword is received in step S101, the storage file corresponding to it can be determined by searching the index file. Since the retrieval keyword may be only part of a storage file's name, more than one storage file may be matched in the index file. The more specific the retrieval keyword, the more precise the set of storage files obtained.
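The index construction and lookup described above can be sketched as follows. This is a minimal illustration only, under assumptions not stated in the application: records are taken to be newline-terminated `key=value` lines, the index is held in memory as a sorted key list with a parallel list of `(start, size)` offsets, and the names `build_index`, `exact_lookup` and `fuzzy_lookup` are hypothetical.

```python
import bisect
import io


def build_index(storage):
    """Scan the storage file once and record (start, size) for every record key."""
    entries = []
    start = 0
    storage.seek(0)
    for line in storage:  # each line is one record (assumed format: key=value\n)
        key = line.split(b"=", 1)[0].decode("utf-8")
        entries.append((key, (start, len(line))))
        start += len(line)
    entries.sort()  # sorted keys allow the index to be searched by bisection
    keys = [key for key, _ in entries]
    offsets = [offset for _, offset in entries]
    return keys, offsets


def exact_lookup(keys, offsets, keyword):
    """Binary-search the sorted keys; return the offset or None if absent."""
    i = bisect.bisect_left(keys, keyword)
    if i < len(keys) and keys[i] == keyword:
        return offsets[i]
    return None


def fuzzy_lookup(keys, offsets, keyword):
    """Return every (key, offset) whose key contains the keyword."""
    return [(k, o) for k, o in zip(keys, offsets) if keyword in k]


# Hypothetical storage content used only for illustration.
storage = io.BytesIO(b"server-001=modelA\nserver-002=modelB\nswitch-001=modelC\n")
keys, offsets = build_index(storage)
```

A partial keyword such as `server` matches two records through `fuzzy_lookup`, which mirrors the remark that a keyword covering only part of a name can yield several storage files.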
The offset can be set in two modes: a file start position offset and a file end position offset. The file start position offset is an offset counted downward from the head of the file, and contains at least the file start position and the file size; the file end position offset is an offset counted upward from the end of the file, and contains at least the file end position and the file size. With either mode, the start position, file size and end position of the storage file can be obtained. The offset is the number of bytes between a record in the storage file and the beginning or end of the file; to obtain a given piece of information in the storage file, it suffices to read the offset of that information from the index file to determine its position. Of course, those skilled in the art may also set the offset in other ways, for example by using the offset of the file's current read-write position to indicate the position, which also falls within the protection scope of the present application.
S103: and reading the storage position of the corresponding storage file by using the offset, and responding to the retrieval request according to the storage position.
Whether the offset is the file starting position offset or the file ending position offset, the storage position of the corresponding storage file can be directly determined, and therefore the retrieval request can be directly responded according to the storage position.
Specifically, the offset may be used to determine a start variable and an end variable, and then the storage location of the corresponding storage file may be read according to the start variable and the end variable. The start variable and the end variable refer to the start position and the end position of the storage file, that is, the storage position of the storage file is uniquely determined after the start position and the end position are defined. At this time, the file can be read by using the IO stream, the read pointer of the storage file points to the start variable, and the data is read backward from the start variable until the end variable is read, so as to obtain the storage location of the corresponding storage file.
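As a rough sketch of the read described above, the following assumes the offset is a `(start, size)` pair and uses an in-memory `io.BytesIO` object as a stand-in for the storage file's IO stream; the function names are hypothetical.

```python
import io


def start_end_from_offset(offset):
    """Derive the start and end variables from a (start, size) offset."""
    start, size = offset
    return start, start + size


def read_between(storage, start, end):
    """Point the read pointer at start and read forward until end is reached."""
    storage.seek(start)
    return storage.read(end - start)


# Hypothetical storage content used only for illustration.
storage = io.BytesIO(b"server-001=modelA\nserver-002=modelB\n")
start, end = start_end_from_offset((18, 18))
record = read_between(storage, start, end)
```

Only the bytes between the two variables are read, so the rest of the file is never traversed.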
Further, in this embodiment, since the offset can be set in several ways, after the offset of the storage file corresponding to the retrieval keyword is determined in the index file in step S102, the representation mode of the offset may be determined according to the size of the offset. Accordingly, step S103 may read the storage location of the corresponding storage file by using the file start position offset or the file end position offset. For example, if the storage file is located toward the front of the system, the file start position offset can be used, which locates the start position of the storage file. If the storage file is located toward the rear of the system, the file end position offset can be used: the offset is set to a negative value and reading starts from the rear end of the system, which can improve the efficiency of determining the storage location.
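The choice between the two representation modes might look like the sketch below. The halfway threshold used to decide between front and rear is an assumption made for illustration, since the application does not specify one, and `choose_representation` and `read_with_representation` are hypothetical names. The negative offset relies on Python's standard `seek(offset, os.SEEK_END)`.

```python
import io
import os


def choose_representation(start, size, total_size):
    """Use a start-relative offset in the first half of the file, else an
    end-relative (negative) offset. The halfway threshold is an assumption."""
    if start < total_size // 2:
        return ("start", start, size)
    return ("end", start - total_size, size)  # negative distance from the end


def read_with_representation(storage, representation):
    """Seek from the head or from the tail, depending on the representation."""
    mode, offset, size = representation
    whence = os.SEEK_SET if mode == "start" else os.SEEK_END
    storage.seek(offset, whence)
    return storage.read(size)


# Hypothetical 12-byte storage file holding three 4-byte records.
storage = io.BytesIO(b"AAAABBBBCCCC")
front = choose_representation(0, 4, 12)
rear = choose_representation(8, 4, 12)
```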
According to the embodiment of the application, an index file is configured in the file system, and the offset of each storage file is recorded in the index file; each offset includes the storage location of the corresponding storage file. The storage file corresponding to an input retrieval keyword can thus be determined, its offset looked up in the index file, and the file to be retrieved located quickly. This improves the efficiency of file operations and the speed of file reads and writes, enabling fast retrieval and modification of files.
Based on the foregoing embodiment, as a preferred embodiment, on the basis of the foregoing file retrieval, a file modification may also be performed on the retrieved storage file, and a specific process may be as follows:
S201: receiving a file modification request;
S202: determining a modification pointer corresponding to the file modification request according to the offset;
The modification pointer points to the position targeted by the file modification request, indicating where the modification is to be made.
S203: and modifying the storage file corresponding to the modification pointer.
In this step, the storage file corresponding to the modification pointer may be deleted first, the modification file may be stored, and the key value pair corresponding to the modification file may be added to the index file.
Specifically, the part to be modified may first be deleted using the deletion method of a file class; the byte size of the modified part is then determined, and the subsequent stored content is moved forward or backward. During this movement, the offsets of the subsequent stored content are obtained directly from the index file in a loop, and the position to be modified in the storage file is expanded into a space equal to the byte size of the new content. Finally, the new content is inserted into the corresponding storage file, that is, at the position the modification pointer points to, so that the file read-write pointer and permissions are released quickly and resource waste is reduced.
Alternatively, after the modified part is deleted, the files stored after it need not be shifted. Instead, the modified content can be added directly at any position in the storage system, and the key-value pair corresponding to the modified content is added to the index file. Because the modified content corresponds to the same keyword as the original storage file, that is, it belongs to the same storage file name, a user searching for the storage file can still obtain the corresponding offsets and thereby determine the position of the storage file. In this case, however, the same storage file name may correspond to multiple offsets: the storage file consists of several parts, and the parts other than the original storage file are the modified parts produced by file modification. This approach suits the case where the modified content is larger than the deleted part, because forcing the larger content into the original position would require all storage files at later positions to be moved, shifting a large number of storage files and changing the offsets recorded in the system's index file.
Take the file start position offset as an example, which needs to include the start position and the file size. Let the offset of the original storage file be (100,200), meaning that the original storage file starts at position 100 and has a data size of 200, i.e. it occupies storage locations 100 to 300. Suppose a file modification request is then received: the content in storage locations 120 to 150 needs to be modified, and the modified content occupies 50 units of storage. The content of storage locations 120 to 150 is deleted first, and the modified content is added at storage locations 400 to 450. After this modification, a search for the storage file should return the offsets (100,119) + (151,300) + (400,450), written here as pairs of start and end positions.
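The bookkeeping in this worked example can be reproduced with a short sketch. It assumes each part is tracked as a pair of start and end positions, following the numbers in the example, and that the relocated content is simply appended to the list of parts; `split_for_modification` is a hypothetical helper, not a function named in the application.

```python
def split_for_modification(segments, mod_start, mod_end, new_start, new_size):
    """Cut [mod_start, mod_end] out of the segment list and append the
    relocated content as a new (start, end) segment."""
    result = []
    for start, end in segments:
        if end < mod_start or start > mod_end:
            result.append((start, end))  # segment is untouched by the modification
            continue
        if start < mod_start:
            result.append((start, mod_start - 1))  # part kept before the cut
        if end > mod_end:
            result.append((mod_end + 1, end))  # part kept after the cut
    result.append((new_start, new_start + new_size))  # relocated modified content
    return result


# Original file occupies 100..300; locations 120..150 are replaced by
# 50 units of new content stored at 400..450.
updated = split_for_modification([(100, 300)], 120, 150, 400, 50)
```

The later content at 151 to 300 never moves, which is the point of this approach: only the index entry for the file changes.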
The file modification process provided by this embodiment is implemented based on the file retrieval method provided by the above embodiment, and the storage location of the file to be modified can be quickly located by means of the offset, so as to quickly release the file read-write pointer and the file occupation permission, reduce the resource occupancy rate of the storage system, and effectively improve the resource utilization rate of the storage system.
Of course, on the basis of the modification process, if the storage file is modified for multiple times, the storage file may include multiple sets of offsets, and it may be set that, when the number of sets of offsets of the storage file exceeds a preset value, the sets of offsets of the storage file are merged. The preset value is not particularly limited and may be 5 or 10, and the like.
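One possible reading of this merging step is a compaction: once a file has more offset groups than the preset value, its parts are read in order and rewritten contiguously, leaving a single offset. The sketch below is an assumption about how that could be done, not the application's stated procedure; it takes the groups as `(start, size)` pairs already in logical order, appends the merged copy at the end of the storage, and uses the hypothetical name `merge_offset_groups`.

```python
import io

PRESET_MAX_GROUPS = 5  # the preset value; 5 is one of the examples given


def merge_offset_groups(storage, groups, max_groups=PRESET_MAX_GROUPS):
    """If the file has more offset groups than allowed, copy its parts
    contiguously to the end of the storage and return a single offset."""
    if len(groups) <= max_groups:
        return groups  # below the threshold: nothing to merge
    chunks = []
    for start, size in groups:  # groups are assumed to be in logical order
        storage.seek(start)
        chunks.append(storage.read(size))
    data = b"".join(chunks)
    new_start = storage.seek(0, io.SEEK_END)  # append at the end of the storage
    storage.write(data)
    return [(new_start, len(data))]


# Hypothetical storage: the file's parts are "aa", "bb", "cc"; "XX" and "YY"
# stand for unrelated content lying between them.
storage = io.BytesIO(b"aaXXbbYYcc")
groups = [(0, 2), (4, 2), (8, 2)]
merged = merge_offset_groups(storage, groups, max_groups=2)
```

In a real system the index entry for the file would then be replaced by the merged offset and the old regions reclaimed; both steps are omitted here.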
A file retrieval system provided by an embodiment of the present application is introduced below; the file retrieval system described below and the file retrieval method described above may be referred to correspondingly.
Referring to Fig. 2, Fig. 2 is a schematic structural diagram of a file retrieval system according to an embodiment of the present application, where the system includes:
a request receiving module 100, configured to receive a search request, where the search request includes a search keyword;
an offset determining module 200, configured to determine, in an index file, an offset of a storage file corresponding to the search keyword;
the file confirmation module 300 is configured to read a storage location corresponding to the stored file by using the offset, and respond to the retrieval request according to the storage location.
Based on the above embodiment, as a preferred embodiment, the system further includes:
the index module is used for iteratively acquiring storage file records and creating an index file, wherein the index file comprises key-value pairs, and each key-value pair comprises the keyword of a storage file record and the offset of that keyword in the storage file.
Based on the above embodiment, as a preferred embodiment, the file confirmation module 300 includes:
a first position determination unit for determining a start variable and an end variable using the offset amount;
and the second position determining unit is used for reading the storage position of the corresponding storage file according to the starting variable and the ending variable.
Based on the foregoing embodiment, as a preferred embodiment, the second position determining unit is specifically a unit configured to read a file by using an IO stream, point a read pointer of the storage file to the start variable, and read data backward from the start variable until the end variable is read, so as to obtain a storage position corresponding to the storage file.
Based on the above embodiment, as a preferred embodiment, the system further includes:
the file modification module is used for receiving a file modification request; determining a modification pointer corresponding to the file modification request according to the offset; and modifying the storage file corresponding to the modification pointer.
Based on the above embodiment, as a preferred embodiment, the file modification module includes:
the modification unit is used for deleting the storage file corresponding to the modification pointer; and storing a modified file, and adding a key value pair corresponding to the modified file to the index file.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed, may implement the steps provided by the above-described embodiments. The storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The application further provides an electronic device, which may include a memory and a processor, where the memory stores a computer program, and the processor may implement the steps provided by the foregoing embodiments when calling the computer program in the memory. Of course, the electronic device may also include various network interfaces, power supplies, and the like.
The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system provided by the embodiment, the description is relatively simple because the system corresponds to the method provided by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present application are explained herein using specific examples, which are provided only to help understand the method and the core idea of the present application. It should be noted that, for those skilled in the art, it is possible to make several improvements and modifications to the present application without departing from the principle of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A file retrieval method, comprising:
receiving a retrieval request, wherein the retrieval request comprises a retrieval keyword;
determining the offset of the storage file corresponding to the retrieval keyword in the index file;
and reading the storage position of the corresponding storage file by using the offset, and responding to the retrieval request according to the storage position.
2. The file retrieval method according to claim 1, wherein before determining the offset of the storage file corresponding to the retrieval keyword in the index file, the method further comprises:
iteratively acquiring storage file records and creating an index file, wherein the index file comprises key-value pairs, and each key-value pair comprises the keyword of a storage file record and the offset of that keyword in the storage file.
3. The file retrieval method according to claim 1, wherein after determining the offset of the storage file corresponding to the retrieval keyword in the index file, the method further comprises:
determining an offset representation mode according to the offset size; the representation mode comprises a file start position offset and a file end position offset;
correspondingly, reading the storage location of the corresponding storage file by using the offset includes:
and reading the storage position of the corresponding storage file by using the file starting position offset or the file ending position offset.
4. The file retrieval method of claim 1, wherein reading the storage location of the corresponding storage file using the offset comprises:
determining a start variable and an end variable using the offset;
and reading the storage position of the corresponding storage file according to the starting variable and the ending variable.
5. The file retrieval method of claim 4, wherein reading the storage location of the corresponding storage file according to the start variable and the end variable comprises:
reading the file by using the IO stream, pointing a reading pointer of the storage file to the start variable, and reading data backwards from the start variable until the end variable is read to obtain a storage position corresponding to the storage file.
6. The method of claim 1, further comprising, after responding to the retrieval request according to the storage location:
receiving a file modification request;
determining a modification pointer corresponding to the file modification request according to the offset;
and modifying the storage file corresponding to the modification pointer.
7. The file retrieval method of claim 6, wherein modifying the stored file corresponding to the modification pointer comprises:
deleting the storage file corresponding to the modification pointer;
and storing a modified file, and adding a key value pair corresponding to the modified file to the index file.
8. A file retrieval system, comprising:
the request receiving module is used for receiving a retrieval request, and the retrieval request comprises a retrieval keyword;
the offset determining module is used for determining the offset of the storage file corresponding to the retrieval keyword in the index file;
and the file confirmation module is used for reading the storage position of the corresponding storage file by using the offset and responding to the retrieval request according to the storage position.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the file retrieval method according to any one of claims 1 to 7.
10. An electronic device, comprising a memory in which a computer program is stored and a processor which, when calling the computer program in the memory, implements the steps of the file retrieval method according to any one of claims 1 to 7.
CN202011314505.7A 2020-11-20 2020-11-20 File retrieval method, system and related device Withdrawn CN112463734A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011314505.7A CN112463734A (en) 2020-11-20 2020-11-20 File retrieval method, system and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011314505.7A CN112463734A (en) 2020-11-20 2020-11-20 File retrieval method, system and related device

Publications (1)

Publication Number Publication Date
CN112463734A true CN112463734A (en) 2021-03-09

Family

ID=74798180

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011314505.7A Withdrawn CN112463734A (en) 2020-11-20 2020-11-20 File retrieval method, system and related device

Country Status (1)

Country Link
CN (1) CN112463734A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378119A (en) * 2021-06-25 2021-09-10 成都卫士通信息产业股份有限公司 Software authorization method, device, equipment and storage medium
CN115001894A (en) * 2022-05-25 2022-09-02 北京经纬恒润科技股份有限公司 Vehicle-mounted bus signal access method and device
CN115001894B (en) * 2022-05-25 2023-06-30 北京经纬恒润科技股份有限公司 Vehicle-mounted bus signal access method and device

Similar Documents

Publication Publication Date Title
EP4068070A1 (en) Data storage method and apparatus, and storage system
CN108932236B (en) File management method and device
CN101685468B (en) Content addressable storage systems and methods employing searchable blocks
US9405784B2 (en) Ordered index
KR100856245B1 (en) File system device and method for saving and seeking file thereof
CN105069048A (en) Small file storage method, query method and device
CN101464901B (en) Object search method in object storage device
CN110147204B (en) Metadata disk-dropping method, device and system and computer-readable storage medium
CN111309687A (en) Object storage small file processing method, device, equipment and storage medium
JP2005267600A5 (en)
US20080306949A1 (en) Inverted index processing
CN110888837B (en) Object storage small file merging method and device
CN105373541A (en) Processing method and system for data operation request of database
CN112463734A (en) File retrieval method, system and related device
CN111159130A (en) Small file merging method and electronic equipment
CN111651127A (en) Monitoring data storage method and device based on shingled magnetic recording disk
US8316008B1 (en) Fast file attribute search
CN112416879B (en) NTFS file system-based block-level data deduplication method
CN107741968B (en) Method, system and device for file retrieval and computer readable storage medium
CN111552438B (en) Method, device, server and storage medium for writing object
KR20130053152A (en) Method of file management based on tag and system of the same
CN111984650A (en) Storage method, system and related device of tree structure data
CN108241758B (en) Data query method and related equipment
JP2020160494A (en) Information processing apparatus, document management system and program
CN114416676A (en) Data processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210309