CN106708825B - Data file processing method and system - Google Patents
- Publication number: CN106708825B
- Application number: CN201510454768.0A
- Authority: CN (China)
- Prior art keywords: data, data file, shared memory, file, directory
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscapes: Information Retrieval, DB Structures and FS Structures Therefor (AREA)
Abstract
The invention discloses a data file processing method and system. The method comprises: obtaining a data file on disk, where the data file is stored in a shared-memory data structure; loading the data file into shared memory by memory mapping, and initializing the shared memory; recording data update information based on the loading process, where the data update information includes the file name of the data file and first time information, the first time information being the time at which the loading process performed the load; and reading the data file according to the data update information. By storing on disk, in a shared-memory data structure, a data file that would otherwise hold a plaintext data structure, and loading it into shared memory, the embodiments of the invention improve loading efficiency. They also support loading in one place and using in many places: through shared memory, multiple processes use a single copy of the shared-memory data, which greatly reduces additional memory usage.
Description
Technical field
The invention belongs to the field of communication technology, and in particular relates to a data file processing method and system.
Background
With the rapid development of Internet technology, many services need large amounts of data to support their decisions, for example query string parsing and error correction in search services, and recommendation services in personalized recommendation. In program terms, an online service has to load a large amount of data and perform many table lookups for every request it handles. The data must also be updated at some frequency to keep up with change. The update frequency may be real time (seconds), near real time (minutes), or scheduled (daily).
In the prior art, each service process is usually responsible for loading and updating its own data. Typically the process reads the data file and builds the in-memory data structure itself. To support updating without stopping the service, the usual practice is to start a separate thread that performs the data update: the old data is kept unchanged while the update runs, and is deleted once the new data has finished loading.
In studying and practicing the prior art, the inventors found that in-memory data in the prior art can only be used within a single process. If multiple processes need the same data, each must load it and update it separately, which causes additional memory usage. Moreover, the data is in a plaintext data structure, so loading takes a long time and loading efficiency is low.
Summary of the invention
The purpose of the present invention is to provide a data file processing method and system intended to improve the accuracy and recall of data file processing.
To solve the above technical problem, embodiments of the present invention provide the following technical solution:
A data file processing method, comprising:
Obtaining a data file on disk, where the data file is stored in a shared-memory data structure;
Loading the data file into the shared memory by memory mapping, and initializing the shared memory;
Recording data update information based on the loading process, where the data update information includes the file name of the data file and first time information, the first time information being the time at which the loading process performed the load;
Reading the data file according to the data update information.
To solve the above technical problem, embodiments of the present invention also provide the following technical solution:
A data file processing system, comprising:
A data management module, used to obtain a data file on disk, where the data file is stored in a shared-memory data structure; to load the data file into the shared memory by memory mapping and initialize the shared memory; and to record data update information based on the loading process, where the data update information includes the file name of the data file and first time information, the first time information being the time at which the loading process performed the load;
A data read module, used to read the data file according to the data update information.
Compared with the prior art, this embodiment first stores the data file on disk in a shared-memory data structure, loads the data file into the shared memory by memory mapping, and records update information, so that the data file is loaded. The data file is then updated and loaded according to the data update information, so that processes share the data in shared memory. By storing on disk, in a shared-memory data structure, a data file that would otherwise hold a plaintext data structure, and loading it into shared memory, the embodiments of the invention improve loading efficiency. They also support loading in one place and using in many places: through shared memory, multiple processes use a single copy of the shared-memory data, which greatly reduces additional memory usage.
Brief description of the drawings
The technical solution and other beneficial effects of the present invention will become apparent from the following detailed description of specific embodiments, taken together with the accompanying drawings.
Fig. 1a is a schematic diagram of a scenario of the data file processing method provided by an embodiment of the present invention;
Fig. 1b is a flow diagram of the data file processing method provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of the data file processing method provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the data structure of the data file provided by an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of the data file processing system provided by an embodiment of the present invention;
Fig. 5 is a structural schematic diagram of the server provided by an embodiment of the present invention.
Detailed description of the embodiments
Referring to the drawings, in which like reference numerals denote like components, the principles of the present invention are illustrated as implemented in a suitable computing environment. The following description is based on the illustrated specific embodiments of the invention and should not be taken as limiting other specific embodiments not detailed herein.
In the following description, specific embodiments of the invention are described with reference to steps and symbols of operations performed by one or more computers, unless stated otherwise. These steps and operations, which are at times referred to as being computer-executed, include the manipulation by the computer's processing unit of electronic signals that represent data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well understood by those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular properties defined by the data format. However, describing the principles of the invention in these terms is not meant as a limitation; those skilled in the art will appreciate that several of the steps and operations described below may also be implemented in hardware.
The principles of the invention are operational with many other general-purpose or special-purpose computing or communication environments or configurations. Well-known examples of computing systems, environments, and configurations suitable for the invention include, but are not limited to, handheld phones, personal computers, servers, multiprocessor systems, microcomputer-based systems, mainframe computers, and distributed computing environments that include any of the above systems or devices.
The term "module" as used herein may be regarded as a software object executing on the computing system. The different components, modules, engines, and services described herein may be regarded as objects implemented on the computing system. The apparatus and methods described herein are preferably implemented in software, but may of course also be implemented in hardware, all within the scope of the present invention.
Embodiments of the present invention provide a data file processing method and system.
Referring to Fig. 1a, which is a schematic diagram of a scenario of the data file processing system provided by an embodiment of the present invention, the data file processing system may be integrated in a device such as a server. The system may include a data management module, mainly used to obtain a data file on disk, where the data file is stored in a shared-memory data structure; the shared-memory data structure may include an array, a hash table, a double-array trie, and so on. The module then loads the data file into the shared memory by memory mapping and initializes the shared memory, and records data update information based on the loading process, where the data update information includes the file name of the data file and first time information, the first time information being the time at which the loading process performed the load. The data management module may of course also be used to update the data file, and so on.
In addition, the data file processing system may include a data read module, mainly used to update and load the data file according to the data update information, so that processes share the data in shared memory. The system may also include a peripheral system, also called an update detection module, mainly used to detect updates to the data file on disk, so that the data management module performs the update and load, and the data read module updates and loads according to the data update information.
Each is described in detail below.
First embodiment
This embodiment is described from the perspective of the data file processing system, which may be integrated in a device such as a server.
A data file processing method comprises: obtaining a data file on disk, where the data file is stored in a shared-memory data structure; loading the data file into shared memory by memory mapping, and initializing the shared memory; recording data update information based on the loading process, where the data update information includes the file name of the data file and first time information, the first time information being the time at which the loading process performed the load; and reading the data file according to the data update information.
Referring to Fig. 1b, Fig. 1b is a flow diagram of the data file processing method provided by the first embodiment of the present invention. The method comprises:
In step S101, a data file on disk is obtained, where the data file is stored in a shared-memory data structure.
It should be understood that, before a data file is loaded, the data file on disk may first be preprocessed. For example, the data file is stored on disk in a shared-memory data structure, where the shared-memory data structure includes but is not limited to an array, a hash table, or a double-array trie; no specific limitation is imposed here.
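As an illustration of storing a data file on disk in the same layout that is later used in memory, the following is a minimal sketch for the simplest case, an array. The byte layout (a 4-byte magic value, a 4-byte element count, then fixed-width values) and the names `write_array_file` and `read_element` are assumptions made for this example; the patent does not specify a format.

```python
import os
import struct
import tempfile

# Hypothetical on-disk layout (not taken from the patent):
#   4-byte magic, 4-byte element count, then `count` little-endian uint32 values.
HEADER = struct.Struct("<4sI")
MAGIC = b"ARR1"


def write_array_file(path, values):
    """Write values in the same byte layout a reader will use in memory."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(MAGIC, len(values)))
        f.write(struct.pack(f"<{len(values)}I", *values))


def read_element(buf, index):
    """Read one element directly from a bytes-like buffer by offset."""
    magic, count = HEADER.unpack_from(buf, 0)
    if magic != MAGIC or not 0 <= index < count:
        raise IndexError(index)
    return struct.unpack_from("<I", buf, HEADER.size + 4 * index)[0]


demo_path = os.path.join(tempfile.mkdtemp(), "array.dat")
write_array_file(demo_path, [10, 20, 30])
with open(demo_path, "rb") as f:
    raw = f.read()
third = read_element(raw, 2)
```

Because every element sits at a computable offset, a reader needs no parsing pass over the file, which is the property that makes direct memory mapping possible.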
In step S102, the data file is loaded into the shared memory by memory mapping, and the shared memory is initialized.
For example, the initialization here may refer to initializing the locks in the in-memory data. It mainly applies to in-memory data that is read and written at the same time (for example, when part of the data is modified); in that case locks (such as read-write locks or sequence locks) need to be designed into the in-memory data structure to guarantee correct access.
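The load-then-initialize step can be sketched as follows, assuming a sequence lock kept at the start of the mapped region. The layout (an 8-byte sequence counter followed by one 8-byte value) and the function names are hypothetical. Pure Python offers no memory barriers, so this only illustrates the protocol in a single thread; it is not a production lock.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: 8-byte sequence counter, then one 8-byte signed value.
SEQ = struct.Struct("<Q")
VAL = struct.Struct("<q")
SIZE = SEQ.size + VAL.size

path = os.path.join(tempfile.mkdtemp(), "shared.dat")
with open(path, "wb") as f:
    f.write(b"\xff" * SIZE)          # stale bytes, as if left over on disk

fd = os.open(path, os.O_RDWR)
shm = mmap.mmap(fd, SIZE)            # shared, read-write mapping by default
os.close(fd)                         # the mapping stays valid after close


def init_lock(m):
    """Initialization step: reset the sequence counter to an even value."""
    SEQ.pack_into(m, 0, 0)


def write_value(m, value):
    seq = SEQ.unpack_from(m, 0)[0]
    SEQ.pack_into(m, 0, seq + 1)     # odd: a write is in progress
    VAL.pack_into(m, SEQ.size, value)
    SEQ.pack_into(m, 0, seq + 2)     # even: the write has finished


def read_value(m, max_retries=1000):
    for _ in range(max_retries):
        before = SEQ.unpack_from(m, 0)[0]
        value = VAL.unpack_from(m, SEQ.size)[0]
        after = SEQ.unpack_from(m, 0)[0]
        if before == after and before % 2 == 0:
            return value             # no writer interfered with this read
    raise RuntimeError("writer never finished")


init_lock(shm)
write_value(shm, 42)
result = read_value(shm)
```

The initialization matters because the lock bytes come from the file: without resetting them, a reader could see a counter left in an arbitrary state by whatever wrote the file.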
Optionally, after the shared memory is initialized, the data file on disk may also be checked for updates in real time, for example as follows:
(1) When it is detected that there is a data file that needs updating, the data file that needs updating is copied to the update directory;
(2) The data file in the update directory is loaded into shared memory, and the shared memory is initialized;
(3) The data file in the data directory is moved to the backup directory, and the data file in the update directory is moved to the data directory.
After the data file has been loaded and updated, the mapping of the original data file may also be deleted.
Note that in embodiments of the invention the update directory, the data directory, and the backup directory are set up in advance, and the three directories are in the same file system. This ensures that the file system index node (inode) of a data file stays unchanged when the file is moved, so that the memory mapping relationship is preserved. The inode can be used to store basic information about files and directories, including time, file name, user, and group.
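The reason for keeping the directories in one file system can be checked with a short sketch. It assumes a POSIX system, where a rename within one file system keeps the inode and leaves existing mappings usable; the directory names and the file name "AA" are placeholders.

```python
import mmap
import os
import tempfile

root = tempfile.mkdtemp()            # one directory tree, hence one file system
update_dir = os.path.join(root, "update")
data_dir = os.path.join(root, "data")
os.mkdir(update_dir)
os.mkdir(data_dir)

src = os.path.join(update_dir, "AA")
with open(src, "wb") as f:
    f.write(b"payload-v2")

# Map the file while it still lives in the update directory.
with open(src, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

inode_before = os.stat(src).st_ino
dst = os.path.join(data_dir, "AA")
os.replace(src, dst)                 # rename within the same file system
inode_after = os.stat(dst).st_ino
still_readable = mapped[:]           # read through the mapping made before the move
```

If the directories were on different file systems, the move would become a copy followed by a delete, the new file would have a different inode, and a process mapping the new path would no longer share pages with the earlier mapping.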
Further optionally, the data file may be checked for updates by means such as a cyclic redundancy check code or a message digest algorithm (MD5, Message-Digest Algorithm 5); this is not described in detail here.
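A minimal sketch of checksum-based update detection, using the CRC-32 and MD5 functions from the Python standard library. The helper names and the sample byte strings are illustrative only; the patent does not state how the checksums are stored or compared.

```python
import hashlib
import zlib


def crc32_of(data):
    """CRC-32 of a bytes object as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def md5_of(data):
    """MD5 digest of a bytes object as a hex string."""
    return hashlib.md5(data).hexdigest()


def has_changed(old, new):
    """Treat the file as updated when either checksum differs."""
    return crc32_of(old) != crc32_of(new) or md5_of(old) != md5_of(new)


old_bytes = b"key1\tvalue1\n"
new_bytes = b"key1\tvalue2\n"
changed = has_changed(old_bytes, new_bytes)
unchanged = has_changed(old_bytes, old_bytes)
```

In practice the previous checksum would typically be read from a side file (such as the CRC file mentioned in the second embodiment) rather than recomputed from the old content.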
In step S103, data update information is recorded based on the loading process, where the data update information includes the file name of the data file and first time information, the first time information being the time at which the loading process performed the load.
By changing the data structure of the data file on disk, loading the data file into shared memory, and recording the related data update information, the data management module completes its part of the data loading process.
In step S104, the data file is read according to the data update information.
For example, after the data management module has recorded the data update information, the module that uses the data may read and check the data update information at a preset time interval. If it determines that the data update information indicates an update to the target data file, it loads the data file into the shared memory by memory mapping, and records the time at which the module that uses the data performed this mapped load as second time information.
It should be understood that, because the initialization has already been performed in step S102, the module that uses the data does not need to do any additional work. The operating system guarantees that the same data file is mapped to the same shared memory, so the module that uses the data also completes its data loading process.
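The statement that the same data file maps to the same shared memory can be illustrated with two independent shared mappings of one file. In this sketch both mappings live in a single process for simplicity; `manager_view` and `reader_view` are hypothetical stand-ins for the mappings held by two separate processes.

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shared.dat")
with open(path, "wb") as f:
    f.write(bytes(16))               # 16 zero bytes as initial content


def map_shared(p):
    """Create a shared read-write mapping of the whole file."""
    fd = os.open(p, os.O_RDWR)
    try:
        return mmap.mmap(fd, 0)
    finally:
        os.close(fd)


manager_view = map_shared(path)      # stands in for the data management module
reader_view = map_shared(path)       # stands in for a module that uses the data

manager_view[0:5] = b"hello"         # written through one mapping
seen_by_reader = reader_view[0:5]    # read back through the other mapping
```

Since both mappings are backed by the same file pages, the second mapper does not repeat the load or the initialization, which is where the memory saving across processes comes from.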
As can be seen from the above, the data file processing method provided in this embodiment first stores the data file on disk in a shared-memory data structure, loads the data file into the shared memory by memory mapping, and records update information, so that the data file is loaded. The data file is then updated and loaded according to the data update information, so that processes share the data in shared memory. By storing on disk, in a shared-memory data structure, a data file that would otherwise hold a plaintext data structure, and loading it into shared memory, the embodiments of the invention improve loading efficiency. They also support loading in one place and using in many places: through shared memory, multiple processes use a single copy of the shared-memory data, which greatly reduces additional memory usage.
Second embodiment
The method described in the first embodiment is explained in further detail below by way of example.
The data file processing system includes a data management module, a module that uses the data (the data read module), and an update detection module. First, the data management module maps the data file on disk into shared memory and records the data update information for that process. Second, the module that uses the data also maps the data file into shared memory according to the data update information. After the data has been loaded, the update detection module may further check the disk file for updates in real time, so that the data management module and the data read module update and load the data accordingly.
The data involved in update and loading include: the data file on disk, the shared memory data, and the recorded data update information, as well as the cyclic redundancy check (CRC, Cyclic Redundancy Check) file used to verify the file and the flag file that indicates an update.
This is described in detail below.
As shown in Fig. 2, the detailed flow of a data file processing method may be as follows:
In step S201, the update detection module checks the data file on disk for updates.
The update detection methods include, but are not limited to, CRC verification and md5sum verification. After the verification succeeds, the step "copy the data file that needs updating to the update directory" is triggered.
Note that the data file is stored on disk in a shared-memory data structure, and may also be called the disk file that stores the data. The system needs to normalize the data into a format that can be used directly in shared memory, that is, the on-disk storage form of the shared memory data.
For example, this may be as follows:
The data is stored on disk as a binary file and mapped into shared memory by memory mapping. Its content may be any shared-memory data structure, including but not limited to an array, a hash table, or a double-array trie. Taking a hash table as an example, the data structure may be as shown in Fig. 3; the hash table data structure may include:
(1) Header information HEADER, which stores the metadata of the hash table, such as the data type, the version number, the lock that controls multi-process access, and hash table statistics.
(2) Hash bucket BUCKET, whose content is an index into the NODE array.
(3) Node NODE, whose content includes the index of the next node, the key, and an index into the CHUNK array; for a hash table with variable-length values, it also includes the length of the value.
(4) Data block CHUNK. A fixed-length hash stores the value directly; a variable-length hash additionally stores an index pointing to the next CHUNK.
It should be understood that the hash table data structure is used here only as an example for analysis and does not limit the present invention.
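The HEADER / BUCKET / NODE / CHUNK layout can be sketched for the fixed-length case as a single contiguous buffer addressed only by indexes. The field widths, the CRC-32 bucket function, the 16-byte key limit, and the omission of the lock and statistics fields are all assumptions made for this example; Fig. 3 itself is not reproduced here.

```python
import struct
import zlib

HEADER = struct.Struct("<4sIII")   # magic, version, bucket count, node count
BUCKET = struct.Struct("<i")       # index of first NODE in the chain, -1 if empty
NODE = struct.Struct("<i16si")     # next NODE index, key (NUL-padded), CHUNK index
CHUNK = struct.Struct("<q")        # fixed-length value


def bucket_of(key, n_buckets):
    return zlib.crc32(key) % n_buckets


def build(items, n_buckets=4):
    """Serialize {bytes key (at most 16 bytes): int value} into one buffer."""
    heads = [-1] * n_buckets
    nodes, chunks = [], []
    for i, (key, value) in enumerate(items.items()):
        b = bucket_of(key, n_buckets)
        nodes.append(NODE.pack(heads[b], key, i))   # chain onto the old head
        heads[b] = i
        chunks.append(CHUNK.pack(value))
    return b"".join(
        [HEADER.pack(b"HASH", 1, n_buckets, len(nodes))]
        + [BUCKET.pack(h) for h in heads]
        + nodes
        + chunks
    )


def lookup(buf, key):
    """Follow BUCKET -> NODE chain -> CHUNK using offsets only."""
    _, _, n_buckets, n_nodes = HEADER.unpack_from(buf, 0)
    bucket_base = HEADER.size
    node_base = bucket_base + BUCKET.size * n_buckets
    chunk_base = node_base + NODE.size * n_nodes
    slot = bucket_base + BUCKET.size * bucket_of(key, n_buckets)
    (idx,) = BUCKET.unpack_from(buf, slot)
    while idx != -1:
        nxt, stored, chunk_idx = NODE.unpack_from(buf, node_base + NODE.size * idx)
        if stored.rstrip(b"\0") == key:
            return CHUNK.unpack_from(buf, chunk_base + CHUNK.size * chunk_idx)[0]
        idx = nxt
    return None


# Two buckets for three keys, so at least one chain holds more than one node.
table = build({b"apple": 1, b"banana": 2, b"cherry": 3}, n_buckets=2)
found = lookup(table, b"banana")
missing = lookup(table, b"durian")
```

Using indexes instead of pointers is what lets the same bytes be valid on disk and in any process's address space, since a mapped region may sit at a different virtual address in each process.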
In step S202, when it is determined that there is a data file that needs updating, the update detection module copies the data file that needs updating to the update directory.
At the same time, the update detection module may write the file name of the data file that needs updating into the flag file, so that the data management module can periodically check the flag file and read the file name from it.
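The flag file handshake might look like the following sketch. The one-name-per-line format and the functions `mark_updated` and `take_pending` are assumptions; the patent only says that the file name is written to a flag file and read back periodically. The sketch also ignores the race between reading and clearing the file, which a real system would have to handle.

```python
import os
import tempfile

flag_path = os.path.join(tempfile.mkdtemp(), "update.flag")


def mark_updated(filename):
    """Update detection side: append the name of a file that needs reloading."""
    with open(flag_path, "a", encoding="utf-8") as f:
        f.write(filename + "\n")


def take_pending():
    """Data management side: read the pending names, then clear the flag file."""
    if not os.path.exists(flag_path):
        return []
    with open(flag_path, "r+", encoding="utf-8") as f:
        names = [line.strip() for line in f if line.strip()]
        f.seek(0)
        f.truncate()
    return names


mark_updated("AA")
mark_updated("BB")
first_poll = take_pending()
second_poll = take_pending()
```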
In step S203, the data management module loads the data file in the update directory into shared memory and initializes the shared memory.
In step S204, the data management module moves the data file in the data directory to the backup directory.
In step S205, the data management module moves the data file in the update directory to the data directory.
It should be understood that steps S203 to S205 are one preferred way for the data management module to update a data file.
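Steps S203 to S205 can be sketched as a small rotation routine. The directory names, the helper functions, and the file name "AA" are placeholders; the initialization of locks in step S203 is left out; and a POSIX system is assumed, where a mapped file can be renamed.

```python
import mmap
import os
import tempfile

root = tempfile.mkdtemp()
dirs = {name: os.path.join(root, name) for name in ("update", "data", "backup")}
for d in dirs.values():
    os.mkdir(d)


def put(directory, name, payload):
    with open(os.path.join(dirs[directory], name), "wb") as f:
        f.write(payload)


def rotate(name):
    """S203: map the new file; S204: data -> backup; S205: update -> data."""
    new_path = os.path.join(dirs["update"], name)
    with open(new_path, "rb") as f:
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    live_path = os.path.join(dirs["data"], name)
    if os.path.exists(live_path):
        os.replace(live_path, os.path.join(dirs["backup"], name))
    os.replace(new_path, live_path)
    return mapping


put("data", "AA", b"old")
put("update", "AA", b"new")
current = rotate("AA")
```

Mapping before moving means the new content is already loaded by the time it becomes visible under the data directory, and the previous version remains available in the backup directory.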
Note that in embodiments of the invention the update directory, the data directory, and the backup directory are set up in advance, and the three directories are in the same file system. This ensures that the file system index node (inode) of a data file stays unchanged when the file is moved, so that the memory mapping relationship is preserved. The inode can be used to store basic information about files and directories, including time, file name, user, and group.
Further, the data management module may load the data file in the update directory into shared memory by memory mapping. Specifically, memory mapping means mapping a file to a region of memory. Win32 provides a function (CreateFileMapping) that lets an application map a file into a process. A memory-mapped file resembles virtual memory in some respects: it reserves a region of address space and commits physical storage to that region, except that the physical storage comes from a file that already exists on disk, and the file must be mapped before it can be operated on. When memory mapping is used to handle a file stored on disk, explicit I/O operations on the file are no longer needed, which makes memory mapping particularly valuable when processing files that hold large amounts of data.
In addition, when the data management module loads the data file in the update directory into shared memory, it is updating the "shared memory data". Shared memory data is data stored in shared memory that can be used by multiple processes together; the meaning of the data is interpreted by the processes. Examples include fixed-length or variable-length hash tables, general key-value storage structures, and trie trees. It is generally loaded by mapping the "data file" directly into memory and performing the necessary initialization.
In step S206, the data management module deletes the mapping of the data file that was in use before the update, and updates the data update information.
The data update information includes the file name of the data file and first time information; the first time information is the current time at which the data management module loads the data file into shared memory.
It should be understood that the data update information is stored in shared memory and records the update status of the "shared memory data". It generally takes the form of the data file's file name plus the update time.
For example, when the data management module determines that a data file has been updated, it maps the data file into shared memory and at the same time records the time of loading as "TT". Because the data management module periodically checks the flag file, it can read that the file name of the currently updated data file is "AA", and so it records the data update information as "AA+TT".
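One possible encoding of the "file name + update time" record is sketched below. The fixed 64-byte name field, the float timestamp, and the use of a `bytearray` as a stand-in for a shared memory region are assumptions for illustration; the patent does not define the record layout.

```python
import struct
import time

# Hypothetical record: 64-byte NUL-padded file name plus a float64 timestamp.
UPDATE_INFO = struct.Struct("<64sd")


def record_update(region, filename, loaded_at=None):
    """Data management side: store the file name and the load time."""
    ts = time.time() if loaded_at is None else loaded_at
    UPDATE_INFO.pack_into(region, 0, filename.encode("utf-8"), ts)


def read_update(region):
    """Reader side: recover (file name, first time information)."""
    raw_name, ts = UPDATE_INFO.unpack_from(region, 0)
    return raw_name.rstrip(b"\0").decode("utf-8"), ts


region = bytearray(UPDATE_INFO.size)   # stand-in for a shared memory segment
record_update(region, "AA", loaded_at=1700000000.0)
name, first_time = read_update(region)
```

A fixed-size record keeps the update information at a known offset, so a reader can poll it cheaply without parsing.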
In step S207, the data read module reads the data update information at a preset time interval.
In step S208, if the time in the first time information indicated by the data update information that was read is later than the time in the second time information, the data read module determines that the data file has been updated.
In step S209, the data read module maps the updated data file in the data directory into the shared memory and waits.
In step S210, when the waiting time exceeds a preset time threshold, the data read module deletes the mapping of the data file that was in use before the update.
It should be understood that steps S207 to S210 are one preferred way for the data read module to update and load a data file according to the data update information recorded by the data management module.
For example, the data read module periodically checks the data update information and decides whether the data needs updating by reading the time information in it. If the time in the first time information indicated by the data update information is later than the time in the second time information, the data read module determines that the data file has been updated. That is, if among the recorded time information the first time information, recorded when the data management module updated the data file, is "8:00", and the second time information, recorded when the data read module last loaded the data file, is "7:00", it can be determined that the data file has been updated, and the data read module needs to update and load the updated data file.
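The comparison of the two times can be written as a one-line check. The function name `needs_reload`, the calendar date, and the treatment of a reader that has never loaded anything (`None`) are assumptions added for the example.

```python
from datetime import datetime


def needs_reload(first_time, second_time):
    """True when the manager's load time is later than the reader's last load."""
    return second_time is None or first_time > second_time


# The date is arbitrary; only the 8:00 and 7:00 times come from the text.
manager_loaded_at = datetime(2015, 7, 29, 8, 0)
reader_loaded_at = datetime(2015, 7, 29, 7, 0)
stale = needs_reload(manager_loaded_at, reader_loaded_at)
up_to_date = needs_reload(manager_loaded_at, manager_loaded_at)
```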
The data read module may update and load the data file as follows:
For example, the updated data file in the data directory is mapped into shared memory. Two mappings can exist at this point, so the old data file and the new data file are both in shared memory. This is because the old data file may still be in use, so its memory mapping cannot be released immediately. After loading completes, the module waits for several seconds, for example 2 to 30 s, and then releases the mapping of the old data file, to ensure that the old data file is no longer being used.
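The "map the new file, wait, then release the old mapping" behavior can be sketched with a small reader class. The class name, the `grace_seconds` parameter, and the very short delay used in the demonstration are assumptions; the text above suggests waiting on the order of 2 to 30 s in practice.

```python
import mmap
import os
import tempfile
import time


def map_readonly(path):
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


class Reader:
    def __init__(self, path, grace_seconds):
        self.current = map_readonly(path)
        self.grace_seconds = grace_seconds
        self.retired = []            # (mapping, time it was replaced)

    def switch_to(self, path):
        """Map the new file first, then retire the old mapping without closing it."""
        new_mapping = map_readonly(path)
        self.retired.append((self.current, time.monotonic()))
        self.current = new_mapping

    def reap(self):
        """Close retired mappings whose grace period has elapsed."""
        now = time.monotonic()
        keep = []
        for mapping, retired_at in self.retired:
            if now - retired_at >= self.grace_seconds:
                mapping.close()
            else:
                keep.append((mapping, retired_at))
        self.retired = keep


d = tempfile.mkdtemp()
old_path, new_path = os.path.join(d, "old"), os.path.join(d, "new")
for p, payload in ((old_path, b"v1"), (new_path, b"v2")):
    with open(p, "wb") as f:
        f.write(payload)

reader = Reader(old_path, grace_seconds=0.05)
old_mapping = reader.current
reader.switch_to(new_path)
both_alive = (old_mapping[:], reader.current[:])   # both mappings coexist
time.sleep(0.1)
reader.reap()
```

A fixed delay is a simple substitute for tracking in-flight readers: it assumes that any lookup against the old mapping finishes well within the grace period.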
Note that each functional module (such as the data management module and the data read module) manages its own shared memory mapping; only after all functional modules have released their mapping of the old data file can the old data file be reclaimed by the operating system. In this processing system, the data management module is responsible for loading, initialization, and update writes. These operations are exclusive and need to be performed by a single process. The data read module can run as another process that reads the data.
It should be understood that this embodiment mainly analyzes the update process of the data management module and the data read module. For parts not described in detail in this embodiment, such as the data management module loading the data file into shared memory and initializing it, refer to the detailed description of the data file processing method in the first embodiment; they are not repeated here.
As can be seen from the above, in the data file processing method provided in this embodiment, the data management module first stores the data file on disk in a shared-memory data structure, loads the data file into the shared memory by memory mapping, and records update information, so that the data file is updated and loaded. The data read module then updates and loads the data file according to the data update information, so that processes share the data in shared memory. By storing on disk, in a shared-memory data structure, a data file that would otherwise hold a plaintext data structure, and loading it into shared memory, the embodiments of the invention improve loading efficiency. They also support loading in one place and using in many places: through shared memory, multiple processes use a single copy of the shared-memory data, which greatly reduces additional memory usage.
Third embodiment
To better implement the data file processing method provided by embodiments of the present invention, an embodiment of the invention also provides a system based on the above data file processing method. The terms have the same meanings as in the data file processing method above, and for specific implementation details refer to the description in the method embodiments.
Referring to Fig. 4, Fig. 4 is a structural schematic diagram of the data file processing system provided by an embodiment of the present invention. The system may include a data management module 401 and a data read module 402.
The data management module 401 is used to map the data file on disk into shared memory and initialize the shared memory, and to record data update information based on the loading process, so that the data file is loaded. This may be as follows:
The data management module 401 obtains the data file on disk, where the data file is stored in a shared-memory data structure; loads the data file into the shared memory by memory mapping and initializes the shared memory; and records data update information based on the loading process, where the data update information includes the file name of the data file and first time information, the first time information being the time at which the loading process performed the load.
For example, the initialization here may refer to initializing the locks in the in-memory data. It mainly applies to in-memory data that is read and written at the same time (for example, when part of the data is modified); in that case locks (such as read-write locks or sequence locks) need to be designed into the in-memory data structure to guarantee correct access.
The data read module 402 is used to read the data file according to the data update information. Specifically, the data read module 402 updates and loads the data file according to the data update information, so that processes share the data in shared memory.
For example, after the data management module has recorded the data update information, the data read module 402 may read and check the data update information at a preset time interval. If it determines that the data update information indicates an update to the target data file, it loads the data file into the shared memory by memory mapping, and records the time at which the data read module 402 performed this mapped load as second time information.
It should be understood that, because the data management module 401 has already performed the initialization, the module that uses the data does not need to do any additional work. The operating system guarantees that the same data file is mapped to the same shared memory, so the module that uses the data also completes its data loading process.
Optionally, the data management module 401 may also be used, before obtaining the data file on disk, to store the data file on disk in a shared-memory data structure, where the shared-memory data structure includes an array, a hash table, or a double-array trie.
It should be understood that, before a data file is loaded, the data file on disk may first be preprocessed. For example, the data file is stored on disk in a shared-memory data structure, where the shared-memory data structure includes but is not limited to an array, a hash table, or a double-array trie; no specific limitation is imposed here.
The data is stored on disk as a binary file and mapped into shared memory by memory mapping. Its content may be any shared-memory data structure, including but not limited to an array, a hash table, or a double-array trie. Taking a hash table as an example, the data structure may be as shown in Fig. 3; the hash table data structure may include:
(1) Header information HEADER, which stores the metadata of the hash table, such as the data type, the version number, the lock that controls multi-process access, and hash table statistics.
(2) Hash bucket BUCKET, whose content is an index into the NODE array.
(3) Node NODE, whose content includes the index of the next node, the key, and an index into the CHUNK array; for a variable-length hash table, it also includes the length of the value.
(4) Data block CHUNK. A fixed-length hash table stores the value directly; a variable-length hash table also stores an index to the next CHUNK.
It should be understood that the hash table data structure is used here only as an example for explanation and does not limit the present invention.
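To make the HEADER / BUCKET / NODE / CHUNK layout concrete, the following is a minimal sketch of a fixed-length hash table serialised into one flat buffer and queried in place. The field widths, the magic value `HTBL`, the use of `key % n_buckets` as the hash, and the restriction to non-negative integer keys are assumptions made for illustration; Fig. 3 is not reproduced here and the header lock and statistics fields are omitted.

```python
import struct

HEADER = struct.Struct("<4sIII")  # magic, version, bucket count, node count
BUCKET = struct.Struct("<i")      # index of the first NODE in the chain, -1 if empty
NODE = struct.Struct("<iQi")      # next NODE index, key, CHUNK index
CHUNK = struct.Struct("<q")       # fixed-length value stored directly


def build_table(items, n_buckets=8):
    """Serialise {int key: int value} into a single flat byte buffer."""
    buckets = [-1] * n_buckets
    nodes, chunks = [], []
    for key, value in items.items():
        slot = key % n_buckets
        # The new node points at the previous head of the chain.
        nodes.append((buckets[slot], key, len(chunks)))
        buckets[slot] = len(nodes) - 1
        chunks.append(value)
    out = bytearray(HEADER.pack(b"HTBL", 1, n_buckets, len(nodes)))
    for first_node in buckets:
        out += BUCKET.pack(first_node)
    for node in nodes:
        out += NODE.pack(*node)
    for value in chunks:
        out += CHUNK.pack(value)
    return bytes(out)


def lookup(buf, key):
    """Look a key up directly in the buffer (bytes or an mmap object)."""
    magic, _version, n_buckets, n_nodes = HEADER.unpack_from(buf, 0)
    if magic != b"HTBL":
        raise ValueError("not a hash table file")
    bucket_off = HEADER.size
    node_off = bucket_off + n_buckets * BUCKET.size
    chunk_off = node_off + n_nodes * NODE.size
    (idx,) = BUCKET.unpack_from(buf, bucket_off + (key % n_buckets) * BUCKET.size)
    while idx != -1:
        nxt, node_key, chunk_idx = NODE.unpack_from(buf, node_off + idx * NODE.size)
        if node_key == key:
            return CHUNK.unpack_from(buf, chunk_off + chunk_idx * CHUNK.size)[0]
        idx = nxt
    return None
```

Because every reference is an array index rather than a pointer, the same bytes are valid wherever they are mapped, which is what allows the file to be used as-is after mapping.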
Optionally, after the shared memory is initialized, update detection can also be performed on the data file on disk in real time, for example as follows:
As shown in Fig. 4, the system can also include an update detection module 403, which performs update detection on the data file on disk. The update detection methods include, but are not limited to, CRC verification and md5sum verification. When it is determined that a data file needs to be updated, the data file to be updated is copied into the update directory.
The data management module 401 can also be used to load the data file in the update directory into shared memory and initialize the shared memory; move the data file in the data directory to the backup directory; and move the data file in the update directory into the data directory.
After the data file has been loaded and updated, the mapping of the original data file can also be deleted.
For example, after moving the data file in the update directory into the data directory, the data management module 401 can also be used to delete the data file mapping from before the data file update. After deleting that mapping, the data management module 401 can also be used to update the data update information.
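The update sequence performed by the data management module can be sketched as below. The function names, the directory arguments and the shape of the returned record are hypothetical, and the sketch assumes a POSIX system, where a file can be renamed while it is mapped.

```python
import mmap
import os
import time


def map_file(path):
    """Map a non-empty file read-only; the mapping outlives the file object."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def apply_update(name, update_dir, data_dir, backup_dir, old_mapping=None):
    new_path = os.path.join(update_dir, name)
    live_path = os.path.join(data_dir, name)
    # 1. Load the new version from the update directory.
    new_mapping = map_file(new_path)
    # 2. Move the current file from the data directory to the backup directory.
    if os.path.exists(live_path):
        os.replace(live_path, os.path.join(backup_dir, name))
    # 3. Move the new file from the update directory into the data directory.
    os.replace(new_path, live_path)
    # 4. Delete the mapping of the pre-update file.
    if old_mapping is not None:
        old_mapping.close()
    # 5. Produce the refreshed data update information (file name, first time).
    info = {"file_name": name, "first_time": time.time()}
    return new_mapping, info
```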
It should be noted that, in the embodiments of the present invention, the update directory, the data directory and the backup directory are set up in advance, and the three directories are in the same file system. This guarantees that the file system index node (inode) of a data file remains unchanged when the file is moved, so that the memory mapping relationship is preserved. The index node stores basic information about files and directories, including time, file name, user and group.
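The reason for keeping the three directories in one file system can be checked with a short sketch: within a single file system a move is a rename, and the inode number does not change. The helper name is hypothetical, and the sketch assumes a POSIX-style file system that reports stable inode numbers.

```python
import os
import tempfile


def move_keeps_inode(src, dst):
    """Move src to dst and report whether the inode number survived the move."""
    before = os.stat(src)
    dst_dir = os.path.dirname(os.path.abspath(dst))
    if os.stat(dst_dir).st_dev != before.st_dev:
        # Across file systems a "move" is a copy plus delete, which creates
        # a new inode and would leave existing mappings on the old one.
        raise OSError("source and destination are on different file systems")
    os.rename(src, dst)
    return os.stat(dst).st_ino == before.st_ino
```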
Further optionally, update detection can be performed on the data file by means of a cyclic redundancy check (CRC) code or the MD5 message digest algorithm, which is not described in detail here.
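A checksum-based update check of this kind might look like the following sketch, which uses the standard `hashlib` and `zlib` modules. The function names and the idea of comparing a candidate file against the live file are assumptions; the patent does not state what the checksum is compared with.

```python
import hashlib
import zlib


def file_md5(path, block_size=1 << 20):
    """MD5 digest of a file, read in blocks so large files are not loaded whole."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def file_crc32(path, block_size=1 << 20):
    """CRC-32 of a file, accumulated block by block."""
    crc = 0
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            crc = zlib.crc32(block, crc)
    return crc


def needs_update(candidate_path, live_path, checksum=file_md5):
    """A candidate whose checksum differs from the live file is treated as an update."""
    return checksum(candidate_path) != checksum(live_path)
```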
Further, the data management module 401 can use memory mapping to load the data file in the update directory into shared memory. Specifically, memory mapping refers to mapping a file to a block of memory. Win32 provides a function (CreateFileMapping) that allows an application to map a file into a process. A memory-mapped file is somewhat similar to virtual memory: a memory-mapped file reserves a region of address space and commits physical storage to that region, except that the physical storage comes from a file that already exists on disk, and the file must be mapped before it can be operated on. When a file stored on disk is handled through memory mapping, explicit I/O operations on the file are no longer needed, so memory mapping plays an important role when processing files with large amounts of data.
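As a minimal sketch of this behaviour, the example below maps one file twice with Python's `mmap` module and shows that bytes written through one view are visible through the other without any explicit read call. The two views in one process stand in for the writer and reader processes of the embodiment; the function name and the `INIT` marker are hypothetical.

```python
import mmap


def initialise_and_observe(path):
    """Map one file twice; bytes written through one view appear in the other."""
    with open(path, "r+b") as writer_file, open(path, "rb") as reader_file:
        # ACCESS_WRITE gives a shared, write-through mapping of the file.
        writer = mmap.mmap(writer_file.fileno(), 0, access=mmap.ACCESS_WRITE)
        # ACCESS_READ maps the same file read-only, as a reading process would.
        reader = mmap.mmap(reader_file.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            writer[0:4] = b"INIT"       # initialisation performed once by the writer
            return bytes(reader[0:4])   # the reader sees it through its own view
        finally:
            reader.close()
            writer.close()
```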
In addition, when the data management module 401 loads the data file in the update directory into shared memory, it is in effect updating the "shared memory data". Shared memory data is data that is stored in shared memory and can be used jointly by multiple processes. The meaning of the data is interpreted by the processes; examples include fixed-length and variable-length hash tables, general key-value storage structures, and tries. It is generally loaded by mapping the "data file" directly into memory and performing the necessary initialization.
At this point, the data read module 402 can also be used to read the data update information at a preset time interval. If it determines that the data update information indicates update information for the target data file, it loads the data file into the shared memory by memory mapping and records the load time as the second time information.
Further, after reading the data update information, the data read module 402 is also used to determine that the data file has been updated if the time in the first time information indicated in the data update information it read is later than the time in the second time information; to map the updated data file in the data directory into the shared memory and wait; and, when the waiting time exceeds a preset time threshold, to delete the data file mapping from before the data file update.
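The reader-side logic (compare the first time with the second time, remap, then release the old mapping only after a waiting period) can be sketched as follows. The class name, the `grace_seconds` parameter, the injectable `now` clock and the dictionary form of the update record are assumptions made for illustration; the sketch also assumes POSIX semantics, where a mapping of a replaced file stays readable until it is closed.

```python
import mmap
import time


def map_file(path):
    """Map a non-empty file read-only."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


class DataReader:
    def __init__(self, path, grace_seconds=5.0, now=time.time):
        self.path = path
        self.grace_seconds = grace_seconds  # the preset time threshold
        self.now = now
        self.current = map_file(path)
        self.second_time = now()            # time this reader last mapped the file
        self.retired = []                   # (old mapping, time it was replaced)

    def poll(self, update_info):
        """Call once per polling interval with the latest update record."""
        if update_info["first_time"] > self.second_time:
            # The manager loaded a newer file after our last mapping: remap,
            # but keep the old mapping alive for requests still using it.
            self.retired.append((self.current, self.now()))
            self.current = map_file(self.path)
            self.second_time = self.now()
        still_waiting = []
        for mapping, replaced_at in self.retired:
            if self.now() - replaced_at > self.grace_seconds:
                mapping.close()             # waiting time exceeded: unmap old data
            else:
                still_waiting.append((mapping, replaced_at))
        self.retired = still_waiting
```

Deferring the unmap gives in-flight lookups time to finish on the old data before it disappears from the process.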
It should be noted here that each functional module (such as the data management module 401 and the data read module 402) manages its own shared memory, and the old data file is reclaimed by the operating system only after all functional modules have released their mappings of the old data file. In this processing system, the data management module is responsible for loading, initialization and update writes. These operations are all exclusive and need to be performed by a single process. The data read module can run as another process that reads the data.
In specific implementations, the above modules can each be implemented as an independent entity, or can be combined arbitrarily and implemented as the same entity or several entities; see, for example, the second embodiment. For the specific implementation of the above modules, refer to the preceding method embodiments, which are not repeated here.
As can be seen from the above, the data file processing system provided in this embodiment first stores the data file on disk in a shared-memory data structure, loads the data file into the shared memory by memory mapping, and records the update information, so as to load the data file; it then updates and loads the data file according to the data update information, so that processes can jointly use the data in the shared memory. The embodiment of the present invention stores the data file, which has a plaintext data structure on disk, in a shared-memory data structure and loads it into shared memory, which improves loading efficiency. It supports loading in one place and use in many places: through shared memory, multiple processes use a single copy of the shared memory data, which greatly reduces additional memory usage.
Fourth embodiment
The embodiment of the present invention also provides a server into which the data file processing system of the embodiment of the present invention can be integrated. Fig. 5 shows a schematic structural diagram of the server involved in the embodiment of the present invention. Specifically:
The server may include a processor 501 with one or more processing cores, a memory 502 comprising one or more computer-readable storage media, a radio frequency (RF) circuit 503, a power supply 504, an input unit 505, a display unit 506 and other components. Those skilled in the art will understand that the server structure shown in Fig. 5 does not limit the server, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components. Among them:
The processor 501 is the control center of the server. It connects the various parts of the entire server through various interfaces and lines, and performs the server's functions and processes data by running or executing the software programs and/or modules stored in the memory 502 and calling the data stored in the memory 502, thereby monitoring the server as a whole. Optionally, the processor 501 may include one or more processing cores. Preferably, the processor 501 can integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs and so on, and the modem processor mainly handles wireless communication. It can be understood that the modem processor need not be integrated into the processor 501.
The memory 502 can be used to store software programs and modules, and the processor 501 executes various functional applications and data processing by running the software programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area can store data created through use of the server. In addition, the memory 502 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Correspondingly, the memory 502 can also include a memory controller to provide the processor 501 with access to the memory 502.
The RF circuit 503 can be used to receive and send signals while information is being sent and received. In particular, after downlink information from a base station is received, it is handed to one or more processors 501 for processing; in addition, uplink data is sent to the base station. Generally, the RF circuit 503 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer and so on. In addition, the RF circuit 503 can also communicate with networks and other devices through wireless communication. The wireless communication can use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS) and so on.
The server also includes a power supply 504 (such as a battery) that powers the components. Preferably, the power supply can be logically connected to the processor 501 through a power management system, so that functions such as charging, discharging and power consumption management are handled through the power management system. The power supply 504 can also include any components such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
The server may also include an input unit 505, which can be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal input related to user settings and function control.
The server may also include a display unit 506, which can be used to display information entered by the user or information provided to the user, as well as the various graphical user interfaces of the server. These graphical user interfaces can be composed of graphics, text, icons, video and any combination thereof. The display unit 506 may include a display panel, which can optionally be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
Specifically, in this embodiment, the processor 501 in the server loads the executable files corresponding to the processes of one or more application programs into the memory 502 according to the following instructions, and the processor 501 runs the application programs stored in the memory 502 to implement various functions, as follows:
Obtain the data file on disk, where the data file is stored in a shared-memory data structure;
Load the data file into the shared memory by memory mapping, and initialize the shared memory;
Record data update information based on the loading process, where the data update information includes the file name of the data file and first time information, the first time information being the time at which the loading process performed the load;
Read the data file according to the data update information.
Preferably, the processor 501 can also be used to, before obtaining the data file on disk:
Store the data file on disk in a shared-memory data structure, where the shared-memory data structures include an array, a hash table, and a double-array trie.
Preferably, the processor 501 can also be used to, after loading the data file into the shared memory and initializing the shared memory:
Perform update detection on the data file on disk; when it is determined that a data file needs to be updated, copy the data file to be updated into the update directory; load the data file in the update directory into shared memory and initialize the shared memory; move the data file in the data directory to the backup directory; and move the data file in the update directory into the data directory, where the update directory, the data directory and the backup directory are in the same file system.
On this basis, the processor 501 can also be used to delete the data file mapping from before the data file update.
Preferably, the processor 501 can also be used to, after deleting the data file mapping from before the data file update:
Update the data update information; read the data update information at a preset time interval; and, if it is determined that the data update information indicates update information for the target data file, load the data file into the shared memory by memory mapping and record the load time as the second time information.
On this basis, the processor 501 can also be used to, after reading the data update information: determine that the data file has been updated if the time in the first time information indicated in the data update information that was read is later than the time in the second time information; map the updated data file in the data directory into the shared memory and wait; and, when the waiting time exceeds a preset time threshold, delete the data file mapping from before the data file update.
As can be seen from the above, the server provided in this embodiment first stores the data file on disk in a shared-memory data structure, loads the data file into the shared memory by memory mapping, and records the update information, so as to load the data file; it then updates and loads the data file according to the data update information, so that processes can jointly use the data in the shared memory. The embodiment of the present invention stores the data file, which has a plaintext data structure on disk, in a shared-memory data structure and loads it into shared memory, which improves loading efficiency. It supports loading in one place and use in many places: through shared memory, multiple processes use a single copy of the shared memory data, which greatly reduces additional memory usage.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in a given embodiment, refer to the detailed description of the data file processing method above, which is not repeated here.
The data file processing system provided in the embodiment of the present invention is, for example, a computer, a tablet computer, or a mobile phone with a touch function. The data file processing system and the data file processing method in the foregoing embodiments belong to the same concept, and any of the methods provided in the data file processing method embodiments can be run on the data file processing system. The specific implementation process is described in the data file processing method embodiments and is not repeated here.
It should be noted that, for the data file processing method of the present invention, a person of ordinary skill in the art can understand that all or part of the process of implementing the data file processing method described in the embodiments of the present invention can be completed by a computer program controlling the relevant hardware. The computer program can be stored in a computer-readable storage medium, for example in the memory of a terminal, and executed by at least one processor in the terminal; its execution may include the processes of the embodiments of the data file processing method. The storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
For the data file processing system of the embodiment of the present invention, the functional modules can be integrated in one processing chip, each module can exist physically on its own, or two or more modules can be integrated in one module. The integrated module can be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it can also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk or an optical disc.
The data file processing method and system provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is only intended to help in understanding the method of the invention and its core idea. At the same time, for those skilled in the art, there will be changes in the specific implementation and scope of application according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (7)
1. A data file processing method, characterized by comprising:
Obtaining a data file on disk, wherein the data file is stored in a shared-memory data structure;
Loading the data file into the shared memory by memory mapping, and initializing the shared memory; recording data update information based on the loading process, wherein the data update information includes the file name of the data file and first time information, the first time information being the time at which the loading process performed the load;
Reading the data file according to the data update information;
Performing update detection on the data file on disk; when it is determined that a data file needs to be updated, copying the data file to be updated into an update directory; loading the data file in the update directory into shared memory, and initializing the shared memory; moving the data file in a data directory to a backup directory; moving the data file in the update directory into the data directory, wherein the update directory, the data directory and the backup directory are in the same file system; deleting the data file mapping from before the data file update; updating the data update information;
Reading the data update information at a preset time interval; and, if it is determined that the data update information indicates update information for the target data file, loading the data file into the shared memory by memory mapping and recording the load time as second time information.
2. The data file processing method according to claim 1, characterized in that, before obtaining the data file on disk, the method further comprises:
Storing the data file on disk in a shared-memory data structure, wherein the shared-memory data structures include an array, a hash table, and a double-array trie.
3. The data file processing method according to claim 1, characterized in that, after reading the data update information, the method further comprises:
If the time in the first time information indicated in the data update information that was read is later than the time in the second time information, determining that the data file has been updated;
Mapping the updated data file in the data directory into the shared memory and waiting;
When the waiting time exceeds a preset time threshold, deleting the data file mapping from before the data file update.
4. A data file processing system, characterized by comprising:
A data management module, configured to obtain a data file on disk, wherein the data file is stored in a shared-memory data structure; load the data file into the shared memory by memory mapping, and initialize the shared memory; and record data update information based on the loading process, wherein the data update information includes the file name of the data file and first time information, the first time information being the time at which the loading process performed the load;
A data read module, configured to read the data file according to the data update information;
An update detection module, configured to perform update detection on the data file on disk, and, when it is determined that a data file needs to be updated, copy the data file to be updated into an update directory;
The data management module is further configured to load the data file in the update directory into shared memory, and initialize the shared memory; move the data file in a data directory to a backup directory; move the data file in the update directory into the data directory, wherein the update directory, the data directory and the backup directory are in the same file system; delete the data file mapping from before the data file update; and update the data update information;
The data read module is further configured to read the data update information at a preset time interval, and, if it is determined that the data update information indicates update information for the target data file, load the data file into the shared memory by memory mapping and record the load time as second time information.
5. The data file processing system according to claim 4, characterized in that the data management module is further configured to, before obtaining the data file on disk, store the data file on disk in a shared-memory data structure, wherein the shared-memory data structures include an array, a hash table, and a double-array trie.
6. The data file processing system according to claim 4, characterized in that the data read module is further configured to, after reading the data update information, determine that the data file has been updated if the time in the first time information indicated in the data update information that was read is later than the time in the second time information; map the updated data file in the data directory into the shared memory and wait; and, when the waiting time exceeds a preset time threshold, delete the data file mapping from before the data file update.
7. A computer-readable storage medium storing a computer program, wherein the computer program can be executed by a processor to implement the method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510454768.0A CN106708825B (en) | 2015-07-29 | 2015-07-29 | A kind of data file processing method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106708825A CN106708825A (en) | 2017-05-24 |
CN106708825B true CN106708825B (en) | 2019-09-27 |
Family
ID=58894947
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510454768.0A Active CN106708825B (en) | 2015-07-29 | 2015-07-29 | A kind of data file processing method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106708825B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107908798A (en) * | 2017-12-20 | 2018-04-13 | 浙江煮艺文化科技有限公司 | The processing method and system of a kind of data file |
CN108958732A (en) * | 2018-06-28 | 2018-12-07 | 上海恺英网络科技有限公司 | A kind of data load method and equipment based on PHP |
CN109359005B (en) * | 2018-09-14 | 2022-04-19 | 厦门天锐科技股份有限公司 | Cross-process data acquisition and processing method |
CN109542911B (en) * | 2018-12-03 | 2021-10-29 | 郑州云海信息技术有限公司 | Metadata organization method, system, equipment and computer readable storage medium |
CN110716939B (en) * | 2019-10-16 | 2023-05-09 | 深圳市网心科技有限公司 | Data management method, electronic device, system and medium |
CN111158611A (en) * | 2020-03-26 | 2020-05-15 | 长春师范大学 | New energy automobile controller memory management method |
CN113806593A (en) * | 2020-06-17 | 2021-12-17 | 新疆金风科技股份有限公司 | Communication abnormity detection method and device for wind power plant and plant controller |
CN111736973A (en) * | 2020-06-24 | 2020-10-02 | 北京奇艺世纪科技有限公司 | Service starting method, device, server and storage medium |
CN113110944A (en) * | 2021-03-31 | 2021-07-13 | 北京达佳互联信息技术有限公司 | Information searching method, device, server, readable storage medium and program product |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101082928A (en) * | 2007-06-25 | 2007-12-05 | 腾讯科技(深圳)有限公司 | Method for accessing database and data-base mapping system |
CN101296157A (en) * | 2007-04-26 | 2008-10-29 | 北京师范大学珠海分校 | Multi-network card coordination communication method |
CN101551808A (en) * | 2009-05-13 | 2009-10-07 | 山东中创软件商用中间件股份有限公司 | Technology supporting multi-process embedded tree-based databases |
CN101986649A (en) * | 2010-11-29 | 2011-03-16 | 深圳天源迪科信息技术股份有限公司 | Shared data center used in telecommunication industry billing system |
CN102890679A (en) * | 2011-07-20 | 2013-01-23 | 中兴通讯股份有限公司 | Method and system for processing data version |
Also Published As
Publication number | Publication date |
---|---|
CN106708825A (en) | 2017-05-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||