CN112905106A - Data processing method, device, system, equipment and storage medium - Google Patents

Data processing method, device, system, equipment and storage medium

Info

Publication number
CN112905106A
Authority
CN
China
Prior art keywords
file
data processing
current reading
reading pointer
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911228984.8A
Other languages
Chinese (zh)
Other versions
CN112905106B (en)
Inventor
陈亚川
符立佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Baishancloud Technology Co Ltd
Original Assignee
Guizhou Baishancloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Baishancloud Technology Co Ltd
Priority to CN201911228984.8A
Publication of CN112905106A
Application granted
Publication of CN112905106B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method, a device, equipment and a storage medium, wherein the method comprises the following steps: S1, generating an empty first file and storing the first file in a first storage area; S2, writing target data into the first file; S3, when a preset state is reached, renaming the written first file in a preset manner; S4, moving the renamed file to a second storage area, and then returning to step S1. When data is acquired with the method provided by the invention, even if network blockage or a process abnormality occurs, the file is only moved between storage areas, so the IO interface is not occupied, data loss caused by a high writing speed is avoided, and both the real-time performance and the integrity of data acquisition are ensured.

Description

Data processing method, device, system, equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data acquisition method, apparatus, device, and storage medium.
Background
Currently, log files are collected in one of two modes: batch collection and streaming collection. Batch collection pulls log file data periodically, and each collection runs strictly on a preset period. Although the integrity of the collected data is good, real-time performance is poor because every collection is bound to the collection period.
Streaming collection gathers data over a real-time connection by pulling it in real time: data is collected as soon as log file data is generated, so streaming collection offers good real-time performance. Existing streaming collection takes two forms: tracking a log file in real time, and receiving log files in real time on a listening port. In the first form, a log generation unit generates the log file and transmits it to an intermediate storage unit, and the acquisition unit obtains the log file from the intermediate storage unit. In the prior art, to keep a single log file from growing too large, the file is processed in a rotate manner, that is, the single log file is cut into a plurality of subfiles. When a cut log file is collected by tracking the log file in real time, the log file generated by the log generation unit must be copied and renamed, and the original log file is then emptied. When the log file is large, this occupies the IO performance of the storage unit. If the log is updated frequently, the next log file may be generated before the acquisition unit has finished acquiring the previous one, and because the storage capacity of the storage unit is limited, log files are easily lost.
In the second form, the acquisition unit obtains the log file from the log generation unit through a dedicated port. This form depends on the availability of the listening process: if the listening process becomes abnormal, logs are lost. Thus, by its nature, streaming collection easily loses data when a program or network abnormality occurs during collection, and it suffers from poor data integrity when used to collect log file data.
For example, if a program abnormality occurs during streaming collection, the acquisition unit does not transmit the acquired log file data downstream in time; after the abnormality is eliminated, the data newly acquired by the acquisition unit can overwrite the log file data that has not yet been transmitted downstream, causing data loss. For another example, when a network abnormality occurs during streaming collection, the acquisition unit cannot obtain the corresponding log file data in time; if the network recovers only after a long abnormality, the data obtained by the acquisition unit is incomplete, that is, data is lost.
However, some service scenarios require both the integrity and the real-time performance of the acquired data, and neither batch collection nor streaming collection in the prior art can readily meet this requirement.
Disclosure of Invention
In order to solve the technical problem, the invention provides a data processing method, a device, a system and a storage medium.
The invention provides a data processing method applied to a data generation process, the data processing method comprising the following steps:
S1, generating an empty first file and storing the first file in a first storage area;
S2, writing target data into the first file;
S3, when a preset state is reached, renaming the written first file in a preset manner;
S4, moving the renamed file to a second storage area, and then returning to step S1.
The data processing method also has the following characteristics: the first storage area and the second storage area belong to the same storage unit.
The data processing method also has the following characteristics: renaming the first file which is completely written according to a preset mode comprises the following steps:
renaming the first file which is completed to be written according to a time stamp format, wherein the time contained in the time stamp format is the time reaching the preset data volume.
The invention provides a data processing method applied to a data acquisition process, the data processing method comprising the following steps:
receiving an acquisition instruction, and acquiring the position information of a current reading pointer;
judging whether the file corresponding to the current reading pointer is the first file or not according to the position information of the current reading pointer and the identification of the first file, if so, waiting for a first preset time period, reading the regenerated first file,
if not, moving the current reading pointer to a file next to the file corresponding to the current reading pointer, and reading;
the first file is stored in the first storage area.
The data processing method also has the following characteristics: before moving the current reading pointer to a file next to the file corresponding to the current reading pointer, the method further includes:
judging whether a file next to the file corresponding to the current reading pointer exists or not, if so, moving the current reading pointer to the file next to the file corresponding to the current reading pointer, and reading;
if not, after waiting for the first preset time length, moving the current reading pointer to the first file, and reading.
The data processing method also has the following characteristics: the current reading pointer moves to a file next to the file corresponding to the current reading pointer, and the reading comprises:
judging whether the next file is a renamed file stored in a second storage area, if so, moving the current reading pointer to the next file, and reading;
if not, the next file is the first file, and the current reading pointer is moved to the first file and read.
The data processing method also has the following characteristics: the data processing method further comprises:
and after the reading of the file corresponding to the current reading pointer is finished, the identifier of the first file is obtained again.
The data processing method also has the following characteristics: arranging the file corresponding to the current reading pointer and the next file of the file corresponding to the current reading pointer according to a preset mode, wherein the arranging according to the preset mode comprises arranging according to a timestamp format; and/or,
the position information of the current reading pointer comprises the name of the file being read by the current reading pointer and the identifier of the file, and the name of the file is the name obtained by renaming the first file which is written in according to the preset mode.
The invention provides a data processing device applied to a data generation process, the data processing device comprising:
the generating module is used for generating a first file with an empty state and storing the first file in a first storage area;
the writing module is used for writing target data into the first file;
and the renaming module is used for renaming the first file which is written in according to a preset mode when the preset state is reached, and is used for moving the renamed file to the second storage area.
The data processing device also has the following characteristics: the data processing device further comprises a storage unit, and the first storage area and the second storage area belong to the same storage unit.
The data processing device also has the following characteristics: the renaming module is further configured to rename the first file that is completely written according to a timestamp format, where a time included in the timestamp format is a time when a preset state is reached.
The invention provides a data processing device applied to a data acquisition process, the data processing device comprising:
the positioning module is used for receiving the acquisition instruction and acquiring the position information of the current reading pointer;
a judging execution module, configured to judge whether the file corresponding to the current reading pointer is the first file according to the position information of the current reading pointer and the identifier of the first file, if so, wait for a first predetermined time period, and then read the regenerated first file,
if not, moving the current reading pointer to a file next to the file corresponding to the current reading pointer, and reading;
the first file is stored in the first storage area.
The data processing device also has the following characteristics: the judgment execution module is further configured to judge whether a file next to the file corresponding to the current reading pointer exists, and if so, the current reading pointer moves to and reads the file next to the file corresponding to the current reading pointer;
if not, after waiting for the first preset time length, moving the current reading pointer to the first file, and reading.
The data processing device also has the following characteristics: the judgment execution module is further configured to judge whether the next file is a renamed file stored in a second storage area, and if so, the current reading pointer moves to the next file and reads the file;
if not, the next file is the first file, and the current reading pointer is moved to the first file and read.
The data processing device also has the following characteristics: the obtaining module is further configured to obtain the identifier of the first file again after the file corresponding to the current reading pointer is read.
The data processing device also has the following characteristics: arranging the file corresponding to the current reading pointer and the next file of the file corresponding to the current reading pointer according to a preset mode, wherein the arranging according to the preset mode comprises arranging according to a timestamp format; and/or,
the position information of the current reading pointer comprises the name of the file being read by the current reading pointer and the identifier of the file, and the name of the file is the name obtained by renaming the first file which is written in according to the preset mode.
The invention provides a data processing system comprising a data processing apparatus as described above.
The transmission device provided by the present invention comprises: a transceiver, a memory, a processor;
the transceiver is used for receiving and transmitting messages;
the memory is used for storing instructions and data;
the processor is used for reading the instructions and data stored in the memory to execute the data processing method.
The invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a data processing method as described above.
The data processing method, device and system further improve the real-time performance of data acquisition while also improving the reliability of the data acquisition process. When data is acquired with this method, even if network blockage or a process abnormality occurs, the first file containing the target data is not copied and then deleted; it is only moved to a different storage area. The IO interface is therefore not occupied, data loss caused by a high writing speed is avoided, and both the real-time performance and the integrity of data acquisition are ensured.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of a data processing method provided by an exemplary embodiment of the present invention;
FIG. 2 is another flow chart of a data processing method provided by an exemplary embodiment of the present invention;
FIG. 3 is a first state diagram of an acquisition process provided by an exemplary embodiment of the present invention;
FIG. 4 is a second state diagram of an acquisition process provided by an exemplary embodiment of the present invention;
FIG. 5 is a third state diagram of an acquisition process provided by an exemplary embodiment of the present invention;
FIG. 6 is a fourth state diagram of an acquisition process provided by an exemplary embodiment of the present invention;
FIG. 7 is a flowchart of an acquisition process provided by an exemplary embodiment of the present invention;
FIG. 8 is a schematic view of a connection structure of a data acquisition apparatus according to an exemplary embodiment of the present invention;
FIG. 9 is a schematic view of another connection structure of the data acquisition device according to the exemplary embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
The application provides a data processing method applied to a data acquisition process; for example, website logs or server logs can be acquired so that websites or servers can be better monitored. The data processing method in the application further improves the real-time performance of data acquisition while also improving the reliability of the data acquisition process. When data is acquired with this method, even if network blockage or a process abnormality occurs, the first file containing the target data is not copied and then deleted; it is only moved to a different storage area. The IO interface is therefore not occupied, data loss caused by a high writing speed is avoided, and both the real-time performance and the integrity of data acquisition are ensured.
The data processing method in the application collects target data, such as log data, in real time, mainly includes a data generation process and a data collection process, and the data processing method in the two processes is described in detail below.
Fig. 1 shows an exemplary embodiment of a data processing method provided in the present application, which is applied in a data generation process. When the data processing method is applied to a process of processing a generated log file, the data processing method may be considered to be operated on a server, and in the operation process of the server, server log data is target data, and in the generation process of the target data, the data processing method includes:
S1, generating an empty first file and storing the first file in a first storage area;
S2, writing target data into the first file;
S3, when a preset state is reached, renaming the written first file in a preset manner;
S4, moving the renamed file to a second storage area, and returning to step S1;
the first storage area and the second storage area can be understood as file directories, and the first storage area and the second storage area can be different, that is, the file directory of the first file and the file directory of the renamed file can be different and are stored in different file directories; of course, the first storage area and the second storage area may also be the same area, for example, located under the same storage disk and located under the same file directory, and this application is only for convenience of description and is distinguished in terms of names.
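As a rough illustration, one S1 to S4 cycle could be sketched as follows. This is only a sketch under stated assumptions: the two storage areas are modelled as directories on the same filesystem, the preset state is simply "all supplied lines have been written", and the names `rotate_once` and `first.log` are hypothetical and do not come from the application.

```python
import os
import tempfile
import time


def rotate_once(first_dir, second_dir, lines, first_name="first.log"):
    """Run one S1-S4 cycle and return the path of the renamed file."""
    first_path = os.path.join(first_dir, first_name)  # hypothetical file name

    # S1: generate an empty first file in the first storage area.
    open(first_path, "w").close()

    # S2: write target data (here, log lines) into the first file.
    with open(first_path, "a") as f:
        for line in lines:
            f.write(line + "\n")

    # S3 and S4: rename by timestamp and move to the second storage area.
    # A single os.rename does both when the directories share a filesystem.
    stamp = time.strftime("%Y%m%d%H%M%S")
    renamed_path = os.path.join(second_dir, stamp)
    os.rename(first_path, renamed_path)
    return renamed_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        first_dir = os.path.join(root, "first")
        second_dir = os.path.join(root, "second")
        os.makedirs(first_dir)
        os.makedirs(second_dir)
        print(rotate_once(first_dir, second_dir, ["GET /index.html 200"]))
```

A real generator would call such a routine in a loop, which corresponds to returning to step S1 after step S4.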
In step S1, the first file is a file for recording the target data; for example, the first file may be an access log file. The server has a storage unit, which may be a region of the server's overall storage device or an external storage device such as a magnetic disk. When the server receives a generation instruction, step S1 is executed to generate an empty first file. Because the first file is generated repeatedly (as described in detail later), each generated first file has a unique identifier that distinguishes it from the others and ensures accurate and reliable data processing, and each generated first file is stored in the first storage area.
In step S2, when the user accesses the website to log in, or logs in a game, for example, the website and the game server automatically generate a log, and the generated log is automatically written into the first file as target data.
In step S3, the preset state may be the completion of a target data writing operation of a preset duration; for example, with a preset duration of 1 minute, the first file is renamed each time 1 minute of target data writing completes. The preset state may also be that the target data written into the first file reaches a preset data volume. Website logs and server logs are generated continuously, but the storage capacity of the first file is fixed; when the data volume of the target data (that is, the log data) written into the first file reaches the preset data volume, writing of the first file is complete. The first file is then renamed in the preset manner: its unique identifier is unchanged, and only its name changes. In a specific embodiment, the preset state is the completion of a target data writing operation of a preset duration, and the written first file is renamed according to a timestamp format. The time contained in the timestamp format is the time at which the preset state is reached, and may be accurate to the second; for example, the timestamp format may be yyyyMMddHHmm or yyyyMMddHHmmss, where y represents the year, M the month, d the day, H the hour, m the minute, and s the second. After renaming, the file name takes a form such as 201901010000 or 20190101000000. The minimum timing unit of the timestamp format can be determined by the preset duration: if the preset duration is 1 minute, the timestamp format is accurate to the minute, and if the preset duration is 5 seconds, it is accurate to the second.
When the preset state is that the target data written into the first file reaches the preset data volume, the time represented by the timestamp format is the moment at which the data volume of the target data reaches the preset data volume. Renaming according to the timestamp format is not the only option; a user-defined naming scheme can also be used. For example, during log data generation, first files can be renamed with Arabic numerals in ascending order following the time sequence in which the log data is generated, giving renamed file names such as 00001, 00002 and 00003. Any scheme is acceptable as long as it renames the first file according to a definite rule or order so that no two files share a name, which avoids problems during data processing.
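The two naming schemes described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the rule "whole-minute durations get minute precision, anything else gets second precision" is an assumption generalised from the 1-minute and 5-second examples in the text.

```python
from datetime import datetime, timedelta
from itertools import count, islice


def timestamp_name(moment, preset_duration):
    """Name a renamed file with minute or second precision."""
    # Assumption: a duration that is a whole number of minutes only needs
    # minute precision; any finer duration needs second precision.
    if preset_duration.total_seconds() % 60 == 0:
        return moment.strftime("%Y%m%d%H%M")
    return moment.strftime("%Y%m%d%H%M%S")


def sequence_names(width=5):
    """Yield zero-padded ascending names: 00001, 00002, 00003, ..."""
    for n in count(1):
        yield f"{n:0{width}d}"


if __name__ == "__main__":
    reached = datetime(2019, 1, 1, 0, 0, 0)
    print(timestamp_name(reached, timedelta(minutes=1)))
    print(timestamp_name(reached, timedelta(seconds=5)))
    print(list(islice(sequence_names(), 3)))
```

Both schemes produce fixed-width names, so a plain lexicographic sort of the names matches the order in which the files were generated.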
In step S4, to prevent the generated log data file from being lost when a program or process abnormality or a network abnormality occurs, the data processing method of the present application sets up a second storage area during data generation and stores the renamed file in it. In a specific implementation, the first storage area and the second storage area belong to the same storage unit, so after the first file has been written, it is renamed within a single storage device, such as a local disk, rather than transmitted between two devices. No IO interface is occupied, which avoids losing log data when log data is written into the first file quickly while the later data acquisition and reading process (described in detail later) is slow. It should further be noted that a designated directory may be set in the storage unit as the second storage area for storing renamed files, whereas no dedicated first storage area needs to be set for storing the first file: the first storage area can be understood as all areas of the storage unit other than the second storage area.
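One way to enforce the "same storage unit" condition before moving a file is sketched below. It assumes that "storage unit" can be approximated by the device number reported by `os.stat`, which is an interpretation rather than something the application states; the function names are hypothetical.

```python
import os
import tempfile


def same_storage_unit(dir_a, dir_b):
    """Return True when both directories report the same device number."""
    return os.stat(dir_a).st_dev == os.stat(dir_b).st_dev


def move_renamed_file(src_path, second_dir):
    """Move src_path into second_dir without copying its contents."""
    src_dir = os.path.dirname(os.path.abspath(src_path))
    if not same_storage_unit(src_dir, second_dir):
        # A cross-device move would need a copy, which is what the text avoids.
        raise ValueError("storage areas are on different devices")
    dst_path = os.path.join(second_dir, os.path.basename(src_path))
    os.rename(src_path, dst_path)
    return dst_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        second_dir = os.path.join(root, "second")
        os.makedirs(second_dir)
        src = os.path.join(root, "201901010000")
        with open(src, "w") as f:
            f.write("log line\n")
        print(move_renamed_file(src, second_dir))
```

Within one filesystem, a rename only updates directory entries, so its cost does not grow with the size of the log file, unlike the copy-and-empty approach described in the background.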
In the data processing method applied to the data generation process, a script is used to execute a rotate instruction so that the log data file generated over a period of time does not become too large. The rotate instruction is a log data file cutting instruction and effectively prevents a single file from becoming too large. While the rotate instruction runs, once the first file has been renamed and moved to the second storage area, the rotate instruction sends a regeneration signal, which triggers the process to recreate the first file; newly generated log data is then written into the new first file.
With the data processing method applied to the data generation process, even if the subsequent acquisition process becomes abnormal and acquires too slowly, or cannot run for a period of time because of network blockage, the rotate instruction still executes normally, and continuously generated log data is stored in renamed files in the second storage area of the storage unit and is not lost. Because the identifier and name of each file are unique, once the abnormality in the acquisition process is eliminated or the network is clear, the acquisition process can resume reading from the position at which it last exited, which ensures the reliability and integrity of the acquired data.
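Resuming from the last position implies that the acquisition process keeps its position somewhere durable. A minimal sketch of one possible approach follows; the JSON checkpoint file, its field names and the function names are assumptions made for illustration and are not described in the application.

```python
import json
import os
import tempfile


def save_position(checkpoint_path, file_name, file_id):
    """Persist the reading position (file name and unique identifier)."""
    tmp_path = checkpoint_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"name": file_name, "id": file_id}, f)
    # os.replace swaps the checkpoint in one step, so a crash never leaves
    # a half-written checkpoint behind.
    os.replace(tmp_path, checkpoint_path)


def load_position(checkpoint_path):
    """Return (file_name, file_id), or None if no checkpoint exists yet."""
    try:
        with open(checkpoint_path) as f:
            data = json.load(f)
    except FileNotFoundError:
        return None
    return data["name"], data["id"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        checkpoint = os.path.join(root, "position.json")
        save_position(checkpoint, "201901010000", 42)
        print(load_position(checkpoint))
```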
As shown in fig. 2, the present application provides a data processing method applied to a data acquisition process, where the data processing method includes:
receiving an acquisition instruction, and acquiring the position information of a current reading pointer;
judging, according to the position information of the current reading pointer and the identifier of the first file, whether the file corresponding to the current reading pointer is the first file; if so, waiting for a first predetermined time period and then reading the regenerated first file. The wait is needed because the reading speed is higher than the writing speed: waiting slows reading down until the reading speed and the writing speed are consistent.
If not, moving the current reading pointer to a file next to the file corresponding to the current reading pointer, and reading;
the first file is stored in the first storage area.
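The top-level branch of the acquisition steps above can be sketched as a small decision function. The `ReadPosition` type, the integer identifiers and the action strings are hypothetical names chosen for illustration; the application only specifies that the position information carries a file name and an identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPosition:
    """Position information of the current reading pointer."""
    name: str      # name of the file being read
    file_id: int   # unique identifier of that file


def next_action(position, first_file_id):
    """Decide what the acquisition process does after an acquisition instruction."""
    if position.file_id == first_file_id:
        # The pointer is on the first file: wait for the first predetermined
        # time period, then read the regenerated first file.
        return "wait_then_read_first_file"
    # Otherwise move the pointer to the next file and read it.
    return "advance_to_next_file"


if __name__ == "__main__":
    print(next_action(ReadPosition("first.log", 7), first_file_id=7))
    print(next_action(ReadPosition("201901010000", 7), first_file_id=9))
```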
In a specific embodiment, when the acquisition process becomes abnormal or the network is blocked, data acquisition may stop, and it continues after the acquisition process returns to normal or the network is clear. Because the duration of the process abnormality or network blockage cannot be determined in advance, log data keeps being generated while the acquisition process cannot acquire data: first files with unique identifiers are continually generated, and renamed files are continually stored in the second storage area. Depending on how long the process abnormality or network blockage lasts, one of the following four states can occur after the acquisition process recovers, as shown in FIGS. 3 to 6. During normal acquisition, log data is generated continuously and the acquisition process reads continuously, so the current reading pointer corresponds to the first file, as shown in FIG. 3. As shown in FIG. 4, if the network is blocked for a short time, the first file that was being read before the blockage has been renamed and stored in the second storage area and a new first file has been generated, but the previous, renamed file has not been read completely. As shown in FIG. 5, if the network is blocked for a long time, two first files are renamed and saved in the second storage area during the blockage, so the current reading pointer lags further behind. As shown in FIG. 6, if the blockage or process abnormality is very short, the previous first file has just been renamed and saved in the second storage area, and a new first file has not yet been generated.
Therefore, in order to avoid data loss and ensure the integrity of the acquired data, the data processing method in the application first determines the position of the current reading pointer after receiving the acquisition instruction. The position information of the current reading pointer comprises the name of the file currently being read and the unique identifier of that file, so that the file can be located accurately and it can further be determined whether it is the first file. Because each generated first file has a unique identifier, the identifier of the currently read file contained in the position information can be compared with the identifier of the first file. If the two identifiers are consistent, the first file is currently being read; if they do not match, the first file is not currently being read. If the first file is being read, the method waits for a first predetermined time period before beginning to read from the head of the first file, in order to ensure the integrity of the data. If the current reading pointer does not correspond to the first file, reading starts from the head of the file to which the current reading pointer corresponds.
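The identifier comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say what the unique identifier is, so the sketch assumes the file system inode number (which survives a rename on the same storage unit), and the names `ReadPosition`, `file_identifier` and `is_reading_first_file` are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPosition:
    """Position information of the current reading pointer."""
    file_name: str  # name of the file currently being read
    file_id: int    # unique identifier of that file (assumption: inode number)


def file_identifier(path: str) -> int:
    """Return an identifier that stays the same when the file is renamed."""
    return os.stat(path).st_ino


def is_reading_first_file(position: ReadPosition, first_file_path: str) -> bool:
    """Compare the pointer's identifier with that of the current first file."""
    try:
        return position.file_id == file_identifier(first_file_path)
    except FileNotFoundError:
        # The first file was just renamed and has not been regenerated yet.
        return False
```

Comparing identifiers rather than names matters here because every first file is created under the same name, so the name alone cannot tell a regenerated first file from the one the pointer was reading.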
Further, before moving the current reading pointer to the next file of the file corresponding to the current reading pointer, the method further includes:
judging whether a file next to the file corresponding to the current reading pointer exists or not, if so, moving the current reading pointer to the file next to the file corresponding to the current reading pointer, and reading;
if not, after waiting for the first preset time, moving the current reading pointer to the first file, and reading. In this case, the absence of a file next to the file corresponding to the current reading pointer indicates that the first file does not yet exist; waiting for the first predetermined time gives the first file time to be created.
If the file corresponding to the current reading pointer is not the first file, the state may be any of those shown in fig. 4 to 6, and it is then necessary to determine whether there is a file next to the file that has been read. When the renamed files are named according to the timestamp, the next file is the file generated after the file that has been read. That is, the file corresponding to the current reading pointer and the file next to it are arranged according to a preset mode, wherein arranging according to the preset mode includes arranging according to a timestamp format. If there is a next file, the current reading pointer is moved to that file and reading proceeds; this corresponds to the two cases shown in fig. 4 and 5, and telling the case of fig. 4 from that of fig. 5 requires one further determination (described in detail later). If the next file does not exist, corresponding to the situation in fig. 6, the current reading pointer is moved to the first file after waiting for the first predetermined time, and reading proceeds.
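As a minimal sketch of the timestamp ordering, the function below looks for the renamed file generated immediately after the one that has been read. It assumes, purely for illustration, that renamed files carry names such as `20191204T101500.log` whose lexicographic order equals their chronological order, and that `current_name` is itself such a name; the function name and the name format are not taken from the text.

```python
import os
from typing import Optional


def next_renamed_file(current_name: str, second_area: str) -> Optional[str]:
    """Return the oldest renamed file that is newer than current_name.

    Returns None when the second storage area holds no later file, in which
    case the caller falls back to the first file.
    """
    later = sorted(name for name in os.listdir(second_area) if name > current_name)
    return later[0] if later else None
```

Sorting by name only works because the assumed timestamp format is fixed-width and ordered from the most significant field to the least; a format such as day-month-year would need to be parsed before comparison.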
Further, if there is a next file (the cases in fig. 4 and 5), moving the current reading pointer to the file next to the file corresponding to the current reading pointer, and reading, includes:
judging whether the next file is the renamed file stored in the second storage area, if so, moving the current reading pointer to the next file, and reading;
if not, the next file is the first file, and the current reading pointer is moved to the first file and read.
If the next file is stored in the second storage area, corresponding to the case shown in fig. 5, at this time, the current read pointer is moved to the next file and read. If the next file is not stored in the second storage area, indicating that the next file is the first file, at this time, corresponding to the state shown in fig. 4, the current read pointer is moved to the first file and read.
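The three outcomes for a pointer that is not on the first file can be summarised as a small decision function. This is a sketch only: the enum, its member names and the two inputs are hypothetical, and the mapping to the figures follows the description above.

```python
from enum import Enum
from typing import Optional


class NextAction(Enum):
    READ_RENAMED = "read the next renamed file"              # state of fig. 5
    READ_FIRST = "read the first file"                       # state of fig. 4
    WAIT_THEN_READ_FIRST = "wait, then read the first file"  # state of fig. 6


def decide_next(renamed_successor: Optional[str], first_file_exists: bool) -> NextAction:
    """Pick the action once the file under the reading pointer has been read.

    renamed_successor: name of a later renamed file in the second storage
    area, or None if there is none.
    first_file_exists: whether a first file currently exists in the first
    storage area.
    """
    if renamed_successor is not None:
        return NextAction.READ_RENAMED
    if first_file_exists:
        return NextAction.READ_FIRST
    return NextAction.WAIT_THEN_READ_FIRST
```

For example, `decide_next(None, False)` yields `NextAction.WAIT_THEN_READ_FIRST`, the case in which the previous first file has just been renamed and its replacement has not been created yet.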
Here, it should be noted that both determinations, whether the file corresponding to the current reading pointer is the first file and whether the next file is the first file, are made using the unique identifier and the name of each file. In addition, each file is read starting from its head, and each read operation needs to judge whether the tail has been reached; corresponding symbols are arranged at the head and the tail of each file so that the current reading pointer can determine the position being read. When reading reaches the tail, the judgment process is triggered so that the next reading operation can proceed. Moreover, because the identifier of each regenerated first file is unique, the identifier of the first file is obtained again each time the file corresponding to the current reading pointer has been read, so that the first file and the renamed files can be distinguished accurately.
The data processing method in the present invention is described in detail below with reference to fig. 7. After the acquisition process starts, the position information of the current reading pointer is first obtained, and it is judged whether the current reading pointer has read to the tail; if so, the identifier of the first file is obtained again; if not, reading continues. After the identifier of the first file is obtained again, it is judged whether the file corresponding to the current reading pointer is the first file. If so, the reading speed at that moment is higher than the writing speed, so the reading speed is reduced to keep it consistent with the writing speed: the regenerated first file is read after waiting for a first preset time period. If not, it is judged whether a file next to the file corresponding to the current reading pointer exists; if not, the current reading pointer is moved to the first file after waiting for the first preset time period, and reading proceeds. If the next file exists, it is judged whether the next file is a renamed file stored in the second storage area; if so, the current reading pointer is moved to the next file and reading proceeds; if not, the next file is the first file, and the current reading pointer is moved to the first file and reading proceeds.
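The flow of fig. 7 from the moment the tail of a file is reached can be sketched as a single step function. Everything named here is an assumption made for illustration: the inode number stands in for the unique identifier, renamed files are assumed to have timestamp names that sort chronologically, and looking up the pointer's file in the second storage area by identifier (to learn its renamed name) is an addition of this sketch rather than something the text specifies.

```python
import os
import time
from typing import Callable, Optional


def first_file_id(first_file_path: str) -> Optional[int]:
    """Re-acquire the identifier of the first file; None if it does not exist."""
    try:
        return os.stat(first_file_path).st_ino
    except FileNotFoundError:
        return None


def next_path_after_tail(
    current_name: str,
    current_id: int,
    first_file_path: str,
    second_area: str,
    wait_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the path to read next once the tail of the current file is reached."""
    first_id = first_file_id(first_file_path)
    if first_id is not None and first_id == current_id:
        # The reader has caught up with the writer: wait, then read the first file.
        sleep(wait_seconds)
        return first_file_path

    # Assumption: if the file being read was renamed meanwhile, find its new
    # name through its identifier so that timestamp ordering can be applied.
    names = sorted(os.listdir(second_area))
    by_id = {os.stat(os.path.join(second_area, n)).st_ino: n for n in names}
    current = by_id.get(current_id, current_name)

    later = [n for n in names if n > current]
    if later:
        return os.path.join(second_area, later[0])  # next renamed file
    if first_id is None:
        sleep(wait_seconds)  # the first file is still being created
    return first_file_path
```

Passing `sleep` as a parameter keeps the function testable without real delays; a caller would normally leave the default.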
By using this data processing method in the data acquisition process, when the process or the network becomes abnormal and later recovers, reading can resume at the current reading pointer, that is, from the position where the abnormality occurred, so that real-time performance and accuracy are ensured.
The application also provides a data processing device, which is applied to the data generation process and used for realizing the data processing method shown in fig. 1. As shown in fig. 8, the data processing apparatus includes:
the generating module is used for generating a first file with an empty state and storing the first file in a first storage area;
the writing module is used for writing target data into the first file;
the renaming module is used for renaming the first file which is written in according to a preset mode when the preset state is reached, and moving the renamed file to a second storage area;
and the first storage area and the second storage area belong to the same storage unit.
Further, the renaming module is further configured to rename the first file that has been written according to a timestamp format, where the timestamp format includes a time when the preset state is reached.
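The generation side (generate, write, rename, move, then generate again) might look like the sketch below. It is not the applicant's implementation: the function name `rotate`, the `.log` suffix and the `YYYYMMDDTHHMMSS` name format are assumptions, and the sketch relies on both storage areas being directories on the same storage unit so that a single rename both renames and moves the file.

```python
import os
import time
from typing import Optional


def rotate(first_file_path: str, second_area: str, now: Optional[float] = None) -> str:
    """Rename the finished first file by timestamp and create a new empty one.

    Returns the path of the renamed file in the second storage area.
    """
    # Timestamp of the moment the preset state was reached (current time if
    # `now` is not given).
    stamp = time.strftime("%Y%m%dT%H%M%S", time.localtime(now))
    target = os.path.join(second_area, f"{stamp}.log")
    os.rename(first_file_path, target)   # rename and move in one step
    open(first_file_path, "x").close()   # generate a new, empty first file
    return target
```

With one-second resolution, two rotations within the same second would produce the same name; a real implementation would need a finer timestamp or a counter, which this sketch leaves out.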
The application also provides a data processing device, which is applied to the data acquisition process and used for realizing the data processing method shown in fig. 2 to 7. As shown in fig. 9, the data processing apparatus includes:
the positioning module is used for receiving the acquisition instruction and acquiring the position information of the current reading pointer;
a judging and executing module for judging whether the file corresponding to the current reading pointer is the first file or not according to the position information of the current reading pointer and the identification of the first file, if so, reading the regenerated first file after waiting for a first preset time length,
if not, moving the current reading pointer to a next file of the file corresponding to the current reading pointer, and reading;
the first file is stored in the first storage area.
Further, the judgment execution module is further configured to judge whether a file next to the file corresponding to the current reading pointer exists, and if so, the current reading pointer moves to the file next to the file corresponding to the current reading pointer and reads the file;
if not, after waiting for the first preset time length, moving the current reading pointer to the first file, and reading.
Further, the judgment execution module is further configured to judge whether the next file is the renamed file stored in the second storage area, and if so, the current reading pointer is moved to the next file and read;
if not, the next file is the first file, and the current reading pointer is moved to the first file and read.
Further, the obtaining module is further configured to obtain the identifier of the first file again after the file corresponding to the current reading pointer is read.
The invention also provides a data processing system comprising the data processing device to execute the data processing method.
The present invention also provides a transmission device, comprising: a transceiver, a memory, a processor; the transceiver is used for receiving and transmitting messages; the memory is used for storing instructions and data; the processor is used for reading the instructions and data stored in the memory so as to execute the data processing method.
The present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described data processing method.
The above-described aspects may be implemented individually or in various combinations, and such variations are within the scope of the present invention.
It will be understood by those skilled in the art that all or part of the steps of the above methods may be implemented by instructing the relevant hardware through a program, and the program may be stored in a computer readable storage medium, such as a read-only memory, a magnetic or optical disk, and the like. Alternatively, all or part of the steps of the foregoing embodiments may also be implemented by using one or more integrated circuits, and accordingly, each module/unit in the foregoing embodiments may be implemented in the form of hardware, and may also be implemented in the form of a software functional module. The present invention is not limited to any specific form of combination of hardware and software.
It is to be noted that, in this document, the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, so that an article or apparatus including a series of elements includes not only those elements but also other elements not explicitly listed or inherent to such article or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of additional like elements in the article or device comprising the element.
The above embodiments are merely to illustrate the technical solutions of the present invention and not to limit the present invention, and the present invention has been described in detail with reference to the preferred embodiments. It will be understood by those skilled in the art that various modifications and equivalent arrangements may be made without departing from the spirit and scope of the present invention and it should be understood that the present invention is to be covered by the appended claims.

Claims (19)

1. A data processing method is applied to a data generation process, and is characterized by comprising the following steps:
S1, generating a first file with an empty state and storing the first file in a first storage area;
S2, writing target data into the first file;
S3, renaming the first file which is written in according to a preset mode when the preset state is reached;
S4, the renamed file is moved to the second storage area, and then the process returns to the step S1.
2. The data processing method of claim 1, wherein the first storage area and the second storage area belong to a same storage unit.
3. The data processing method of claim 1, wherein the renaming the first file that completed writing in a predetermined manner comprises:
renaming the first written file according to a timestamp format, wherein the time contained in the timestamp format is the time reaching a preset state.
4. A data processing method is applied to a data acquisition process, and is characterized by comprising the following steps:
receiving an acquisition instruction, and acquiring the position information of a current reading pointer;
judging whether the file corresponding to the current reading pointer is the first file or not according to the position information of the current reading pointer and the identification of the first file, if so, waiting for a first preset time period, reading the regenerated first file,
if not, moving the current reading pointer to a file next to the file corresponding to the current reading pointer, and reading;
the first file is stored in the first storage area.
5. The data processing method of claim 4, wherein before moving the current read pointer to a file next to a file corresponding to the current read pointer, further comprising:
judging whether a file next to the file corresponding to the current reading pointer exists or not, if so, moving the current reading pointer to the file next to the file corresponding to the current reading pointer, and reading;
if not, after waiting for the first preset time, moving the current reading pointer to the first file, and reading.
6. The data processing method of claim 5, wherein the current read pointer is moved to a file next to the file corresponding to the current read pointer, and reading comprises:
judging whether the next file is a renamed file stored in a second storage area, if so, moving the current reading pointer to the next file, and reading;
if not, the next file is the first file, and the current reading pointer is moved to the first file and read.
7. The data processing method of any of claims 4 to 6, wherein the data processing method further comprises:
and after the reading of the file corresponding to the current reading pointer is finished, the identifier of the first file is obtained again.
8. The data processing method according to claim 5 or 6, wherein the file corresponding to the current reading pointer and a file next to the file corresponding to the current reading pointer are arranged according to a preset manner, wherein the arranging according to the preset manner includes arranging according to a timestamp format; and/or
the position information of the current reading pointer comprises the name of the file being read by the current reading pointer and the identifier of the file, and the name of the file is the name obtained by renaming the first file which is written in according to the preset mode.
9. A data processing apparatus applied to a data generation process, the data processing apparatus comprising:
the generating module is used for generating a first file with an empty state and storing the first file in a first storage area;
the writing module is used for writing target data into the first file;
and the renaming module is used for renaming the first file which is written in according to a preset mode when the preset state is reached, and is used for moving the renamed file to the second storage area.
10. The data processing apparatus according to claim 9, wherein the data processing apparatus further comprises a storage unit, and the first storage area and the second storage area belong to the same storage unit.
11. The data processing apparatus according to claim 9, wherein the renaming module is further configured to rename the first file that has completed writing according to a timestamp format, wherein the timestamp format includes a time when a preset state is reached.
12. A data processing device applied to a data acquisition process is characterized by comprising:
the positioning module is used for receiving the acquisition instruction and acquiring the position information of the current reading pointer;
a judging execution module, configured to judge whether the file corresponding to the current reading pointer is the first file according to the position information of the current reading pointer and the identifier of the first file, if so, wait for a first predetermined time period, and then read the regenerated first file,
if not, moving the current reading pointer to a file next to the file corresponding to the current reading pointer, and reading;
the first file is stored in the first storage area.
13. The data processing apparatus according to claim 12, wherein the determining executing module is further configured to determine whether a file next to the file corresponding to the current reading pointer exists, and if so, the current reading pointer moves to the file next to the file corresponding to the current reading pointer and reads the file;
if not, after waiting for the first preset time length, moving the current reading pointer to the first file, and reading.
14. The data processing apparatus according to claim 13, wherein the determining module is further configured to determine whether the next file is a renamed file stored in a second storage area, and if so, the current read pointer is moved to the next file and read;
if not, the next file is the first file, and the current reading pointer is moved to the first file and read.
15. The data processing apparatus according to any one of claims 12 to 14, wherein the obtaining module is further configured to obtain the identifier of the first file again after completing reading the file corresponding to the current reading pointer.
16. The data processing apparatus according to claims 13 to 14, wherein the file corresponding to the current reading pointer and a file next to the file corresponding to the current reading pointer are arranged according to a preset manner, wherein the arranging according to the preset manner includes arranging according to a timestamp format; and/or
the position information of the current reading pointer comprises the name of the file being read by the current reading pointer and the identifier of the file, and the name of the file is the name obtained by renaming the first file which is written in according to the preset mode.
17. A data processing system comprising a data processing apparatus as claimed in claims 9 to 11 and a data processing apparatus as claimed in claims 12 to 16.
18. A transmission device, characterized in that the transmission device comprises: a transceiver, a memory, a processor;
the transceiver is used for receiving and transmitting messages;
the memory is used for storing instructions and data;
the processor is used for reading the instructions and data stored in the memory to execute the data processing method of any one of claims 1 to 3 and 4 to 8.
19. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements a data processing method according to any one of claims 1 to 3 and 4 to 8.
CN201911228984.8A 2019-12-04 2019-12-04 Data processing method, device, system, equipment and storage medium Active CN112905106B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911228984.8A CN112905106B (en) 2019-12-04 2019-12-04 Data processing method, device, system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911228984.8A CN112905106B (en) 2019-12-04 2019-12-04 Data processing method, device, system, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112905106A true CN112905106A (en) 2021-06-04
CN112905106B CN112905106B (en) 2023-04-18

Family

ID=76111035

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911228984.8A Active CN112905106B (en) 2019-12-04 2019-12-04 Data processing method, device, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112905106B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7890469B1 (en) * 2002-12-30 2011-02-15 Symantec Operating Corporation File change log
CN102929733A (en) * 2012-10-18 2013-02-13 北京奇虎科技有限公司 Method and device for processing error files and client-side equipment
CN104714878A (en) * 2013-12-11 2015-06-17 阿里巴巴集团控股有限公司 Method and device for collecting log data
CN106095959A (en) * 2016-06-16 2016-11-09 北京中电普华信息技术有限公司 A kind of collecting method, Apparatus and system
CN107704478A (en) * 2017-01-16 2018-02-16 贵州白山云科技有限公司 A kind of method and system for writing daily record
CN106815363A (en) * 2017-01-24 2017-06-09 郑州云海信息技术有限公司 One kind rotates management method and device based on linux daily records
US10474656B1 (en) * 2017-02-21 2019-11-12 Nutanix, Inc. Repurposing log files
CN107526674A (en) * 2017-08-31 2017-12-29 郑州云海信息技术有限公司 A kind of method and apparatus of embedded system log recording
CN107609133A (en) * 2017-09-18 2018-01-19 郑州云海信息技术有限公司 Journal file dump method, device, equipment and its computer-readable recording medium
CN110162448A (en) * 2018-02-13 2019-08-23 北京京东尚科信息技术有限公司 The method and apparatus of log collection
CN109299052A (en) * 2018-09-03 2019-02-01 平安普惠企业管理有限公司 Log cutting method, device, computer equipment and storage medium
CN109960686A (en) * 2019-03-26 2019-07-02 北京百度网讯科技有限公司 The log processing method and device of database

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113630442A (en) * 2021-07-14 2021-11-09 远景智能国际私人投资有限公司 Data transmission method, device and system
CN113630442B (en) * 2021-07-14 2023-09-12 远景智能国际私人投资有限公司 Data transmission method, device and system
CN113609532A (en) * 2021-08-13 2021-11-05 阳光电源股份有限公司 Data integrity checking method and device, computer equipment and storage medium
CN113609532B (en) * 2021-08-13 2024-04-12 阳光电源股份有限公司 Data integrity checking method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN112905106B (en) 2023-04-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant