CN111694793A - Log storage method and device and log query method and device - Google Patents

Log storage method and device and log query method and device

Info

Publication number
CN111694793A
Authority
CN
China
Prior art keywords
log
node
service
log file
service node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010538309.1A
Other languages
Chinese (zh)
Inventor
赵宇
徐寅斐
侯雪峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd filed Critical Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN202010538309.1A priority Critical patent/CN111694793A/en
Publication of CN111694793A publication Critical patent/CN111694793A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/13File access structures, e.g. distributed indices
    • G06F16/134Distributed indices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/14Details of searching files based on file metadata
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/1734Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiment of the invention provides a log storage method and device and a log query method and device, wherein the log storage method is applied to service nodes in a distributed system, the distributed system also comprises log collection nodes, and the method comprises the following steps: generating a log file according to the service processing condition; and sending the log file to a corresponding log collection node according to a mapping relation between the service node and the log collection node which is established in advance, so that the log collection node stores the log file, and an index is established for the stored log file. Therefore, when log query is carried out, the log file is stored in the log collection node, so that the log query is carried out only in the log collection node, the processing of the service node on normal service is not influenced, and meanwhile, the log query efficiency is greatly improved as thousands of nodes in a distributed system do not need to be traversed.

Description

Log storage method and device and log query method and device
Technical Field
The invention relates to the technical field of log processing, in particular to a log storage method and device and a log query method and device.
Background
With the development of the big data industry, large-scale clustered distributed systems have become a basic building block of big data technology. A distributed system often consists of thousands of service devices, that is, thousands of service nodes, and the running logs of computer programs are scattered across these nodes in very large numbers. Because logs record the running condition of computer programs and can be analyzed to locate problems on service nodes, log query is very important for the operation and maintenance of large-scale clustered distributed systems.
At present, log query for a distributed system relies on batch operation tools such as the Ansible automated operation and maintenance tool, combined with Linux commands such as the grep (globally search for a regular expression and print) text search tool, to traverse all logs on all service nodes in the whole distributed system and find the required logs.
Because this log query mode scans and traverses every service node in the whole distributed system, the scanning process consumes a large amount of CPU (Central Processing Unit) and memory, which affects the processing of normal services. The log query efficiency is also very low, and a query often takes hours.
Disclosure of Invention
The embodiment of the invention aims to provide a log storage method and device and a log query method and device, so as to avoid influencing the processing of normal services and improve the log query efficiency. The specific technical scheme is as follows:
In a first aspect, an embodiment of the present invention provides a log storage method, where the method is applied to a service node in a distributed system, where the distributed system further includes a log collection node, and the method includes:
generating a log file according to the service processing condition;
and sending the log file to a corresponding log collection node according to a mapping relation between the service node and the log collection node which is established in advance, so that the log collection node stores the log file, and an index is established for the stored log file.
Optionally, the step of generating a log file according to the service processing condition includes:
determining log content and a log name according to the service processing condition;
and generating a log file of the service node with a key-value pair structure by taking the log name and the node identifier of the service node as keys and taking the log content as a value.
Optionally, the establishing manner of the mapping relationship includes:
acquiring the numbers of the service nodes and the log collection nodes in the distributed system;
and establishing a mapping relation between the service node and the log collection node according to a Hash algorithm based on the number.
In a second aspect, an embodiment of the present invention provides a log storage method, where the method is applied to a log collection node in a distributed system, where the distributed system further includes a service node, and the method includes:
receiving a log file sent by the service node, wherein the log file is generated by the service node according to a service processing condition and is sent according to a pre-established mapping relation between the service node and the log collection node;
and storing the log file, and establishing an index aiming at the stored log file.
Optionally, the step of establishing an index for the stored log file includes:
and aiming at the stored log files, establishing a full-text index based on a full-text search engine according to a preset time interval.
In a third aspect, an embodiment of the present invention provides a log query method, where the method is applied to a log collection node in a distributed system, where the distributed system further includes a service node, and the method includes:
receiving a log retrieval request sent by query equipment, wherein the log retrieval request comprises retrieval keywords;
querying, based on a pre-established index, log content corresponding to the retrieval keyword from a log file, wherein the log file is a stored log file that was generated by the service node according to a service processing condition and sent according to a pre-established mapping relation between the service node and the log collection node;
and sending the log content to the inquiry equipment.
Optionally, the index establishing method includes:
and aiming at the stored log files, establishing a full-text index based on a full-text search engine according to a preset time interval.
In a fourth aspect, an embodiment of the present invention provides a log storage apparatus, where the apparatus is applied to a service node in a distributed system, where the distributed system further includes a log collection node, and the apparatus includes:
the log file generation module is used for generating a log file according to the service processing condition;
and the log file sending module is used for sending the log file to the corresponding log collection node according to the mapping relation between the service node and the log collection node which is established in advance so as to enable the log collection node to store the log file and establish an index aiming at the stored log file.
Optionally, the log file generating module includes:
the content name determining unit is used for determining the log content and the log name according to the service processing condition;
and the log file generating unit is used for generating the log file of the service node with a key value pair structure by taking the log name and the node identifier of the service node as keys and taking the log content as a value.
Optionally, the apparatus further comprises:
the mapping relation establishing module is used for acquiring serial numbers of the service nodes and the log collection nodes in the distributed system; and establishing a mapping relation between the service node and the log collection node according to a Hash algorithm based on the number.
In a fifth aspect, an embodiment of the present invention provides a log storage apparatus, where the apparatus is applied to a log collection node in a distributed system, where the distributed system further includes a service node, and the apparatus includes:
a log file receiving module, configured to receive a log file sent by the service node, where the log file is generated by the service node according to a service processing condition and sent according to a pre-established mapping relationship between the service node and the log collecting node;
and the log file storage module is used for storing the log file and establishing an index aiming at the stored log file.
Optionally, the log file storage module includes:
and the index establishing unit is used for establishing a full-text index based on a full-text search engine according to a preset time interval aiming at the stored log files.
In a sixth aspect, an embodiment of the present invention provides a log query apparatus, where the apparatus is applied to a log collection node in a distributed system, where the distributed system further includes a service node, and the apparatus includes:
a retrieval request receiving module, configured to receive a log retrieval request sent by a query device, where the log retrieval request includes a retrieval keyword;
a log content query module, configured to query, based on a pre-established index, log content corresponding to the retrieval keyword from a log file, where the log file is a stored log file that was generated by the service node according to a service processing condition and sent according to a pre-established mapping relationship between the service node and the log collection node;
and the log content sending module is used for sending the log content to the query equipment.
Optionally, the apparatus further comprises:
and the index establishing module is used for establishing a full-text index based on a full-text search engine according to a preset time interval aiming at the stored log files.
In a seventh aspect, an embodiment of the present invention provides a service node, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete mutual communication through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the method steps of any of the first aspect described above when executing the program stored in the memory.
In an eighth aspect, an embodiment of the present invention provides a log collection node, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the method steps of any one of the second aspect or the third aspect when executing the program stored in the memory.
In a ninth aspect, an embodiment of the present invention provides a distributed system, where the distributed system includes:
the service node according to the seventh aspect and the log collection node according to the eighth aspect.
In a tenth aspect, an embodiment of the present invention provides a computer-readable storage medium storing instructions which, when executed by a processor of an electronic device, enable the electronic device to perform the method steps of any one of the first, second and third aspects.
In the scheme provided by the embodiment of the invention, the log collection node is added in the distributed system, the service node in the distributed system can generate the log file according to the service processing condition, and the log file is sent to the corresponding log collection node according to the mapping relation between the pre-established service node and the log collection node, so that the log collection node stores the log file, and the full-text index is established for the stored log file. Therefore, when log query is carried out, the log file is stored in the log collection node, so that the log query is carried out only in the log collection node, the processing of the service node on normal service is not influenced, and meanwhile, the log query efficiency is greatly improved as thousands of nodes in a distributed system do not need to be traversed.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic structural diagram of a distributed system according to an embodiment of the present invention;
Fig. 2 is a flowchart of a first log storage method according to an embodiment of the present invention;
Fig. 3 is a detailed flowchart of step S201 in the embodiment shown in Fig. 2;
Fig. 4 is a flowchart of a second log storage method according to an embodiment of the present invention;
Fig. 5 is a flowchart of a log query method according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a first log storage device according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a second log storage device according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a log query apparatus according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a service node according to an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a log collection node according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to avoid influencing normal service processing and improve log query efficiency, embodiments of the present invention provide a log storage method and apparatus, a log query method and apparatus, a distributed system, a service node, a log collection node, and a computer-readable storage medium.
The log storage method and the log query method provided by the embodiment of the invention can be applied to a distributed system. Specifically, the log storage method can be applied to the service nodes and the log collection nodes in the distributed system, and the log query method can be applied to the log collection nodes in the distributed system. A service node is a server on which a service runs. A log collection node is a server that stores log files and runs an application for managing the log files. Either kind of server may be a cloud host (i.e., a virtual machine or virtual server) or a physical server. As shown in fig. 1, the distributed system may include a service node 110 and a log collection node 120, and there may be multiple service nodes 110 and multiple log collection nodes 120. The service node 110 and the log collection node 120 establish a communication connection according to a mapping relationship established in advance to transmit log data.
First, a first log storage method provided by an embodiment of the present invention is described below.
As shown in fig. 2, a log storage method is applied to a service node in a distributed system, where the distributed system further includes a log collection node, and the method includes:
S201, generating a log file according to the service processing condition;
S202, according to the mapping relation between the service node and the log collection node which is established in advance, the log file is sent to the corresponding log collection node, so that the log collection node stores the log file, and an index is established for the stored log file.
It can be seen that, in the scheme provided in the embodiment of the present invention, a log collecting node is added in a distributed system, a service node in the distributed system may generate a log file according to a service processing condition, and the log file is sent to a corresponding log collecting node according to a mapping relationship between the service node and the log collecting node, which is established in advance, so that the log collecting node stores the log file, and a full-text index is established for the stored log file. Therefore, when log query is carried out, the log file is stored in the log collection node, so that the log query is carried out only in the log collection node, the processing of the service node on normal service is not influenced, and meanwhile, the log query efficiency is greatly improved as thousands of nodes in a distributed system do not need to be traversed.
In the distributed system, each service node is configured to process various services and generate a log file according to a service processing condition. The specific manner of generating the log file may be any log file generation manner in the log generation field, and is not specifically limited and described herein.
After the log file is generated, the service node may execute step S202, that is, send the log file to the corresponding log collecting node according to the mapping relationship between the service node and the log collecting node established in advance. In order to store the log file conveniently, the mapping relationship between the plurality of service nodes and the plurality of log collection nodes can be established in advance.
In an embodiment, since the log collection node is configured to store the log file and does not perform service processing, each log collection node may correspond to a plurality of service nodes and be configured to store the log files of the plurality of service nodes, that is, in the mapping relationship, one log collection node may correspond to a plurality of service nodes. Therefore, in the distributed system, the number of the log collection nodes can be far smaller than that of the service nodes, and log query can be conveniently carried out.
The service node can send log files to the corresponding log collection node in real time, that is, each time a log file is generated, the service node sends it to the log collection node, so that the log collection node can store the log file and establish an index as soon as possible.
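Steps S201 and S202 can be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: the class names `ServiceNode` and `LogCollectionNode`, the `receive` and `handle` methods, and the three-node mapping are all hypothetical, and a real deployment would send the log file over the intranet rather than through a direct method call.

```python
class LogCollectionNode:
    """In-memory stand-in for a log collection node (hypothetical)."""

    def __init__(self, number):
        self.number = number
        self.stored = []

    def receive(self, log_file):
        # Storage only; index construction is omitted from this sketch.
        self.stored.append(log_file)


class ServiceNode:
    """In-memory stand-in for a service node (hypothetical)."""

    def __init__(self, number, mapping, collectors):
        self.number = number
        # Pre-established mapping: service node number -> collection node number.
        self.collector = collectors[mapping[number]]

    def handle(self, service, message):
        # S201: generate a log file for the service that was just processed.
        log_file = {"node": self.number, "service": service, "message": message}
        # S202: send it at once to the mapped log collection node.
        self.collector.receive(log_file)
        return log_file


collectors = {0: LogCollectionNode(0), 1: LogCollectionNode(1)}
mapping = {0: 0, 1: 1, 2: 0}  # assumed mapping for three service nodes
node = ServiceNode(2, mapping, collectors)
node.handle("order", "created order 42")
print(len(collectors[0].stored), len(collectors[1].stored))  # 1 0
```

Because the service node resolves its collection node once from the mapping, each later send is a single hop and never requires a lookup across the cluster.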
After receiving the log file sent by the service node, the log collection node can store the log file and establish an index for the stored log file, so as to facilitate subsequent log queries. As an embodiment, the log collection node may store the log files in the form of a table, in which each column holds one item of log data.
[Table: example of log files stored in table form; published as an image in the original document]
In this case, the log collection node may establish an index according to the column names of the table, so that when a log query is required, the row containing the log to be queried can be determined from the index, and the log content to be queried can then be obtained.
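The table-plus-index idea can be illustrated with a small sketch. It assumes that each stored log file occupies one row and that one lookup structure is kept per column; the class `LogTable` and the column names `log_name`, `node_id`, and `content` are hypothetical and do not come from the original table.

```python
from collections import defaultdict


class LogTable:
    """Rows of log data plus one index per column (column value -> row numbers)."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = []
        self.index = {column: defaultdict(list) for column in self.columns}

    def insert(self, row):
        # Each received log file becomes a new row; every column index is updated.
        row_number = len(self.rows)
        self.rows.append(row)
        for column in self.columns:
            self.index[column][row.get(column)].append(row_number)

    def lookup(self, column, value):
        # The index gives the row numbers directly, so no full scan is needed.
        return [self.rows[i] for i in self.index[column].get(value, [])]


table = LogTable(["log_name", "node_id", "content"])
table.insert({"log_name": "order.error", "node_id": "10.0.0.7", "content": "payment timeout"})
table.insert({"log_name": "order.info", "node_id": "10.0.0.8", "content": "order created"})
print(table.lookup("node_id", "10.0.0.7"))
```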
In the log storage method provided by the embodiment of the present invention, distributed systems are generally deployed in an intranet, and the resources consumed on the service node are mainly the IO (Input/Output) resources and intranet bandwidth used to transmit log files, so the network consumption and the network delay are both essentially negligible.
As an implementation manner of the embodiment of the present invention, as shown in fig. 3, the step of generating the log file according to the service processing condition may include:
S301, determining log content and log name according to the service processing condition;
When a service node processes a service, it generates a log file according to the service processing condition. Specifically, the service node may determine the log content and the log name according to the service processing condition. The log name may be determined according to factors such as the specific content of the service and the log type. The log content is the specific content of the currently generated log and may include information such as the log creation time, the service node IP, and the log type.
S302, the log name and the node identification of the service node are used as keys, the log content is used as a value, and a log file of the service node with a key-value pair structure is generated.
Furthermore, the service node may use the log name and the node identifier of the service node as keys, use the log content as a value, generate a log file with a key-value pair structure, and send the log file to the corresponding log collection node. The log file is the log file of the service node, and each service node can generate a corresponding log file according to the above method and send each log file to a corresponding log collection node. The node identifier of the service node may be any information capable of uniquely identifying the service node, for example, a number, an IP address, and the like, which is not limited herein.
After the log collection node receives the log file of the key value pair structure, the log file can be stored quickly. In one embodiment, the log collection node may store the log file as a new line of content in the table.
As can be seen, in this embodiment, the service node may determine the log content and the log name according to the service processing condition, use the log name and the node identifier of the service node as keys, use the log content as a value, and generate a log file of the service node in a key-value pair structure. Therefore, the log file can be conveniently stored by the log collection node, and the log storage efficiency is improved.
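Steps S301 and S302 can be sketched as below. The sketch assumes that the key is a tuple of the log name and the node identifier and that the node identifier is an IP address; the function `build_log_file`, the `service.type` naming scheme, and the field names are illustrative assumptions rather than details given in this document.

```python
import time


def build_log_file(node_id, service, log_type, message, now=None):
    """Build one key-value log file: key = (log name, node id), value = log content."""
    created = time.time() if now is None else now
    # S301: derive the log name from the service and the log type (assumed scheme).
    log_name = f"{service}.{log_type}"
    content = {
        "created": created,
        "node_ip": node_id,  # the node identifier is assumed to be an IP address
        "type": log_type,
        "message": message,
    }
    # S302: the log name and node identifier form the key; the content is the value.
    return {"key": (log_name, node_id), "value": content}


log_file = build_log_file("10.0.0.7", "order", "error", "payment timeout", now=1591000000.0)
print(log_file["key"])  # ('order.error', '10.0.0.7')
```

Because the node identifier is part of the key, log files with the same name from different service nodes remain distinguishable after they land on a shared log collection node.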
As an implementation manner of the embodiment of the present invention, the establishing manner of the mapping relationship may include:
acquiring the numbers of the service nodes and the log collection nodes in the distributed system; and establishing a mapping relation between the service node and the log collection node according to a Hash algorithm based on the number.
Because the number of service nodes in the distributed system is generally very large, in order to facilitate log storage, a mapping relationship between the service nodes and the log collection nodes can be established in advance. Specifically, the service nodes and the log collection nodes in the distributed system may be numbered manually, for example, starting from 0, and numbering may be performed in order of natural numbers.
Then, a mapping relationship between the service nodes and the log collection nodes can be established according to a hash algorithm based on these numbers. Specifically, the service node can obtain the numbers of the service nodes and the log collection nodes in the distributed system, perform a hash operation on the number of each service node to obtain a hash value, and take the log collection node whose number equals the hash value as the log collection node corresponding to that service node. Of course, the mapping relationship may also be established manually, which is not specifically limited herein.
Therefore, in the embodiment, the numbers of the service nodes and the log collection nodes in the distributed system can be obtained, and the mapping relationship between the service nodes and the log collection nodes is established according to the hash algorithm based on the numbers, so that the mapping relationship between the service nodes and the log collection nodes can be quickly established, each service node can be ensured to have the corresponding log collection node, and the log file can be ensured to be smoothly stored.
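A minimal sketch of the number-based hash mapping follows. The document only says "a Hash algorithm"; using SHA-256 and reducing the digest modulo the number of log collection nodes is an assumption made for illustration, as are the function names `collector_for` and `build_mapping`. Both node types are assumed to be numbered from 0.

```python
import hashlib


def collector_for(service_number, collector_count):
    """Map a service node number to a collection node number in [0, collector_count)."""
    # Assumption: SHA-256 of the number, reduced modulo the collector count.
    digest = hashlib.sha256(str(service_number).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % collector_count


def build_mapping(service_count, collector_count):
    """Pre-establish the mapping for service nodes numbered 0..service_count-1."""
    return {s: collector_for(s, collector_count) for s in range(service_count)}


mapping = build_mapping(service_count=1000, collector_count=4)
print(mapping[0], mapping[1], mapping[2])
```

The mapping is deterministic, so any node can recompute it from the numbers alone, and every service node is guaranteed a valid log collection node number.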
Corresponding to the first log storage method, an embodiment of the present invention further provides another log storage method, and a second log storage method provided in the embodiment of the present invention is described below. The second log storage method provided by the embodiment of the invention can be applied to the log collection node in the distributed system.
As shown in fig. 4, a log storage method is applied to a log collection node in a distributed system, where the distributed system further includes a service node, and the method includes:
S401, receiving a log file sent by the service node;
The log file is generated by the service node according to the service processing condition and is sent according to the mapping relation between the service node and the log collection node which is established in advance.
S402, storing the log file, and establishing an index for the stored log file.
It can be seen that in the scheme provided in the embodiment of the present invention, a log collecting node is added in a distributed system, and the log collecting node in the distributed system can receive a log file sent by a service node, where the log file is generated by the service node according to a service processing condition, and is sent according to a mapping relationship between the service node and the log collecting node established in advance, stores the log file, and establishes an index for the stored log file. Therefore, when log query is carried out, the log file is stored in the log collection node, so that the log query is carried out only in the log collection node, the processing of the service node on normal service is not influenced, and meanwhile, the log query efficiency is greatly improved as thousands of nodes in a distributed system do not need to be traversed.
Since the specific implementation of the step S401 and the step S402 has already been described in the section of the first log storage method, detailed description thereof is omitted here.
As an implementation manner of the embodiment of the present invention, the step of establishing an index for the stored log file may include:
and aiming at the stored log files, establishing a full-text index based on a full-text search engine according to a preset time interval.
After receiving the log file sent by the service node, the log collection node can store the log file, and establish a full-text index based on a full-text search engine according to a preset time interval for the stored log file.
The preset time interval may be set according to the real-time requirement of the log query. If the real-time requirement is higher, the preset time interval may be shorter, for example, 1 minute, 100 seconds, or 2 minutes. If the real-time requirement is lower, the preset time interval may be longer, for example, 5 minutes, 10 minutes, or half an hour, so as to reduce the resources consumed in establishing the full-text index.
In an embodiment, the full-text search engine Lucene supports a compression mode that reduces disk usage, so that the disk remains available and log query is not affected when the log collection node performs operations such as data cleaning; the full-text index can therefore be established with Lucene. It is also reasonable to establish the full-text index with the full-text search engine Elasticsearch or the open-source analytical database ClickHouse.
As can be seen, in this embodiment, the log collection node may establish a full-text index based on a full-text search engine according to a preset time interval for the stored log file. The method can meet the real-time requirement of log query, and can ensure that a disk is available when the log collection node performs data cleaning and other operations, and log query is not influenced.
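The interval-based indexing can be illustrated with a toy inverted index. This sketch does not use Lucene, Elasticsearch, or ClickHouse; the class `IntervalIndexer` and its methods are hypothetical, and time is passed in explicitly so the behaviour is easy to follow. Newly stored logs become searchable only at the next refresh, which is the trade-off between real-time requirement and indexing cost described above.

```python
import re
from collections import defaultdict


class IntervalIndexer:
    """Toy inverted index that is refreshed at most once per preset interval."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds
        self.last_refresh = None
        self.documents = []                # stored log contents
        self.indexed_upto = 0              # documents[:indexed_upto] are searchable
        self.postings = defaultdict(set)   # token -> document ids

    def store(self, text):
        self.documents.append(text)

    def maybe_refresh(self, now):
        # Skip the refresh if the preset interval has not elapsed yet.
        if self.last_refresh is not None and now - self.last_refresh < self.interval:
            return False
        # Index only the documents stored since the previous refresh.
        for doc_id in range(self.indexed_upto, len(self.documents)):
            for token in re.findall(r"\w+", self.documents[doc_id].lower()):
                self.postings[token].add(doc_id)
        self.indexed_upto = len(self.documents)
        self.last_refresh = now
        return True

    def search(self, keyword):
        return [self.documents[i] for i in sorted(self.postings.get(keyword.lower(), ()))]


indexer = IntervalIndexer(interval_seconds=60)
indexer.store("disk full on /data")
indexer.maybe_refresh(now=0)    # first call always refreshes
indexer.store("disk replaced")
indexer.maybe_refresh(now=30)   # too early: the new log is not searchable yet
print(indexer.search("disk"))   # ['disk full on /data']
indexer.maybe_refresh(now=60)   # interval elapsed: the new log becomes searchable
print(indexer.search("disk"))   # ['disk full on /data', 'disk replaced']
```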
As an implementation manner of the embodiment of the present invention, the method may further include:
and performing timed cleaning and/or maintenance on the stored log file.
In order to improve the performance of the log collection node and avoid resource waste, the log collection node can clean the log file at regular time, delete expired historical data and ensure the speed of storing and querying the log file. Meanwhile, the log collection node can maintain logs at regular time according to actual requirements, so that log file storage and query can be guaranteed to be performed smoothly.
Therefore, in this embodiment, the log collection node can perform regular cleaning and/or maintenance on the stored log file, and can ensure that the log file is stored and queried smoothly, improve the performance of the log collection node, and avoid resource waste.
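As a rough illustration of timed cleaning, the sketch below drops records older than a retention window. The record layout (a dictionary with a `timestamp` field), the function name `clean_expired`, and the retention period are assumptions made for this example; the source does not specify them.

```python
from datetime import datetime, timedelta


def clean_expired(records, now, retention):
    """Return only the records whose timestamp falls within the retention window."""
    cutoff = now - retention
    return [r for r in records if r["timestamp"] >= cutoff]


# Hypothetical usage: keep the last 7 days of log records.
example_now = datetime(2020, 6, 12, 12, 0, 0)
example_records = [
    {"timestamp": datetime(2020, 6, 1, 9, 0, 0), "content": "old entry"},
    {"timestamp": datetime(2020, 6, 12, 11, 0, 0), "content": "recent entry"},
]
example_kept = clean_expired(example_records, example_now, timedelta(days=7))
```

In practice such a function would be invoked by a scheduler at a fixed period; the scheduling mechanism itself is left out of the sketch.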
Corresponding to the log storage method, an embodiment of the present invention further provides a log query method, and the log query method provided by the embodiment of the present invention is introduced below. The log query method provided by the embodiment of the invention can be applied to the log collection node in the distributed system.
As shown in fig. 5, a log query method is applied to a log collection node in a distributed system, where the distributed system further includes a service node, and the method includes:
S501, receiving a log retrieval request sent by a query device;
wherein the log retrieval request comprises a retrieval keyword.
S502, querying, based on a pre-established index, log content corresponding to the retrieval keyword from a log file;
wherein the log file is a stored log file that is generated by the service node according to the service processing condition and is sent according to a pre-established mapping relationship between the service node and the log collection node.
S503, sending the log content to the query device.
It can be seen that, in the scheme provided in the embodiment of the present invention, a log collection node is added to the distributed system. The log collection node receives a log retrieval request that is sent by a query device and includes a retrieval keyword, queries the log content corresponding to the retrieval keyword from a log file based on a pre-established index, and sends the log content to the query device. The log file is a stored log file that was generated by the service node according to a service processing condition and sent according to a pre-established mapping relationship between the service node and the log collection node. Because the log file is stored in the log collection node, a log query is performed only in the log collection node, so normal service processing on the service node is not affected, and log query efficiency is greatly improved because the thousands of nodes in the distributed system do not need to be traversed.
After receiving the log retrieval request sent by the query device, the log collection node may query the log content corresponding to the retrieval keyword from the log file based on the pre-established index, that is, execute step S502. The log retrieval request includes at least one retrieval keyword, and there may be one or more of them, for example service node identifier + service node IP, log name, and the like.
The log file is a stored log file that was generated by the service node according to the service processing condition and sent according to the pre-established mapping relationship between the service node and the log collection node, that is, a log file stored according to the log storage method described above.
The log collection node may locate the log content corresponding to the retrieval keyword according to the pre-established index. For example, if the log file stored by the log collection node is as shown in the table above, and the retrieval keyword included in the log retrieval request is "service node identifier: 00004 + log name: 1", the log collection node can determine from the pre-established index that the corresponding log content is the content in rows 9 and 10 of the table.
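Steps S501 to S503 can be sketched as a lookup keyed on the retrieval keywords. The sample rows below are hypothetical placeholders and are not the contents of the table referred to above; the function names `build_index` and `handle_retrieval_request` and the field names are likewise assumptions for illustration only.

```python
from collections import defaultdict


def build_index(log_file):
    """Map (node identifier, log name) to the row positions that hold matching entries."""
    index = defaultdict(list)
    for row, entry in enumerate(log_file):
        index[(entry["node_id"], entry["log_name"])].append(row)
    return index


def handle_retrieval_request(request, index, log_file):
    """Look up the retrieval keywords in the index and return the matching log content."""
    key = (request["node_id"], request["log_name"])
    return [log_file[row]["content"] for row in index.get(key, [])]


# Hypothetical stored log file (placeholder rows, not taken from the source).
sample_log_file = [
    {"node_id": "00003", "log_name": "1", "content": "placeholder A"},
    {"node_id": "00004", "log_name": "1", "content": "placeholder B"},
    {"node_id": "00004", "log_name": "1", "content": "placeholder C"},
    {"node_id": "00004", "log_name": "2", "content": "placeholder D"},
]
sample_index = build_index(sample_log_file)
sample_result = handle_retrieval_request(
    {"node_id": "00004", "log_name": "1"}, sample_index, sample_log_file
)
```

Because the lookup goes through the index, only the matching rows are read; the log file is not scanned for each request.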
As an implementation manner of the embodiment of the present invention, the establishment manner of the index may include:
and aiming at the stored log files, establishing a full-text index based on a full-text search engine according to a preset time interval.
Since the establishment method of the index has already been introduced in the above-mentioned embodiment of the log storage method, it is not described herein again.
Corresponding to the first log storage method, an embodiment of the present invention further provides a log storage device, and the first log storage device provided in the embodiment of the present invention is described below.
As shown in fig. 6, a log storage apparatus, which is applied to a service node in a distributed system, where the distributed system further includes a log collection node, includes:
a log file generating module 610, configured to generate a log file according to a service processing condition;
and a log file sending module 620, configured to send the log file to a corresponding log collection node according to a mapping relationship between the service node and the log collection node, where the mapping relationship is established in advance, so that the log collection node stores the log file, and establishes an index for the stored log file.
It can be seen that, in the scheme provided in the embodiment of the present invention, a log collecting node is added in a distributed system, a service node in the distributed system may generate a log file according to a service processing condition, and the log file is sent to a corresponding log collecting node according to a mapping relationship between the service node and the log collecting node, which is established in advance, so that the log collecting node stores the log file, and a full-text index is established for the stored log file. Therefore, when log query is carried out, the log file is stored in the log collection node, so that the log query is carried out only in the log collection node, the processing of the service node on normal service is not influenced, and meanwhile, the log query efficiency is greatly improved as thousands of nodes in a distributed system do not need to be traversed.
As an implementation manner of the embodiment of the present invention, the log file generating module 610 may include:
the content name determining unit is used for determining the log content and the log name according to the service processing condition;
and the log file generating unit is used for generating the log file of the service node with a key value pair structure by taking the log name and the node identifier of the service node as keys and taking the log content as a value.
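The key-value structure produced by the log file generating unit might look like the following sketch. Representing the key as a `(node_id, log_name)` tuple and the value as a list of content lines is an assumption made here for illustration; the source only states that the log name and node identifier form the key and the log content is the value.

```python
def add_log_entry(log_file, node_id, log_name, log_content):
    """Append log content under a key formed from the node identifier and log name."""
    key = (node_id, log_name)
    log_file.setdefault(key, []).append(log_content)
    return log_file


# Hypothetical usage with placeholder values.
example_log_file = {}
add_log_entry(example_log_file, "00004", "1", "request received")
add_log_entry(example_log_file, "00004", "1", "request completed")
add_log_entry(example_log_file, "00004", "2", "cache refreshed")
```

Keying on both fields keeps entries from different service nodes distinct even when they share a log name.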
As an implementation manner of the embodiment of the present invention, the apparatus may further include:
a mapping relationship establishing module (not shown in fig. 6), configured to obtain the serial numbers of the service nodes and the log collection nodes in the distributed system, and to establish the mapping relationship between the service nodes and the log collection nodes according to a hash algorithm based on the serial numbers.
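One simple way to realize a hash-based mapping from node numbers is modulo hashing, sketched below. The choice of modulo hashing and the function name `build_mapping` are assumptions; this passage does not state which hash algorithm is used.

```python
def build_mapping(service_numbers, collector_numbers):
    """Assign each service node number to a log collection node number by modulo hashing."""
    collectors = sorted(collector_numbers)
    if not collectors:
        raise ValueError("at least one log collection node is required")
    return {s: collectors[s % len(collectors)] for s in service_numbers}


# Hypothetical usage: five service nodes spread over two log collection nodes.
example_mapping = build_mapping([0, 1, 2, 3, 4], [0, 1])
```

With this scheme every service node can compute its log collection node locally from the two sets of numbers, without consulting a central registry. A known drawback of plain modulo hashing is that changing the number of log collection nodes remaps most service nodes.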
Corresponding to the second log storage method, an embodiment of the present invention further provides another log storage device, and the second log storage device provided in the embodiment of the present invention is described below.
As shown in fig. 7, a log storage apparatus, which is applied to a log collection node in a distributed system, where the distributed system further includes a service node, includes:
a log file receiving module 710, configured to receive a log file sent by the service node;
and the log file is generated by the service node according to the service processing condition and is sent according to the mapping relation between the service node and the log collection node which is established in advance.
And a log file storage module 720, configured to store the log file, and build an index for the stored log file.
It can be seen that in the scheme provided in the embodiment of the present invention, a log collecting node is added in a distributed system, and the log collecting node in the distributed system can receive a log file sent by a service node, where the log file is generated by the service node according to a service processing condition, and is sent according to a mapping relationship between the service node and the log collecting node established in advance, stores the log file, and establishes an index for the stored log file. Therefore, when log query is carried out, the log file is stored in the log collection node, so that the log query is carried out only in the log collection node, the processing of the service node on normal service is not influenced, and meanwhile, the log query efficiency is greatly improved as thousands of nodes in a distributed system do not need to be traversed.
As an implementation manner of the embodiment of the present invention, the log file storage module 720 may include:
and the index establishing unit is used for establishing a full-text index based on a full-text search engine according to a preset time interval aiming at the stored log files.
Corresponding to the log query method, an embodiment of the present invention further provides a log query device, and a description is given below of the log query device provided in the embodiment of the present invention.
As shown in fig. 8, a log query apparatus, which is applied to a log collection node in a distributed system, where the distributed system further includes a service node, includes:
a retrieval request receiving module 810, configured to receive a log retrieval request sent by a query device;
wherein the log retrieval request comprises a retrieval keyword.
And a log content query module 820, configured to query, based on a pre-established index, log content corresponding to the retrieval keyword from a log file.
The log file is a stored log file that is generated by the service node according to the service processing condition and is sent according to a pre-established mapping relationship between the service node and the log collection node.
A log content sending module 830, configured to send the log content to the querying device.
It can be seen that, in the scheme provided in the embodiment of the present invention, a log collection node is added to the distributed system. The log collection node receives a log retrieval request that is sent by a query device and includes a retrieval keyword, queries the log content corresponding to the retrieval keyword from a log file based on a pre-established index, and sends the log content to the query device. The log file is a stored log file that was generated by the service node according to a service processing condition and sent according to a pre-established mapping relationship between the service node and the log collection node. Because the log file is stored in the log collection node, a log query is performed only in the log collection node, so normal service processing on the service node is not affected, and log query efficiency is greatly improved because the thousands of nodes in the distributed system do not need to be traversed.
As an implementation manner of the embodiment of the present invention, the apparatus may further include:
and an index establishing module (not shown in fig. 8) for establishing a full-text index based on the full-text search engine according to a preset time interval for the stored log files.
An embodiment of the present invention further provides a distributed system, as shown in fig. 1, where the distributed system includes a service node 110 and a log collecting node 120, where:
a service node 110, configured to perform the first log storage method steps according to any of the foregoing embodiments;
the log collection node 120 is configured to perform the second log storage method step and/or the log query method step according to any of the above embodiments.
It can be seen that, in the scheme provided in the embodiment of the present invention, the distributed system includes a service node and a log collection node. The log collection node receives a log retrieval request that is sent by the query device and includes a retrieval keyword, queries the log content corresponding to the retrieval keyword from a log file based on a pre-established index, and sends the log content to the query device. The log file is a stored log file that was generated by the service node according to a service processing condition and sent according to a pre-established mapping relationship between the service node and the log collection node. Because the log file is stored in the log collection node, a log query is performed only in the log collection node, so normal service processing on the service node is not affected, and log query efficiency is greatly improved because the thousands of nodes in the distributed system do not need to be traversed.
The number of log collection nodes can be determined according to the log QPS (queries per second) of the distributed system; as an implementation manner, one log collection node can be added for every ten thousand QPS. The log collection node does not carry out service processing and is only responsible for storing logs and responding to log query requests, so the service processing of the service nodes is not affected.
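The sizing rule of one log collection node per ten thousand QPS can be written as a short calculation. Rounding up and keeping a minimum of one node are assumptions added for this sketch, and the function name `collector_count` is hypothetical.

```python
def collector_count(log_qps, qps_per_node=10_000):
    """Estimate the number of log collection nodes, rounding up, with at least one node."""
    if log_qps < 0 or qps_per_node <= 0:
        raise ValueError("log_qps must be non-negative and qps_per_node positive")
    return max(1, -(-log_qps // qps_per_node))  # integer ceiling division
```

For instance, under these assumptions a log QPS of 25,000 would call for three log collection nodes.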
It should be noted that the number and the connection relationship of the log collection nodes and the service nodes in fig. 1 are only an example, and are used to illustrate the number and the connection relationship of the log collection nodes and the service nodes in an embodiment, and cannot constitute a limitation on the number and the connection relationship of the log collection nodes and the service nodes in the present invention.
The embodiment of the present invention further provides a service node, as shown in fig. 9, the service node may include a processor 901, a communication interface 902, a memory 903 and a communication bus 904, where the processor 901, the communication interface 902, and the memory 903 complete mutual communication through the communication bus 904,
a memory 903 for storing computer programs;
the processor 901 is configured to implement the first log storage method steps described in any of the above embodiments when executing the program stored in the memory 903.
It can be seen that, in the scheme provided in the embodiment of the present invention, a log collecting node is added in a distributed system, a service node in the distributed system may generate a log file according to a service processing condition, and the log file is sent to a corresponding log collecting node according to a mapping relationship between the service node and the log collecting node, which is established in advance, so that the log collecting node stores the log file, and a full-text index is established for the stored log file. Therefore, when log query is carried out, the log file is stored in the log collection node, so that the log query is carried out only in the log collection node, the processing of the service node on normal service is not influenced, and meanwhile, the log query efficiency is greatly improved as thousands of nodes in a distributed system do not need to be traversed.
The communication bus mentioned in the service node may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the service node and other devices.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component.
The embodiment of the present invention further provides a log collection node, as shown in fig. 10, the log collection node may include a processor 1001, a communication interface 1002, a memory 1003 and a communication bus 1004, where the processor 1001, the communication interface 1002 and the memory 1003 complete mutual communication through the communication bus 1004,
a memory 1003 for storing a computer program;
the processor 1001 is configured to implement the second log storage method step and/or the log query method step described in any of the above embodiments when executing the program stored in the memory 1003.
It can be seen that in the scheme provided in the embodiment of the present invention, a log collecting node is added in a distributed system, and the log collecting node in the distributed system can receive a log file sent by a service node, where the log file is generated by the service node according to a service processing condition, and is sent according to a mapping relationship between the service node and the log collecting node established in advance, stores the log file, and establishes an index for the stored log file. Therefore, when log query is carried out, the log file is stored in the log collection node, so that the log query is carried out only in the log collection node, the processing of the service node on normal service is not influenced, and meanwhile, the log query efficiency is greatly improved as thousands of nodes in a distributed system do not need to be traversed.
The communication bus mentioned in the above log collection node may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
And the communication interface is used for communication between the log collection node and other devices.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component.
The embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the method steps described in any of the above embodiments.
It can be seen that, in the solution provided in the embodiment of the present invention, when the computer program is executed by the processor of the service node, the log file may be generated according to the service processing condition, and the log file is sent to the corresponding log collecting node according to the mapping relationship between the service node and the log collecting node that is established in advance, so that the log collecting node stores the log file, and establishes the full-text index for the stored log file. When being executed by a processor of the log collection node, the computer program can receive a log file sent by the service node, wherein the log file is generated by the service node according to the service processing condition and sent according to a mapping relation between the service node and the log collection node which is established in advance, the log file is stored, an index is established for the stored log file, or a log retrieval request sent by the query device is received, log contents corresponding to retrieval keywords are queried from the log file based on the index which is established in advance, and then the log contents are sent to the query device. Therefore, when log query is carried out, the log file is stored in the log collection node, so that the log query is carried out only in the log collection node, the processing of the service node on normal service is not influenced, and meanwhile, the log query efficiency is greatly improved as thousands of nodes in a distributed system do not need to be traversed.
It should be noted that, for the above-mentioned apparatus, system, service node, log collecting node and computer-readable storage medium embodiments, since they are basically similar to the corresponding method embodiments, the description is relatively simple, and for relevant points, refer to the partial description of the method embodiments.
It is further noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (18)

1. A log storage method is applied to service nodes in a distributed system, the distributed system further comprises a log collection node, and the method comprises the following steps:
generating a log file according to the service processing condition;
and sending the log file to a corresponding log collection node according to a mapping relation between the service node and the log collection node which is established in advance, so that the log collection node stores the log file, and an index is established for the stored log file.
2. The method of claim 1, wherein the step of generating a log file according to the business process condition comprises:
determining log content and a log name according to the service processing condition;
and generating a log file of the service node with a key-value pair structure by taking the log name and the node identifier of the service node as keys and taking the log content as a value.
3. The method of claim 1 or 2, wherein the mapping relationship is established in a manner that includes:
acquiring the numbers of the service nodes and the log collection nodes in the distributed system;
and establishing a mapping relation between the service node and the log collection node according to a Hash algorithm based on the number.
4. A log storage method is applied to a log collection node in a distributed system, the distributed system further comprises a service node, and the method comprises the following steps:
receiving a log file sent by the service node, wherein the log file is generated by the service node according to a service processing condition and is sent according to a pre-established mapping relation between the service node and the log collection node;
and storing the log file, and establishing an index aiming at the stored log file.
5. The method of claim 4, wherein the step of indexing against the stored log files comprises:
and aiming at the stored log files, establishing a full-text index based on a full-text search engine according to a preset time interval.
6. A log query method is applied to a log collection node in a distributed system, the distributed system further comprises a service node, and the method comprises the following steps:
receiving a log retrieval request sent by query equipment, wherein the log retrieval request comprises retrieval keywords;
based on a pre-established index, querying log content corresponding to the retrieval keyword from a log file, wherein the log file is a stored log file that is generated by the service node according to a service processing condition and is sent according to a pre-established mapping relation between the service node and the log collection node;
and sending the log content to the query equipment.
7. The method of claim 6, wherein the index is established in a manner comprising:
and aiming at the stored log files, establishing a full-text index based on a full-text search engine according to a preset time interval.
8. A log storage apparatus applied to a service node in a distributed system, the distributed system further including a log collection node, the apparatus comprising:
the log file generation module is used for generating a log file according to the service processing condition;
and the log file sending module is used for sending the log file to the corresponding log collection node according to the mapping relation between the service node and the log collection node which is established in advance so as to enable the log collection node to store the log file and establish an index aiming at the stored log file.
9. The apparatus of claim 8, wherein the log file generation module comprises:
the content name determining unit is used for determining the log content and the log name according to the service processing condition;
and the log file generating unit is used for generating the log file of the service node with a key value pair structure by taking the log name and the node identifier of the service node as keys and taking the log content as a value.
10. The apparatus of claim 8 or 9, wherein the apparatus further comprises:
the mapping relation establishing module is used for acquiring serial numbers of the service nodes and the log collection nodes in the distributed system; and establishing a mapping relation between the service node and the log collection node according to a Hash algorithm based on the number.
11. A log storage apparatus, wherein the apparatus is applied to a log collection node in a distributed system, the distributed system further includes a service node, and the apparatus comprises:
a log file receiving module, configured to receive a log file sent by the service node, where the log file is generated by the service node according to a service processing condition and sent according to a pre-established mapping relationship between the service node and the log collecting node;
and the log file storage module is used for storing the log file and establishing an index aiming at the stored log file.
12. The apparatus of claim 11, wherein the log file storage module comprises:
and the index establishing unit is used for establishing a full-text index based on a full-text search engine according to a preset time interval aiming at the stored log files.
13. A log query apparatus, applied to a log collection node in a distributed system, the distributed system further including a service node, the apparatus comprising:
a retrieval request receiving module, configured to receive a log retrieval request sent by query equipment, wherein the log retrieval request comprises a retrieval keyword;
a log content query module, configured to query, based on a pre-established index, log content corresponding to the retrieval keyword from a log file, where the log file is a stored log file that is generated by the service node according to a service processing condition and is sent according to a pre-established mapping relationship between the service node and the log collection node;
and the log content sending module is used for sending the log content to the query equipment.
14. The apparatus of claim 13, wherein the apparatus further comprises:
and the index establishing module is used for establishing a full-text index based on a full-text search engine according to a preset time interval aiming at the stored log files.
15. A service node, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other via the communication bus;
the memory is used for storing a computer program;
the processor, when executing the program stored in the memory, implementing the method steps of any of claims 1-3.
16. A log collection node, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other via the communication bus;
the memory is used for storing a computer program;
the processor, when executing the program stored in the memory, implementing the method steps of any of claims 4-7.
17. A distributed system, comprising:
a service node as claimed in claim 15 and a log collection node as claimed in claim 16.
18. A computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method steps of any of claims 1-7.
CN202010538309.1A 2020-06-12 2020-06-12 Log storage method and device and log query method and device Pending CN111694793A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010538309.1A CN111694793A (en) 2020-06-12 2020-06-12 Log storage method and device and log query method and device


Publications (1)

Publication Number Publication Date
CN111694793A true CN111694793A (en) 2020-09-22

Family

ID=72480845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010538309.1A Pending CN111694793A (en) 2020-06-12 2020-06-12 Log storage method and device and log query method and device

Country Status (1)

Country Link
CN (1) CN111694793A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112291085A (en) * 2020-10-10 2021-01-29 北京金山云网络技术有限公司 Fault positioning method, device, equipment and medium
CN112613853A (en) * 2020-12-31 2021-04-06 平安养老保险股份有限公司 Data aggregation method and device, computer equipment and readable storage medium
CN113794640A (en) * 2021-08-20 2021-12-14 新华三信息安全技术有限公司 Message processing method, device, equipment and machine readable storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007071343A2 (en) * 2005-12-22 2007-06-28 Sap Ag Systems and methods for finding log files generated by a distributed computer
CN101163265A (en) * 2007-11-20 2008-04-16 中兴通讯股份有限公司 Multimedia message log query method and system based on a distributed database
CN105045905A (en) * 2015-08-07 2015-11-11 北京思特奇信息技术股份有限公司 Log maintenance method and system based on full-text retrieval
CN107239382A (en) * 2017-06-23 2017-10-10 深圳市冬泉谷信息技术有限公司 Log processing method and system for a container application
CN107291928A (en) * 2017-06-29 2017-10-24 国信优易数据有限公司 Log storage system and method
CN109376136A (en) * 2018-10-19 2019-02-22 郑州云海信息技术有限公司 Distributed log processing system, network device and method
CN109684279A (en) * 2017-10-18 2019-04-26 中移(苏州)软件技术有限公司 Data processing method and system
CN109800223A (en) * 2018-12-12 2019-05-24 平安科技(深圳)有限公司 Log processing method, device, electronic equipment and storage medium
CN110413586A (en) * 2019-08-05 2019-11-05 山东浪潮通软信息科技有限公司 Distributed log management method and system

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN112291085A (en) * 2020-10-10 2021-01-29 北京金山云网络技术有限公司 Fault positioning method, device, equipment and medium
CN112291085B (en) * 2020-10-10 2023-01-20 北京金山云网络技术有限公司 Fault positioning method, device, equipment and medium
CN112613853A (en) * 2020-12-31 2021-04-06 平安养老保险股份有限公司 Data aggregation method and device, computer equipment and readable storage medium
CN113794640A (en) * 2021-08-20 2021-12-14 新华三信息安全技术有限公司 Message processing method, device, equipment and machine readable storage medium
CN113794640B (en) * 2021-08-20 2022-11-18 新华三信息安全技术有限公司 Message processing method, device, equipment and machine readable storage medium

Similar Documents

Publication Publication Date Title
CN108009236B (en) Big data query method, system, computer and storage medium
CN109741060B (en) Information inquiry system, method, device, electronic equipment and storage medium
US9378053B2 (en) Generating map task output with version information during map task execution and executing reduce tasks using the output including version information
CN111694793A (en) Log storage method and device and log query method and device
CN110555012B (en) Data migration method and device
CN101902505A (en) Distributed DNS inquiry log real-time statistic device and method thereof
CN111680108B (en) Data storage method and device and data acquisition method and device
CN111258978B (en) Data storage method
US20200159841A1 (en) Approach for a controllable trade-off between cost and availability of indexed data in a cloud log aggregation solution such as splunk or sumo
CN104090901A (en) Method, device and server for processing data
CN111400288A (en) Data quality inspection method and system
CN110955704A (en) Data management method, device, equipment and storage medium
CN111488377A (en) Data query method and device, electronic equipment and storage medium
CN111782692A (en) Frequency control method and device
CN102955802A (en) Method and device for acquiring data from data reports
CN112835885B (en) Processing method, device and system for distributed form storage
CN109885729B (en) Method, device and system for displaying data
CN103412883A (en) Semantic intelligent information publishing and subscribing method based on P2P technology
CN110737432A (en) script aided design method and device based on root list
Wang et al. Sublinear algorithms for big data applications
Aslam et al. Pre‐filtering based summarization for data partitioning in distributed stream processing
CN110888840A (en) File query method, device, equipment and medium in distributed file system
CN111159135A (en) Data processing method and device, electronic equipment and storage medium
JP2003316811A (en) Inquiry optimization processing device in different kind of database integration system, method and program making computer execute the method
CN111881086B (en) Big data storage method, query method, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination