CN113849329A - Log analysis and integration method and system for operating system - Google Patents


Publication number
CN113849329A
Authority
CN
China
Prior art keywords
log
suspicious
operating system
logs
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110989919.8A
Other languages
Chinese (zh)
Other versions
CN113849329B (en)
Inventor
张旭芳
匡志鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202110989919.8A priority Critical patent/CN113849329B/en
Publication of CN113849329A publication Critical patent/CN113849329A/en
Application granted granted Critical
Publication of CN113849329B publication Critical patent/CN113849329B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703: Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766: Error or fault reporting or storing
    • G06F11/0781: Error filtering or prioritizing based on a policy defined by the user or on a policy defined by a hardware/software module, e.g. according to a severity level
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a log analysis and integration method and system for an operating system. The method comprises the following steps: screening suspicious logs from the log files of an operating system; searching the suspicious logs for a fault problem of the operating system and determining the problem type of the fault problem; selecting a log collection tool corresponding to the configuration information of the operating system contained in the suspicious logs; collecting the problem analysis logs corresponding to the problem type by using the log collection tool; and integrating and storing the problem analysis logs. The technical scheme addresses the shortcomings of prior-art log problem handling, which is time-consuming and labor-intensive, passive, and slow to resolve problems.

Description

Log analysis and integration method and system for operating system
Technical Field
The invention relates to the technical field of servers, in particular to a log analysis and integration method and system for an operating system.
Background
In the field of servers, the stable operation of an operating system depends on its log system. For example, the Linux operating system has a powerful log function: system logs are collected and managed by the journald and rsyslog log services, can be used for system auditing and troubleshooting, and are stored persistently in the /var/log directory. The messages log file in the /var/log/ directory records common system and service error information of the Linux operating system and is the most important log file for locating server system faults.
In addition to the default logging performed by the operating system's own log services, operating system vendors typically provide additional tools (e.g., the sosreport tool available on RHEL, CentOS, Debian, Ubuntu, etc., and the supportconfig tool available on SUSE systems) to collect more system logs and configuration information. When the operating system fails or errors are found in the messages log file, manually running these tools gathers more information to help locate the fault.
However, the tools provided by the operating system or by vendors are not sufficient for locating fault problems, because the operating system has no early-warning mechanism for errors in the logs. Operators must either actively inspect the system logs to check how the system is running, or wait until the server suffers a serious fault, such as a crash, a restart, or slow operation, before personnel are organized to analyze the system logs and, based on the analysis results, carry out further log collection. This way of handling log problems is time-consuming and labor-intensive, very passive, and delays problem resolution.
Disclosure of Invention
The invention provides a log analysis and integration method and system for an operating system, and aims to solve the problems of prior-art log problem handling, which is time-consuming and labor-intensive, passive, and slow to resolve problems.
According to a first aspect of the present invention, the present invention provides a log analysis and integration method for an operating system, including:
screening out suspicious logs from log files of an operating system;
searching the fault problem of the operating system from the suspicious log, and determining the problem type of the fault problem;
selecting a log collection tool corresponding to the configuration information according to the configuration information of the operating system contained in the suspicious log;
collecting a problem analysis log corresponding to the problem type by using a log collection tool;
integrating and storing the problem analysis logs.
Preferably, the step of screening the suspicious log from the log file of the operating system includes:
screening a first suspicious log set from a log file by using a preset keyword;
and filtering the first suspicious log set to obtain a second suspicious log set.
Preferably, the step of filtering the first suspicious log set to obtain the second suspicious log set includes:
retrieving negligible log contents in the first suspicious log set, and removing the negligible logs from the first suspicious log set;
retrieving common problems in the first suspicious log set, and removing the common problems from the first suspicious log set;
and integrating the first suspicious log set with the negligible logs and the common problems removed into a second suspicious log set.
Preferably, the step of searching the suspicious log for the fault problem of the operating system includes:
searching for the fault problem in the second suspicious log set by using the fault problem keyword corresponding to the fault type;
when the fault problem is found, classifying the fault problem into the corresponding problem type;
determining the problem analysis log required by the problem type.
Preferably, the step of selecting the log collection tool corresponding to the configuration information includes:
analyzing the content of the log file to obtain the configuration information of the operating system;
selecting a log collection tool corresponding to the configuration information;
the log collection tool is installed to a predetermined location of the operating system.
Preferably, the integrating and storing the problem analysis log includes:
identifying and removing repeated contents in all collected problem analysis logs;
packaging and compressing all the problem analysis logs with the repeated contents removed to obtain compressed problem analysis logs;
and storing the compressed problem analysis log.
According to a second aspect of the present invention, the present invention further provides a log analysis and integration system of an operating system, including:
the log screening module is used for screening out suspicious logs from log files of the operating system;
the problem searching module is used for searching the fault problem of the operating system from the suspicious log and determining the problem type of the fault problem;
the tool selection module is used for selecting a log collection tool corresponding to the configuration information according to the configuration information of the operating system contained in the suspicious log;
the log collection module is used for collecting a problem analysis log corresponding to the problem type by using a log collection tool;
the log integration module is used for integrating the problem analysis logs;
and the log storage module is used for storing the problem analysis log.
Preferably, the log screening module includes:
the first log screening submodule is used for screening a first suspicious log set from the log file by using a preset keyword;
and the second log screening submodule is used for filtering the first suspicious log set to obtain a second suspicious log set.
Preferably, the problem finding module includes:
the fault problem searching submodule searches a fault problem from the second suspicious log set by using a fault problem keyword corresponding to the fault type;
the problem type division submodule is used for dividing the fault problem into corresponding problem types when the fault problem is found out;
and the log determining submodule is used for determining a problem analysis log corresponding to the problem type.
Preferably, the tool selecting module includes:
the content analysis submodule is used for analyzing the content of the log file and acquiring the configuration information of the operating system;
the tool selection submodule is used for selecting a log collection tool corresponding to the configuration information;
and the tool installation submodule is used for installing the log collection tool to a preset position of the operating system.
According to the log analysis and integration scheme of the operating system provided by the invention, suspicious logs are screened from the log files of the operating system; fault problems of the operating system are searched for in the suspicious logs and their problem types are determined; a log collection tool corresponding to the configuration information of the operating system contained in the suspicious logs is selected; the log collection tool is used to collect the problem analysis logs corresponding to the problem type; and the problem analysis logs are integrated and stored. Detailed and comprehensive problem analysis logs can thus be provided to the relevant operators, helping them locate specific fault problems and solve them quickly. In summary, the technical scheme automatically detects error logs or faults in the log files while the server is running, performs a preliminary diagnosis after filtering out common problems and negligible logs, starts the corresponding log collection tool according to the nature of the problem, and collects, integrates, and stores the logs for the specific problem. The administrator thereby receives detailed and comprehensive log output and can locate and solve problems more quickly, which addresses the shortcomings of existing log problem handling: it is time-consuming and labor-intensive, passive, and slow to resolve problems.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the structures shown in the drawings without creative efforts.
FIG. 1 is a schematic flowchart of a log analysis and integration method for an operating system according to a first embodiment of the present invention;
FIG. 2 is a schematic flowchart of a suspicious log screening method according to the embodiment shown in FIG. 1;
FIG. 3 is a schematic flowchart of a fault problem searching method according to the embodiment shown in FIG. 1;
FIG. 4 is a schematic flowchart of a log collection tool selecting method according to an embodiment of the present invention;
FIG. 5 is a schematic flowchart of a problem analysis log integration and storage method according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a first log analysis and integration system of an operating system according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a log screening module according to the embodiment shown in FIG. 6;
FIG. 8 is a schematic structural diagram of a problem searching module according to the embodiment shown in FIG. 6;
FIG. 9 is a schematic structural diagram of a tool selection module according to the embodiment shown in FIG. 6.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The main technical problems of the embodiment of the invention are as follows:
In the prior art, fault problems are investigated with the log tools provided by the operating system or by vendors. However, these tools are not sufficient for locating fault problems, because the operating system has no early-warning mechanism for errors in the logs. Operators must either actively inspect the system logs to check how the system is running, or wait until the server suffers a serious fault, such as a crash, a restart, or slow operation, before personnel are organized to analyze the system logs and, based on the analysis results, carry out further log collection. This way of handling log problems is time-consuming and labor-intensive, very passive, and delays problem resolution.
To solve the above problem, referring to fig. 1 in particular, fig. 1 is a schematic flowchart of a log analysis and integration method of an operating system according to an embodiment of the present invention. As shown in fig. 1, the log analysis and integration method of the operating system includes:
S110: screening suspicious logs from the log files of the operating system. In the Linux operating system, the log file mainly refers to the messages log file in the /var/log directory.
As a preferred embodiment, as shown in fig. 2, the step of screening the suspicious log from the log file of the operating system specifically includes:
S111: screening a first suspicious log set from the log file by using preset keywords. Specifically, a suspicious log filtering module filters the log files at regular intervals and screens out logs that may indicate problems, together with the log context of the associated content, to generate the first suspicious log set.
S112: filtering the first suspicious log set to obtain a second suspicious log set. The filtering mainly removes common problems and negligible logs. A common-problem entry comprises log content, retrieval content, problem cause, and solution; a negligible-log entry comprises log content, retrieval content, and problem cause. The log content comprises a service name and the specific log text; the retrieval content comprises retrieval fields and the logical relationship between them. The common problems and negligible logs may be stored in a storage module and invoked when the first suspicious log set needs to be filtered.
Specifically, taking the Linux operating system as an example, the /var/log/messages file is filtered at regular intervals, and logs that may indicate problems, together with the log context of the associated content, are screened out using keywords such as error, failed, Kernel Panic, Oops, and Bug to generate the first suspicious log set. A storage unit for common problems and negligible logs is established in advance, and the messages log filtering function is connected to the common-problem and negligible-log functions. The negligible-log function searches the first suspicious log set and removes the negligible logs; the common-problem function searches the content of the first suspicious log set and identifies the common problems it contains.
While the common problems and negligible logs are being removed, a common file whose name ends with the host name and date is created in the Linux system, and the log occurrence time, host name, log content, retrieval content, problem cause, and solution corresponding to each common problem are recorded in it. The log occurrence time, host name, and log content are extracted from the first suspicious log set; the retrieval content, problem cause, and solution are extracted from the storage unit for common problems and negligible logs. The filtered common problems and negligible logs are removed from the first suspicious log set, and the remaining content is stored as the second suspicious log set.
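The two-stage screening in S111 and S112 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the `context` parameter, the case-insensitive substring matching, and the shape of the `ignorable` and `common` arguments (which stand in for the storage unit) are all assumptions.

```python
import re

# Keywords named in the text; case-insensitive matching is an assumption.
KEYWORDS = ["error", "failed", "Kernel Panic", "Oops", "Bug"]
KEYWORD_RE = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def screen_first_set(lines, context=1):
    """Return sorted indices of keyword matches plus `context` neighbouring lines."""
    keep = set()
    for i, line in enumerate(lines):
        if KEYWORD_RE.search(line):
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            keep.update(range(lo, hi))
    return sorted(keep)


def filter_second_set(lines, indices, ignorable, common):
    """Split the first set into the second suspicious set and common-problem records.

    `ignorable` is a list of substrings; `common` maps a substring to a
    (cause, solution) pair. Both are hypothetical stand-ins for the storage unit.
    """
    second, common_hits = [], []
    for i in indices:
        line = lines[i]
        if any(pattern in line for pattern in ignorable):
            continue  # negligible log: dropped without a record
        hit = next((pattern for pattern in common if pattern in line), None)
        if hit is not None:
            cause, solution = common[hit]
            common_hits.append({"log": line, "retrieval": hit,
                                "cause": cause, "solution": solution})
            continue  # common problem: recorded for the common file, then removed
        second.append(line)
    return second, common_hits
```

Plain substring matching is deliberately crude: for instance, the keyword "Bug" also matches "debug", so a real filter would likely need the retrieval fields and logical relationships that the text describes.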
After the step S110, the log analysis and integration method of the operating system shown in fig. 1 further includes the following steps:
S120: searching the suspicious logs for a fault problem of the operating system, and determining the problem type of the fault problem. Generally, after the common problems and negligible logs are removed, most of the fault problems remaining in the suspicious logs require key problem analysis logs for analysis, so the problem type of each fault problem needs to be determined; specifically, fault problem keywords corresponding to the fault types can be used for this purpose.
In a preferred embodiment, as shown in FIG. 3, the step of searching the suspicious log for the fault problem of the operating system includes the following steps:
S121: searching for the fault problem in the second suspicious log set by using the fault problem keywords corresponding to the fault types; the fault problem keywords include 'Kernel Panic', 'Hard Lockup', 'Softlockup', or 'Kernel Bug'.
S122: when a fault problem is found, classifying it into the corresponding problem type; the problem types include very serious kernel-level crash problems, firmware bugs, problems caused by specific hardware or hardware links, and errors or bugs related to applications, services, or drivers in the system.
When a fault problem is found, it is classified into the corresponding problem type. The second suspicious log set searched here is obtained as described above: negligible log contents are retrieved from the first suspicious log set and removed; common problems are retrieved from the first suspicious log set and removed; and the first suspicious log set with the negligible logs and common problems removed is integrated into the second suspicious log set.
S123: determining the problem analysis log required by the problem type.
Specifically, taking the Linux operating system as an example, a suspicious log analysis module can first be defined. The module analyzes the second suspicious log set and performs keyword searches on its content to determine which of the following types of problems is present and which log types need to be collected:
If the second suspicious log set contains the keyword 'Kernel Panic', 'Hard Lockup', 'Softlockup', or 'Kernel Bug', the problem is defined as a first-type problem. Such problems are typically very serious kernel-level crash problems, and analyzing them requires the sosreport log and the vmcore log.
After first-type problems are ruled out, if the second suspicious log set contains the keyword 'Firmware Bug', the problem is defined as a second-type problem. Such problems are typically firmware bugs, and analyzing them requires the BIOS and BMC logs and the system sosreport log.
After first-type and second-type problems are ruled out, if the second suspicious log set contains the keyword 'Hardware error', the problem is defined as a third-type problem. Third-type problems are generally caused by specific hardware or hardware links, and analyzing them requires the sosreport log and the related hardware logs. A hardware problem is pinned down by continuing to search downward in the log for the associated device_id and comparing it with the device IDs in the output of the lspci -vv command, to find all devices upstream and downstream of the corresponding hardware device.
After first-type, second-type, and third-type problems are ruled out, the remaining problems are errors or bugs related to applications, services, or drivers in the system and are defined as fourth-type problems. Analyzing them requires the sosreport log and the logs related to the application or driver. The specific software involved is determined by retrieving the third part of the log content, namely the service or process name.
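The priority-ordered classification above can be sketched as a small lookup. The keywords and required logs are taken from the text; the function name, the case-insensitive matching, the log-type labels, and returning `None` for an empty set are assumptions made for illustration.

```python
# (type id, keywords, required logs), checked in the priority order given in the text.
PROBLEM_TYPES = [
    (1, ["Kernel Panic", "Hard Lockup", "Softlockup", "Kernel Bug"],
     ["sosreport", "vmcore"]),
    (2, ["Firmware Bug"], ["BIOS", "BMC", "sosreport"]),
    (3, ["Hardware error"], ["sosreport", "hardware"]),
]
# Anything left over is treated as an application, service, or driver problem.
FALLBACK = (4, ["sosreport", "application_or_driver"])


def classify(second_set):
    """Return (problem type, required logs) for the second suspicious log set."""
    if not second_set:
        return None  # nothing suspicious remains, so nothing to classify
    text = "\n".join(second_set).lower()
    for type_id, keywords, logs in PROBLEM_TYPES:
        if any(keyword.lower() in text for keyword in keywords):
            return type_id, logs
    return FALLBACK
```

Because the list is scanned in order, a set that contains both a kernel panic and a firmware bug is reported as a first-type problem, matching the "after ruling out" wording of the text.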
After determining the problem type of the failure problem, the log analysis and integration method of the operating system shown in fig. 1 further includes the following steps:
S130: selecting a log collection tool corresponding to the configuration information of the operating system contained in the suspicious log. Specifically, in the Linux operating system, a log collection preparation module analyzes the hardware, services, and applications recorded as started and loaded in the dmesg log and determines the configuration information of the server system's software and hardware.
As a preferred embodiment, as shown in fig. 4, the step of selecting the log collection tool corresponding to the configuration information specifically includes:
S131: analyzing the content of the log file to obtain the configuration information of the operating system. The log file here mainly refers to the dmesg log file. A log collection preparation module is designed to analyze the content of the dmesg log file. The relevant configuration information includes: the RAID cards, network cards, graphics cards, and HBA cards of type XX from manufacturer XX with which the server is configured; their respective drivers XX; and the type, manufacturer, and model of the configured hard disks, the services started, the software installed, and so on. According to the software and hardware configuration information of the server, the log collection tools specific to the corresponding hardware manufacturers are stored in the /home directory and installed. These manufacturer tools include, but are not limited to: nvidia-smi for NVIDIA GPUs; the RAID card log collection tool lsigetlinux for Broadcom/LSI/Avago; the bnxtmt tool for Broadcom network cards; the mft tool for Mellanox; the lanconf tool for Intel network cards; the SeaChest tool for Seagate hard disks; and the WDDT tool for Western Digital hard disks.
S132: selecting a log collection tool corresponding to the configuration information. Applications installed in the Linux system generally have their own log functions, and the log collection preparation module needs to retrieve and find the log file locations of all applications through the log collection tool.
S133: installing the log collection tool to a predetermined location of the operating system. Installing the log collection tool to a predetermined location enables the operating system to be triggered to generate specific logs.
Specifically, in the Linux operating system, a dedicated log collection preparation module analyzes the hardware, services, and applications recorded as started and loaded in the dmesg log (which usually records the software and hardware configuration of the operating system), determines the configuration information of the operating system's software and hardware, and, according to that information, stores the log collection tools specific to the hardware manufacturers in the /home directory and installs them. For operating systems without built-in sosreport collection, such as Debian or Ubuntu, an sos installation package needs to be prepared and installed in advance. kdump is configured on all operating systems according to its configuration requirements, so that kdump is triggered to generate the vmcore log when a first-type problem occurs.
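As a rough sketch of the tool selection step, the mapping below pairs a marker found in dmesg output with one of the vendor tools listed in the text. The tool names come from the text, but the marker substrings (`nvidia`, `megaraid`, `bnxt`, `mlx`, `seagate`) and the function name are illustrative assumptions, not documented driver output.

```python
# Marker substring (assumed) -> vendor log collection tool (named in the text).
TOOL_BY_MARKER = {
    "nvidia": "nvidia-smi",
    "megaraid": "lsigetlinux",
    "bnxt": "bnxtmt",
    "mlx": "mft",
    "seagate": "SeaChest",
}


def select_tools(dmesg_lines):
    """Return the vendor tools suggested by the dmesg content, in first-seen order."""
    found = []
    for line in dmesg_lines:
        lower = line.lower()
        for marker, tool in TOOL_BY_MARKER.items():
            if marker in lower and tool not in found:
                found.append(tool)
    return found
```

A real preparation module would also need to record driver names, disk models, and application log paths, which this sketch leaves out.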
After selecting the corresponding log collection tool, the log analysis and integration method of the operating system shown in fig. 1 further includes the following steps:
s140: a problem analysis log corresponding to the problem type is collected using a log collection tool. Specifically, the log collection module runs a related log collection tool command according to the problem type, the related equipment, the application and the log type to be collected, which are confirmed by the suspicious log analysis module, so as to collect the log.
For example: in the Lunix operating system, for the first kind of problems, a sosreiport log and a vmcore log need to be collected, and then a root user executes a sosreiport command and generates a sosreiport log compression file identified by the log and a host name under a/var/tmp path of the system; when a root user executes echo c >/proc/sysrq-trigger, two files at the beginning of vmcore are generated under a/var/crash directory;
for the second kind of problems, collecting BIOS and BMC logs and a system sosreport log, wherein the collection of the sosreport log is the same as that of the first kind of problems, and the BIOS and the BMC logs both have independent log collection tools;
for the third kind of problems, collecting sosreport logs and related hardware logs is needed, wherein the collection of the sosreport logs is like the collection of the first kind of problems, the collection of the related hardware logs is carried out, and different hardware log collection programs provided by a log collection preparation module are operated according to different hardware manufacturers and different hardware types;
for the fourth kind of problem, the collection of the sosreport log and the log related to the application or the drive is needed, wherein the collection of the sosreport log is like the first kind of problem, the application log is copied under the application log file path provided by the log collection preparation module, the driven log corresponds to the hardware log of the drive related hardware equipment, and the collection method is the same as the third kind of problem.
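The per-type collection rules can be expressed as a plan of steps that a collection module would then execute. This sketch only builds the plan and runs nothing; the `("run", ...)` and `("copy", ...)` step format, the placeholder names for the BIOS and BMC tools, and the function signature are assumptions.

```python
def collection_plan(problem_type, hardware_tools=(), app_log_paths=()):
    """Build an ordered list of (action, target) steps for one problem type."""
    plan = [("run", "sosreport")]  # every problem type needs the sosreport log
    if problem_type == 1:
        plan.append(("copy", "/var/crash"))  # vmcore files written by kdump
    elif problem_type == 2:
        # The text says BIOS and BMC each have their own tool but does not name them.
        plan.append(("run", "<BIOS log tool>"))
        plan.append(("run", "<BMC log tool>"))
    elif problem_type == 3:
        plan.extend(("run", tool) for tool in hardware_tools)
    elif problem_type == 4:
        plan.extend(("copy", path) for path in app_log_paths)
        plan.extend(("run", tool) for tool in hardware_tools)  # driver-related hardware
    else:
        raise ValueError(f"unknown problem type: {problem_type}")
    return plan
```

Separating the plan from its execution keeps the decision logic testable without root privileges or the vendor tools being installed.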
S150: integrating and storing the problem analysis logs.
As a preferred embodiment, as shown in fig. 5, the step of integrating and storing the problem analysis log includes:
S151: identifying and removing duplicate content in all collected problem analysis logs.
S152: packaging and compressing all the problem analysis logs with the duplicate content removed to obtain a compressed problem analysis log.
S153: storing the compressed problem analysis log.
Specifically, a log integration module is designed. For the logs collected for third-type and fourth-type problems, the module identifies and integrates redundant log content and removes content duplicated between the sosreport log and the hardware logs. It then packages and compresses all the logs collected for each problem together with the common file generated by the suspicious log filtering module; the compressed problem analysis log file is identified by the host name and the collection time. The compressed log file generated by the log integration module is then stored, either on a separate local log disk other than the system disk or on an NFS server over the network. To reduce storage pressure, the invention uses the incremental storage approach common in the industry: the log content stored the first time is retained, and each subsequent storage transmits and stores only the incremental part of the log.
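A minimal sketch of the deduplicate-then-compress step is shown below, using only the Python standard library. It assumes duplicates are whole files with identical content (the text does not say at what granularity duplicates are detected), and the function name, archive naming pattern, and `now` parameter are hypothetical.

```python
import hashlib
import tarfile
import time
from pathlib import Path


def integrate_logs(files, out_dir, hostname, now=None):
    """Pack `files` into <hostname>_<timestamp>.tar.gz, skipping duplicate content."""
    stamp = time.strftime("%Y%m%d%H%M%S", time.localtime(now))
    archive = Path(out_dir) / f"{hostname}_{stamp}.tar.gz"
    seen = set()
    with tarfile.open(archive, "w:gz") as tar:
        for path in map(Path, files):
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            if digest in seen:
                continue  # identical content has already been packed
            seen.add(digest)
            tar.add(path, arcname=path.name)
    return archive
```

The incremental storage described in the text is not covered here; one option would be to persist the set of digests between runs and skip files already stored.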
In summary, the log analysis and integration method for an operating system provided by the application screens suspicious logs from the log files of the operating system, searches the suspicious logs for fault problems of the operating system and determines their problem types, selects a log collection tool corresponding to the configuration information of the operating system contained in the suspicious logs, collects the problem analysis logs corresponding to the problem type by using the log collection tool, and then integrates and stores the problem analysis logs. Detailed and comprehensive problem analysis logs can thus be provided to the relevant operators, helping them locate specific fault problems and solve them quickly. The technical scheme automatically detects error logs or faults in the log files while the server is running, performs a preliminary diagnosis after filtering out common problems and negligible logs, starts the corresponding log collection tool according to the nature of the problem, and collects, integrates, and stores the logs for the specific problem. The administrator thereby receives detailed and comprehensive log output and can locate and solve problems more quickly, which addresses the shortcomings of existing log problem handling: it is time-consuming and labor-intensive, passive, and slow to resolve problems.
To implement the foregoing method, the following embodiments of the present application further provide a log analysis and integration system of an operating system. The functions of the foregoing method can be implemented by this system; since the specific operation steps have already been described for the method, they are not repeated here.
Referring to FIG. 6, FIG. 6 is a schematic structural diagram of a log analysis and integration system of an operating system according to an embodiment of the present invention. As shown in FIG. 6, the log analysis and integration system of the operating system includes:
a log screening module 110, configured to screen out suspicious logs from log files of the operating system;
a problem searching module 120, configured to search the suspicious logs for a fault problem of the operating system and determine the problem type of the fault problem;
a tool selection module 130, configured to select, according to configuration information of the operating system contained in the suspicious logs, a log collection tool corresponding to the configuration information;
a log collection module 140, configured to collect, using the log collection tool, a problem analysis log corresponding to the problem type;
a log integration module 150, configured to integrate the problem analysis logs; and
a log storage module 160, configured to store the problem analysis logs.
In the log analysis and integration system of the operating system, the log screening module 110 screens out suspicious logs from the log files of the operating system; the problem searching module 120 searches the suspicious logs for fault problems of the operating system and determines the problem type of each fault problem; the tool selection module 130 selects the log collection tool corresponding to the configuration information of the operating system contained in the suspicious logs; the log collection module 140 uses the log collection tool to collect the problem analysis logs corresponding to the problem type; and the log integration module 150 and the log storage module 160 integrate and store the problem analysis logs. As with the method, error logs or faults in the log file are detected automatically while the server is running, a preliminary diagnosis is made after common problems and ignorable logs are filtered out, and the logs for the specific problem are collected, integrated and stored. The administrator therefore receives a detailed and comprehensive log output, and the drawbacks of the conventional way of handling log problems, in which processing is time-consuming and labor-intensive, passive, and slow to respond, are addressed.
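The way the six modules hand data to one another can be sketched as a small pipeline. This is an illustrative outline only: the class name `LogPipeline`, the field names and the callable signatures are assumptions made for the example, and the reference numerals in the comments simply map each field to the module described above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LogPipeline:
    """Hypothetical wiring of modules 110 to 160 as plain callables."""

    screen: Callable[[list[str]], list[str]]           # log screening module 110
    find_problem: Callable[[list[str]], Optional[str]]  # problem searching module 120
    select_tool: Callable[[list[str]], str]             # tool selection module 130
    collect: Callable[[str, str], list[str]]            # log collection module 140
    integrate: Callable[[list[str]], str]               # log integration module 150
    store: Callable[[str], None]                        # log storage module 160

    def run(self, log_lines: list[str]) -> Optional[str]:
        """Run one pass and return the detected problem type, if any."""
        suspicious = self.screen(log_lines)
        problem_type = self.find_problem(suspicious)
        if problem_type is None:
            return None  # nothing to collect or store on this pass
        tool = self.select_tool(suspicious)
        analysis_logs = self.collect(tool, problem_type)
        self.store(self.integrate(analysis_logs))
        return problem_type
```

Passing the modules in as callables keeps each stage replaceable, which matches the description's point that the collection tool is chosen per configuration rather than fixed in advance.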
As a preferred embodiment, as shown in FIG. 7, the log screening module 110 specifically includes:
a first log screening submodule 111, configured to screen out a first suspicious log set from the log file by using preset keywords; and
a second log screening submodule 112, configured to filter the first suspicious log set to obtain a second suspicious log set.
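The two-stage screening performed by submodules 111 and 112 can be sketched as below. The keyword list and both pattern lists are hypothetical placeholders chosen for the example; the description does not specify which keywords, ignorable logs or common problems are actually configured.

```python
import re

# Assumed preset keywords for the first screening pass.
SUSPICIOUS_KEYWORDS = ("error", "fail", "warning", "panic")

# Hypothetical examples of ignorable logs and already-known common problems.
IGNORABLE_PATTERNS = (
    re.compile(r"error retrieving information about user"),
)
COMMON_PROBLEM_PATTERNS = (
    re.compile(r"Failed to start kdump"),
)


def first_screen(lines: list[str]) -> list[str]:
    """Keep lines containing any preset keyword (first suspicious log set)."""
    return [
        line for line in lines
        if any(keyword in line.lower() for keyword in SUSPICIOUS_KEYWORDS)
    ]


def second_screen(first_set: list[str]) -> list[str]:
    """Remove ignorable logs and common problems (second suspicious log set)."""
    excluded = IGNORABLE_PATTERNS + COMMON_PROBLEM_PATTERNS
    return [
        line for line in first_set
        if not any(pattern.search(line) for pattern in excluded)
    ]
```

A broad keyword pass followed by an exclusion pass keeps the first stage cheap and puts the site-specific knowledge in the exclusion lists, which can be extended without touching the keyword set.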
As a preferred embodiment, as shown in FIG. 8, the problem searching module 120 specifically includes:
a fault problem searching submodule 121, configured to search the second suspicious log set for a fault problem by using the fault problem keywords corresponding to each fault type;
a problem type division submodule 122, configured to assign the fault problem to the corresponding problem type when the fault problem is found; and
a log determining submodule 123, configured to determine the problem analysis log corresponding to the problem type.
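Keyword-based classification of the second suspicious log set, followed by a lookup of the logs each problem type needs, might look like the sketch below. The problem types, keywords and the mapping to required logs are invented for illustration; only the sosreport log and the hardware log are named in the description.

```python
# Hypothetical fault keywords per problem type.
PROBLEM_KEYWORDS = {
    "memory": ("out of memory", "oom-killer"),
    "disk": ("i/o error", "ext4-fs error"),
    "network": ("link is down", "tx timeout"),
}

# Hypothetical mapping from problem type to the analysis logs it needs.
ANALYSIS_LOGS = {
    "memory": ("sosreport", "hardware log"),
    "disk": ("sosreport", "hardware log"),
    "network": ("sosreport",),
}


def find_problems(second_set: list[str]) -> dict[str, list[str]]:
    """Group suspicious lines by the problem type whose keyword they contain."""
    found: dict[str, list[str]] = {}
    for line in second_set:
        lowered = line.lower()
        for problem_type, keywords in PROBLEM_KEYWORDS.items():
            if any(keyword in lowered for keyword in keywords):
                found.setdefault(problem_type, []).append(line)
    return found


def required_logs(problem_types: list[str]) -> list[str]:
    """List the analysis logs needed for the given problem types, no repeats."""
    needed: list[str] = []
    for problem_type in problem_types:
        for name in ANALYSIS_LOGS[problem_type]:
            if name not in needed:
                needed.append(name)
    return needed
```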
As a preferred embodiment, as shown in FIG. 9, the tool selection module 130 specifically includes:
a content analysis submodule 131, configured to analyze the content of the log file and obtain the configuration information of the operating system;
a tool selection submodule 132, configured to select the log collection tool corresponding to the configuration information; and
a tool installation submodule 133, configured to install the log collection tool to a predetermined location of the operating system.
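The three submodules of the tool selection module can be illustrated with the sketch below. Everything specific in it is an assumption: the configuration keys, the driver strings used to detect hardware, the tool names other than sosreport, and the idea that "installing" means copying a tool file into a target directory.

```python
import shutil
from pathlib import Path


def parse_config(log_lines: list[str]) -> dict[str, bool]:
    """Infer configuration from log content (hypothetical detection rules)."""
    text = "\n".join(log_lines).lower()
    return {
        "raid_controller": "megaraid" in text,
        "gpu": "nvidia" in text,
    }


def select_tools(config: dict[str, bool]) -> list[str]:
    """Pick collection tools for the configuration.

    Only sosreport is named in the description; the other tool names are
    placeholders invented for this example.
    """
    tools = ["sosreport"]
    if config.get("raid_controller"):
        tools.append("raid-log-collector")
    if config.get("gpu"):
        tools.append("gpu-log-collector")
    return tools


def install_tool(package: Path, target_dir: Path) -> Path:
    """Copy a tool into the predetermined location and return its new path."""
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(package, target_dir / package.name))
```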
According to the technical scheme provided by the invention, while the server is running, error logs or faults in log files (such as the messages log) can be detected automatically at regular intervals. After common problems and ignorable logs are filtered out, the remaining content of the log files is automatically classified into problem types according to problem type keywords, the software and hardware involved in the problem are identified, and the collection, redundancy filtering, packaging and storage of the corresponding logs are started according to the problem type and the software and hardware involved. With this scheme, an administrator does not need to inspect the system logs manually to check how the system is running. During daily operation, or before serious problems such as downtime, restarts or slow operation occur on the server, a detailed and comprehensive log containing the information relevant to locating the problem is generated automatically from the system logs and the software and hardware configuration, so that the problem can be located and solved more quickly.
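Periodic detection implies that each pass should examine only what has been written since the previous pass. One simple way to do this, sketched below as an assumption rather than as the method the invention prescribes, is to remember a byte offset into the log file between runs; the function name `read_new_lines` is hypothetical.

```python
def read_new_lines(path: str, offset: int) -> tuple[list[str], int]:
    """Return complete lines written after `offset`, plus the new offset."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)          # jump to the end to learn the file size
        size = handle.tell()
        if size < offset:          # file was rotated or truncated
            offset = 0
        handle.seek(offset)
        data = handle.read()
    # Consume only complete lines; a partial last line waits for the next pass.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

A scheduler such as cron or a timer loop would call this function at each interval and feed the returned lines into the screening stage, persisting the offset between calls.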
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering. These words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A log analysis and integration method of an operating system is characterized by comprising the following steps:
screening out suspicious logs from log files of the operating system;
searching the fault problem of the operating system from the suspicious log, and determining the problem type of the fault problem;
selecting a log collection tool corresponding to the configuration information according to the configuration information of the operating system contained in the suspicious log;
collecting a problem analysis log corresponding to the problem type using the log collection tool;
integrating and storing the problem analysis log.
2. The method of claim 1, wherein the step of screening suspicious logs from log files of the operating system comprises:
screening a first suspicious log set from the log file by using a preset keyword;
and filtering the first suspicious log set to obtain a second suspicious log set.
3. The method of claim 2, wherein the step of filtering the first suspicious log set to obtain a second suspicious log set comprises:
retrieving the contents of the negligible logs in the first suspicious log set, and removing the negligible logs from the first suspicious log set;
retrieving common problems in the first suspicious log set, and removing the common problems from the first suspicious log set;
and integrating the first suspicious log set with the negligible logs and the common problems removed into a second suspicious log set.
4. The method of claim 2 or 3, wherein the step of searching the suspicious log for the fault problem of the operating system comprises:
searching the fault problem from the second suspicious log set by using a fault problem keyword corresponding to the fault type;
when the fault problem is found, dividing the fault problem into corresponding problem types;
and determining the problem analysis log required for the problem type.
5. The method of claim 1, wherein the step of selecting the log collection tool corresponding to the configuration information comprises:
analyzing the content of the log file to acquire the configuration information of the operating system;
selecting a log collection tool corresponding to the configuration information;
installing the log collection tool to a predetermined location of the operating system.
6. The method of claim 1, wherein the step of integrating and storing the problem analysis log comprises:
identifying and removing repeated contents in all collected problem analysis logs;
packaging and compressing all the problem analysis logs with the repeated contents removed to obtain compressed problem analysis logs;
and storing the compressed problem analysis log.
7. A log analysis integration system for an operating system, comprising:
the log screening module is used for screening out suspicious logs from log files of the operating system;
the problem searching module is used for searching the fault problem of the operating system from the suspicious log and determining the problem type of the fault problem;
the tool selection module is used for selecting a log collection tool corresponding to the configuration information according to the configuration information of the operating system contained in the suspicious log;
a log collection module for collecting a problem analysis log corresponding to the problem type using the log collection tool;
the log integration module is used for integrating the problem analysis logs;
and the log storage module is used for storing the problem analysis log.
8. The system of claim 7, wherein the log screening module comprises:
the first log screening submodule is used for screening the log file by using a preset keyword to obtain a first suspicious log set;
and the second log screening submodule is used for filtering the first suspicious log set to obtain a second suspicious log set.
9. The system of claim 7 or 8, wherein the problem-finding module comprises:
the fault problem searching submodule searches the fault problem from the second suspicious log set by using a fault problem keyword corresponding to the fault type;
the problem type dividing submodule is used for dividing the fault problem into corresponding problem types when the fault problem is found out;
and the log determining submodule is used for determining a problem analysis log corresponding to the problem type.
10. The system of claim 7, wherein the tool selection module comprises:
the content analysis submodule is used for analyzing the content of the log file and acquiring the configuration information of the operating system;
the tool selection submodule is used for selecting a log collection tool corresponding to the configuration information;
a tool installation submodule for installing the log collection tool to a predetermined location of the operating system.
CN202110989919.8A 2021-08-26 2021-08-26 Log analysis integration method and system of operating system Active CN113849329B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110989919.8A CN113849329B (en) 2021-08-26 2021-08-26 Log analysis integration method and system of operating system

Publications (2)

Publication Number Publication Date
CN113849329A true CN113849329A (en) 2021-12-28
CN113849329B CN113849329B (en) 2023-07-14

Family

ID=78976215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110989919.8A Active CN113849329B (en) 2021-08-26 2021-08-26 Log analysis integration method and system of operating system

Country Status (1)

Country Link
CN (1) CN113849329B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007096959A1 (en) * 2006-02-22 2007-08-30 Fujitsu Limited Event log managing program, event log managing device, and event log managing method
CN104636242A (en) * 2015-02-06 2015-05-20 浪潮电子信息产业股份有限公司 Method for automatically deleting repeated content in system logs on basis of Linux operating system
US20190079818A1 (en) * 2017-09-08 2019-03-14 Oracle International Corporation Techniques for managing and analyzing log data

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116841836A (en) * 2023-09-01 2023-10-03 四川华鲲振宇智能科技有限责任公司 One-key log collecting tool
CN116841836B (en) * 2023-09-01 2023-11-07 四川华鲲振宇智能科技有限责任公司 One-key log collecting tool

Also Published As

Publication number Publication date
CN113849329B (en) 2023-07-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant