CN117033056A - Method, system, storage medium and equipment for diagnosing faults of server system

Info

Publication number
CN117033056A
CN117033056A CN202311054955.0A CN202311054955A CN117033056A CN 117033056 A CN117033056 A CN 117033056A CN 202311054955 A CN202311054955 A CN 202311054955A CN 117033056 A CN117033056 A CN 117033056A
Authority
CN
China
Prior art keywords
picture
kvm
log information
system fault
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311054955.0A
Other languages
Chinese (zh)
Inventor
郑国伟
王兴隆
叶笑夕
谭艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202311054955.0A
Publication of CN117033056A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation, the processing taking place on a specific hardware platform or in a specific software environment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766 Error or fault reporting or storing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/321 Display for diagnostics, e.g. diagnostic result display, self-test user interface
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/328 Computer systems status display
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Human Computer Interaction (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application provides a server system fault diagnosis method, system, storage medium and device. The method comprises: monitoring the system of a server for faults through a KVM; in response to detecting a system fault, capturing the currently displayed system fault picture through the KVM and acquiring a preset number of the latest log entries, wherein the log records the running state of the system in real time; and merging the system fault picture and the latest log information into a single picture and capturing the merged picture through the KVM, so that the system fault can be diagnosed based on the merged picture. The application can improve the efficiency of server fault location and the accuracy of server fault diagnosis.

Description

Method, system, storage medium and equipment for diagnosing faults of server system
Technical Field
The present application relates to the field of server technologies, and in particular, to a method, a system, a storage medium, and an apparatus for diagnosing a server system fault.
Background
In practice, a server room is noisy and often remotely located, which makes on-site supervision and maintenance by an administrator inconvenient. A server room also holds a large number of servers, and operating different servers requires connecting a mouse, keyboard and screen to each of them in turn, which is cumbersome. The advent of KVM (Keyboard, Video, Mouse) technology enables administrators to manage and operate the servers in a server room from a remote PC (personal computer) terminal.
To monitor a server around the clock, an administrator can, during off hours, set the KVM to capture the system's screen content at a fixed time interval, name the captured pictures in chronological order and store them in a specific folder, so that the administrator can review the captured pictures afterwards to learn how the server ran during that period. However, accurate fault diagnosis is difficult from screenshots alone; they must be assessed together with the SEL (system event log) entries of the corresponding moment. The SEL logs are stored in a different location from the screenshots and are very numerous, so the entries for the moment of failure are difficult to locate quickly, and fault diagnosis is inefficient.
In the prior art, the screenshots and the SEL logs are stored in different folders in the BMC (Baseboard Management Controller). When the system fails, maintenance personnel separately check the screenshot taken at the time of the failure and the recorded SEL logs to locate the problem. Because the large number of SEL log entries is difficult to match to the screenshot of the same moment, fault location is inefficient. In addition, different faults may produce the same system display, while the SEL log is refreshed in real time at a high update rate; if the SEL log is not matched in time to the system display at that moment, fault analysis is likely to go wrong and locating the problem becomes much more difficult.
Disclosure of Invention
In view of the above, the present application aims to provide a server system fault diagnosis method, system, storage medium and device, to solve the problem in the prior art that, because the SEL log is very large and updated in real time, the system screenshot taken at the moment a fault occurs is difficult to match to the SEL log recorded at the same moment, so that fault location is inefficient.
Based on the above object, the present application provides a server system fault diagnosis method, comprising the steps of:
performing fault monitoring on a system of the server through KVM;
in response to the detection of the system fault, intercepting a currently displayed system fault picture through KVM, and acquiring the latest log information of a preset number, wherein the log is used for recording the running state of the system in real time;
and merging the system fault picture and the latest log information into the same picture, and intercepting the merged picture through the KVM so as to diagnose the system fault based on the merged picture.
In some embodiments, intercepting, by the KVM, a currently displayed system failure picture, and obtaining a preset number of latest log information includes:
intercepting a currently displayed system fault picture through KVM, and storing the intercepted system fault picture into a first cache;
and acquiring a preset number of latest log information through the KVM, and storing the latest log information into the second cache.
In some embodiments, merging the system failure picture and the latest log information into the same picture, and intercepting the merged picture by KVM includes:
obtaining a system fault picture from a first cache, scaling the system fault picture in an equal proportion by using an image scaling algorithm to obtain a scaled picture, and storing the scaled picture in a part of space of a third cache, wherein the size of the space of the third cache is the same as that of the space occupied by the picture;
acquiring the latest log information from the second cache, and storing the acquired latest log information in the rest space of the third cache;
and displaying the content in the third cache to the KVM, and intercepting the current display content in the third cache through the KVM.
In some embodiments, the image scaling algorithm is a quadratic linear interpolation image scaling algorithm.
In some embodiments, diagnosing a system failure based on the merged pictures includes:
and determining the log information corresponding to the system fault picture in the latest log information of the preset number through the combined picture, and diagnosing the system fault based on the corresponding log information and the system fault picture.
In some embodiments, obtaining a preset number of up-to-date logs includes:
acquiring the preset number of latest logs by calling a D-Bus interface of the BMC of the server.
In some embodiments, the method further comprises:
in response to a system fault, generating, by the CPLD of the server, corresponding log information according to the bit state of each running module.
In another aspect of the present application, there is also provided a server system fault diagnosis system, including:
a monitoring module configured to monitor the system of the server for faults through a KVM;
an acquisition module configured to, in response to detecting a system fault, capture the currently displayed system fault picture through the KVM and acquire a preset number of the latest log entries, wherein the log records the running state of the system in real time; and
a merging module configured to merge the system fault picture and the latest log information into a single picture and capture the merged picture through the KVM, so that the system fault can be diagnosed based on the merged picture.
In yet another aspect of the present application, there is also provided a computer readable storage medium storing computer program instructions which, when executed by a processor, implement the above-described method.
In yet another aspect of the present application, there is also provided a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, performs the above method.
The application has at least the following beneficial technical effects:
According to the server system fault diagnosis method, the screen capture function of the KVM is used: when a system fault is detected, the currently displayed system fault picture is captured through the KVM and a preset number of the latest log entries are acquired; the system fault picture and the latest log information are merged into a single picture, and the merged picture is captured through the KVM so that the system fault can be diagnosed based on it. The system fault can thus be matched to the logs recorded when it occurred, so that the specific symptoms of the fault are known and different faults that produce the same system display can be told apart. This improves both the efficiency of locating server faults and the accuracy of diagnosing them. It also accumulates data on the system display under different faults, which facilitates statistics on the various fault conditions of the server.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are necessary for the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application and that other embodiments may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a server system fault diagnosis method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of the processing of a system fault picture and the latest log information according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram for implementing a server system fault diagnosis method according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a server system fault diagnosis system according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a computer readable storage medium implementing a server system fault diagnosis method according to an embodiment of the present application;
Fig. 6 is a schematic hardware structure diagram of a computer device for performing a server system fault diagnosis method according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the following embodiments of the present application will be described in further detail with reference to the accompanying drawings.
It should be noted that, in the embodiments of the present application, the expressions "first" and "second" are used to distinguish two different entities or parameters that share the same name; "first" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present application. Furthermore, the terms "comprise" and "have", and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, article or device that comprises a list of steps or units may also include other steps or units.
Based on the above object, in a first aspect of the embodiments of the present application, an embodiment of a server system fault diagnosis method is provided. Fig. 1 is a schematic diagram of an embodiment of a server system fault diagnosis method provided by the present application. As shown in fig. 1, the embodiment of the present application includes the following steps:
step S10, fault monitoring is carried out on a system of the server through KVM;
step S20, in response to the detection of the system fault, intercepting a currently displayed system fault picture through KVM, and acquiring the latest log information of a preset number, wherein the log is used for recording the running state of the system in real time;
and step S30, merging the system fault picture and the latest log information into the same picture, and intercepting the merged picture through the KVM so as to diagnose the system fault based on the merged picture.
According to the server system fault diagnosis method, the screen capture function of the KVM is used: when a system fault is detected, the currently displayed system fault picture is captured through the KVM and a preset number of the latest log entries are acquired; the system fault picture and the latest log information are merged into a single picture, and the merged picture is captured through the KVM so that the system fault can be diagnosed based on it. The system fault can thus be matched to the logs recorded when it occurred, so that the specific symptoms of the fault are known and different faults that produce the same system display can be told apart. This improves both the efficiency of locating server faults and the accuracy of diagnosing them. It also accumulates data on the system display under different faults, which facilitates statistics on the various fault conditions of the server.
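As a rough illustration of how steps S10 to S30 fit together, the following Python sketch runs one diagnosis cycle. It is only a sketch under stated assumptions: the three callables (`fault_detected`, `capture_frame`, `read_latest_logs`) and the `MergedCapture` type are hypothetical stand-ins for the KVM and BMC facilities, which the application does not specify at code level.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class MergedCapture:
    """Result of one cycle: the fault frame plus the logs shown with it."""
    frame: bytes
    logs: List[str]


def diagnose_once(
    fault_detected: Callable[[], bool],
    capture_frame: Callable[[], bytes],
    read_latest_logs: Callable[[int], List[str]],
    log_count: int = 5,
) -> Optional[MergedCapture]:
    """Run steps S10-S30 once; return None when no fault is present.

    All three callables are hypothetical hooks supplied by the caller.
    """
    # S10: fault monitoring.
    if not fault_detected():
        return None
    # S20: grab the displayed fault frame and the latest `log_count` entries.
    frame = capture_frame()
    logs = read_latest_logs(log_count)
    # S30: keep both together so they can be rendered and captured as one picture.
    return MergedCapture(frame=frame, logs=logs)
```

Passing the hooks in as parameters keeps the sketch independent of any particular KVM or BMC interface.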
In some embodiments, intercepting, by the KVM, a currently displayed system failure picture, and obtaining a preset number of latest log information includes: intercepting a currently displayed system fault picture through KVM, and storing the intercepted system fault picture into a first cache; and acquiring a preset number of latest log information through the KVM, and storing the latest log information into the second cache.
Fig. 2 is a schematic diagram of the processing of the system fault picture and the latest log information according to an embodiment of the present application. As shown in fig. 2, the captured system fault picture is stored in cache 1 (i.e. the first cache), and the acquired preset number of latest log entries (i.e. N SEL logs) are stored in cache 2 (i.e. the second cache).
When the server fails, the KVM (Keyboard, Video, Mouse; a KVM accesses and controls a computer through a direct connection to its keyboard, video and mouse ports) automatically saves the N most recent SEL (system event log) entries at that moment, where N can be set according to the actual situation.
In this embodiment, by storing the intercepted system fault picture and the latest log information in different caches, it is convenient to obtain the system fault picture and the latest log information later without confusion.
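The two-cache arrangement can be sketched in Python as follows. This is a minimal sketch, not the implementation of the application: the class name `FaultCaches` and its methods are hypothetical, and a bounded `collections.deque` is used here as one convenient way to retain only the N newest log entries.

```python
from collections import deque
from typing import Deque, List, Optional


class FaultCaches:
    """Hypothetical holder for the first cache (picture) and second cache (logs)."""

    def __init__(self, log_count: int) -> None:
        if log_count < 1:
            raise ValueError("log_count must be at least 1")
        # First cache: the captured system fault picture.
        self.first_cache: Optional[bytes] = None
        # Second cache: at most `log_count` latest log entries.
        self.second_cache: Deque[str] = deque(maxlen=log_count)

    def store_frame(self, frame: bytes) -> None:
        self.first_cache = frame

    def store_logs(self, logs: List[str]) -> None:
        # A deque with maxlen drops the oldest entries, leaving the newest N.
        self.second_cache.clear()
        self.second_cache.extend(logs)
```

Keeping the two caches as separate attributes mirrors the point made above: the picture and the logs can be retrieved later without being confused with each other.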
In some embodiments, merging the system failure picture and the latest log information into the same picture, and intercepting the merged picture by KVM includes: obtaining a system fault picture from a first cache, scaling the system fault picture in an equal proportion by using an image scaling algorithm to obtain a scaled picture, and storing the scaled picture in a part of space of a third cache, wherein the size of the space of the third cache is the same as that of the space occupied by the picture; acquiring the latest log information from the second cache, and storing the acquired latest log information in the rest space of the third cache; and displaying the content in the third cache to the KVM, and intercepting the current display content in the third cache through the KVM.
As shown in fig. 2, the system fault picture in cache 1 is scaled proportionally by an image scaling algorithm and the scaled picture is stored in cache 3 (i.e. the third cache); the N SEL logs are taken out of cache 2 and stored in cache 3; the combined content of cache 3 is then displayed to the KVM to obtain the merged picture, which is captured through the screen capture function of the KVM.
In fig. 2, the ratio between the scaled picture and the latest log information in cache 3 is set to 2:1, that is, the scaled picture occupies 2/3 of cache 3 and the latest log information occupies 1/3; another suitable ratio may be set according to the actual situation. For example, if the preset number (N) is large, the share of cache 3 given to the latest log information may be increased. The ratio needs to be determined in advance so that the scaling factor is known when the image scaling algorithm is applied.
In this embodiment, the system fault picture is scaled and displayed together with the latest log information in a single picture, so that the system's behaviour at the time of the fault is known and different faults that produce the same behaviour can be told apart; combining the system fault picture with the log information can therefore greatly improve the accuracy and efficiency of server fault diagnosis.
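The arithmetic behind the 2:1 split can be sketched as below. Two assumptions are made for illustration only: the display is split by rows (picture on top, logs underneath), and the picture keeps its aspect ratio while fitting inside its region. The function names `split_rows` and `fit_size` are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def split_rows(height: int, picture_parts: int = 2, log_parts: int = 1) -> Tuple[int, int]:
    """Split `height` display rows between picture and logs in the given ratio."""
    total = picture_parts + log_parts
    picture_rows = height * picture_parts // total
    return picture_rows, height - picture_rows


def fit_size(src_w: int, src_h: int, max_w: int, max_h: int) -> Tuple[int, int]:
    """Largest size that fits in max_w x max_h while keeping the aspect ratio."""
    # Exact rational arithmetic avoids float rounding in the scale factor.
    scale = min(Fraction(max_w, src_w), Fraction(max_h, src_h))
    return max(1, int(src_w * scale)), max(1, int(src_h * scale))
```

For a 1024x768 display, `split_rows(768)` gives 512 rows to the picture and 256 to the logs, and `fit_size(1024, 768, 1024, 512)` gives a scaled picture of 682x512.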
In some embodiments, the image scaling algorithm is a quadratic linear interpolation image scaling algorithm.
In this embodiment, the quadratic linear interpolation image scaling algorithm keeps the computation required for scaling moderate while keeping the scaled picture free of visible distortion, i.e. it balances scaling speed against the quality of the scaled picture.
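Assuming that "quadratic linear interpolation" refers to what is commonly called bilinear interpolation, the scaling step can be sketched in plain Python on a grayscale picture stored as a list of rows. The function name `bilinear_resize` and the corner-aligned coordinate mapping are choices made for this sketch, not details given in the application.

```python
from typing import List

Image = List[List[float]]  # grayscale picture as rows of pixel values


def bilinear_resize(src: Image, new_w: int, new_h: int) -> Image:
    """Resize `src` to new_w x new_h using bilinear interpolation."""
    if new_w < 1 or new_h < 1:
        raise ValueError("target size must be positive")
    src_h, src_w = len(src), len(src[0])
    # Corner-aligned mapping: output corners land exactly on input corners.
    x_step = (src_w - 1) / (new_w - 1) if new_w > 1 else 0.0
    y_step = (src_h - 1) / (new_h - 1) if new_h > 1 else 0.0
    out: Image = []
    for j in range(new_h):
        y = j * y_step
        y0 = min(int(y), src_h - 1)
        y1 = min(y0 + 1, src_h - 1)
        fy = y - y0
        row: List[float] = []
        for i in range(new_w):
            x = i * x_step
            x0 = min(int(x), src_w - 1)
            x1 = min(x0 + 1, src_w - 1)
            fx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = src[y0][x0] * (1 - fx) + src[y0][x1] * fx
            bottom = src[y1][x0] * (1 - fx) + src[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

Each output pixel is a weighted average of at most four input pixels, which is what keeps the cost per pixel constant while avoiding the blockiness of nearest-neighbour scaling.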
In some embodiments, diagnosing a system failure based on the merged pictures includes: and determining the log information corresponding to the system fault picture in the latest log information of the preset number through the combined picture, and diagnosing the system fault based on the corresponding log information and the system fault picture.
In this embodiment, since log information is generated in real time while the server runs, several log entries are acquired rather than only the single latest one, so that the entry corresponding to the moment of the fault is not missed.
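The application does not state how the entries matching the fault picture are picked out of the N latest entries. One possible approach, shown here purely as an assumption, is to keep the entries whose timestamps fall within a short window before the capture time. The `LogEntry` type, its fields and the `window_s` parameter are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogEntry:
    timestamp: float  # seconds since the epoch (assumed field)
    message: str


def logs_near_fault(
    entries: List[LogEntry], fault_time: float, window_s: float = 5.0
) -> List[LogEntry]:
    """Return entries recorded at most `window_s` seconds before the fault, oldest first."""
    picked = [e for e in entries if 0.0 <= fault_time - e.timestamp <= window_s]
    return sorted(picked, key=lambda e: e.timestamp)
```

Entries recorded after the capture time are excluded, since they cannot describe the state shown in the fault picture.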
In some embodiments, obtaining a preset number of up-to-date logs includes: acquiring the preset number of latest logs by calling a D-Bus interface of the BMC of the server.
BMC stands for Baseboard Management Controller, a dedicated controller for monitoring and managing a server. Its main functions are: (1) device information management: recording server information (model, manufacturer, date, production and technical information of each component, chassis information, mainboard information and the like) and BMC information (server host name, BMC firmware version and the like); (2) server state monitoring and management: detecting the health of the server's components (such as memory, hard disks, fans and chassis), and adjusting fan speed in real time according to the readings of all temperature acquisition points, so that the server does not overheat and overall power consumption does not become too high; if any component on the board becomes abnormal, the information is reported promptly to the upper-layer network management through various industry-standard specifications; (3) remote control management of the server: power on and off, restart, maintenance, firmware update, system installation and the like; (4) maintenance management: log management, user management, alarm management and the like.
D-Bus (a message bus system) is a low-latency, low-overhead, high-availability IPC (inter-process communication) mechanism.
In some embodiments, the method further comprises: in response to a system fault, generating, by the CPLD of the server, corresponding log information according to the bit state of each running module.
In this embodiment, when the server system fails, the CPLD (Complex Programmable Logic Device) can generate a corresponding SEL log according to different bit states, where different bits represent different modules, such as the CPU (Central Processing Unit), power supply and fans, and other combinations of bit states can represent more detailed faults within a given module, so that the SEL log reflects more clearly the fault that occurred in a specific module.
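The mapping from bit states to log entries can be sketched as follows. The bit assignment in `MODULE_BITS`, the convention that a set bit means a fault, and the wording of the log lines are all assumptions made for illustration; a real CPLD register map is specific to the board.

```python
from typing import Dict, List

# Hypothetical bit assignment; not taken from the application.
MODULE_BITS: Dict[int, str] = {0: "CPU", 1: "Power supply", 2: "Fan"}


def decode_fault_bits(status: int) -> List[str]:
    """Return one log line per module whose fault bit is set (1 = fault, assumed)."""
    lines: List[str] = []
    for bit, module in sorted(MODULE_BITS.items()):
        if status & (1 << bit):
            lines.append(f"{module} fault (bit {bit} set)")
    return lines
```

With this mapping, a status value of `0b101` yields one line for the CPU and one for the fan, and a status of `0` yields no lines.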
Fig. 3 shows a schematic structural diagram for implementing a server system fault diagnosis method according to an embodiment of the present application. As shown in fig. 3, an exemplary embodiment of a server system failure diagnosis method of the present application is as follows:
1) First, a D-Bus interface for acquiring the server's SEL logs is added to a service of the BMC so that the N latest logs can be acquired; N can be set by the user, up to a supported maximum of 10, and the acquired SEL logs are stored in a specific cache;
2) When the system fails, the KVM acquires the current frame of the system display and stores it in another specific cache;
3) The system picture acquired at the moment of the fault and the SEL logs are displayed in the KVM simultaneously without overlapping; the current KVM display is then captured by the KVM's automatic screen capture function and stored in the corresponding local directory of the BMC, where it can be viewed by an administrator.
Step 3) is mainly implemented as follows:
The acquired SEL logs and the frame of the system display at the moment of the fault are stored in different caches. To balance computation speed and quality, the quadratic linear interpolation image scaling algorithm is used to scale the system picture acquired at the moment of the fault, reducing its display size to two thirds of the KVM display; the scaled picture is placed in the first two thirds of a third cache and the acquired SEL logs in the last third. The content of the third cache is displayed in the KVM as a whole, and the KVM display is captured by the KVM's automatic screen capture function and stored in the corresponding local location of the BMC.
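The placement described above can be sketched at byte level as follows. This is a simplified model under stated assumptions: the third cache is treated as a flat byte buffer, unused space is zero-filled, and the lower bound of 1 on the log count is assumed (only the maximum of 10 comes from the text). The names `clamp_log_count` and `compose_third_cache` are hypothetical.

```python
MAX_LOG_COUNT = 10  # upper bound stated in step 1)


def clamp_log_count(requested: int) -> int:
    """Limit the user-chosen N to 1..MAX_LOG_COUNT (lower bound is an assumption)."""
    return max(1, min(requested, MAX_LOG_COUNT))


def compose_third_cache(scaled_picture: bytes, log_block: bytes, cache_size: int) -> bytes:
    """Place the picture in the first two thirds and the logs in the last third."""
    picture_len = cache_size * 2 // 3
    log_len = cache_size - picture_len
    if len(scaled_picture) > picture_len:
        raise ValueError("scaled picture does not fit in two thirds of the cache")
    if len(log_block) > log_len:
        raise ValueError("log block does not fit in one third of the cache")
    cache = bytearray(cache_size)  # zero-filled buffer
    cache[: len(scaled_picture)] = scaled_picture
    cache[picture_len : picture_len + len(log_block)] = log_block
    return bytes(cache)
```

Because the two regions start at fixed offsets, the picture and the logs cannot overlap, which matches the requirement in step 3).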
In a second aspect of the embodiment of the present application, a server system fault diagnosis system is also provided. Fig. 4 is a schematic diagram of an embodiment of a server system fault diagnosis system provided by the present application. As shown in fig. 4, a server system failure diagnosis system includes: a monitoring module 10 configured to perform fault monitoring on a system of the server through KVM; the acquiring module 20 is configured to intercept a currently displayed system fault picture through KVM in response to the detection of a system fault occurrence, and acquire a preset number of latest log information, wherein the log is used for recording the running state of the system in real time; and a merging module 30 configured to merge the system failure picture and the latest log information into the same picture, and intercept the merged picture by KVM to diagnose the system failure based on the merged picture.
According to the server system fault diagnosis system, the screen capture function of the KVM is used: when a system fault is detected, the currently displayed system fault picture is captured through the KVM and a preset number of the latest log entries are acquired; the system fault picture and the latest log information are merged into a single picture, and the merged picture is captured through the KVM so that the system fault can be diagnosed based on it. The system fault can thus be matched to the logs recorded when it occurred, so that the specific symptoms of the fault are known and different faults that produce the same system display can be told apart. This improves both the efficiency of locating server faults and the accuracy of diagnosing them. It also accumulates data on the system display under different faults, which facilitates statistics on the various fault conditions of the server.
In some embodiments, the obtaining module 20 is further configured to intercept, by KVM, a currently displayed system failure picture, and store the intercepted system failure picture in the first cache; and acquiring a preset number of latest log information through the KVM, and storing the latest log information into the second cache.
In some embodiments, the merging module 30 is further configured to obtain the system fault picture from the first cache, scale the system fault picture proportionally using an image scaling algorithm to obtain a scaled picture, and store the scaled picture in a part of the space of the third cache, where the size of the space of the third cache is the same as the size of the space occupied by the picture; obtain the latest log information from the second cache, and store the obtained latest log information in the remaining space of the third cache; and display the content in the third cache to the KVM, and capture the currently displayed content of the third cache through the KVM.
In some embodiments, the image scaling algorithm is a quadratic linear interpolation image scaling algorithm.
In some embodiments, the merging module 30 includes a diagnosis module configured to determine, from the merged frames, log information corresponding to a system failure frame in a preset number of latest log information, and diagnose a system failure based on the corresponding log information and the system failure frame.
In some embodiments, the acquisition module 20 includes a log acquisition module configured to acquire a preset number of the latest logs by calling a D-Bus interface of the BMC of the server.
In some embodiments, the system further comprises a fault module configured to generate corresponding log information according to bit states of the respective running modules by the CPLD of the server in response to a system fault.
In a third aspect of the embodiment of the present application, a computer readable storage medium is provided, and fig. 5 shows a schematic diagram of a computer readable storage medium implementing a server system fault diagnosis method according to an embodiment of the present application. As shown in fig. 5, the computer-readable storage medium 3 stores computer program instructions 31. The computer program instructions 31 when executed by a processor implement the method of any of the embodiments described above.
It should be understood that all of the embodiments, features and advantages set forth above for the server system fault diagnosis method according to the present application apply equally to the server system fault diagnosis system and storage medium according to the present application, provided that they do not conflict with one another.
In a fourth aspect of the embodiment of the present application, there is also provided a computer device, including a memory 402 and a processor 401 as shown in fig. 6, where the memory 402 stores a computer program, and the computer program is executed by the processor 401 to implement the method of any one of the embodiments above.
Fig. 6 is a schematic diagram of the hardware structure of an embodiment of a computer device for performing the server system fault diagnosis method according to the present application. Taking the computer device shown in fig. 6 as an example, the computer device includes a processor 401 and a memory 402, and may further include an input device 403 and an output device 404. The processor 401, memory 402, input device 403, and output device 404 may be connected by a bus or by other means; fig. 6 takes a bus connection as an example. The input device 403 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the server system fault diagnosis system. The output device 404 may include a display device such as a display screen.
The memory 402, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the server system fault diagnosis method in the embodiments of the present application. Memory 402 may include a program storage area and a data storage area: the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created by use of the server system fault diagnosis method, and the like. In addition, memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 402 may optionally include memory located remotely from processor 401, and such remote memory may be connected to the local module via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor 401 executes various functional applications of the server and data processing, that is, implements the server system failure diagnosis method of the above-described method embodiment, by running nonvolatile software programs, instructions, and modules stored in the memory 402.
Finally, it should be noted that the computer-readable storage media (e.g., memory) herein can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory. By way of example, and not limitation, nonvolatile memory can include read-only memory (ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM), which can act as external cache memory. By way of example, and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to comprise, without being limited to, these and other suitable types of memory.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein may be implemented or performed with the following components designed to perform the functions herein: a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP and/or any other such configuration.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items. The serial numbers of the foregoing embodiments of the present application are for description only and do not represent the relative merits of the embodiments.
Those of ordinary skill in the art will appreciate that: the above discussion of any embodiment is merely exemplary and is not intended to imply that the scope of the disclosure of embodiments of the application, including the claims, is limited to such examples; combinations of features of the above embodiments or in different embodiments are also possible within the idea of an embodiment of the application, and many other variations of the different aspects of the embodiments of the application as described above exist, which are not provided in detail for the sake of brevity. Therefore, any omission, modification, equivalent replacement, improvement, etc. of the embodiments should be included in the protection scope of the embodiments of the present application.

Claims (10)

1. A server system fault diagnosis method, comprising the steps of:
performing fault monitoring on a system of the server through KVM;
in response to detecting a system fault, intercepting a currently displayed system fault picture through the KVM, and acquiring a preset number of latest log information, wherein the log is used for recording the running state of the system in real time;
and merging the system fault picture and the latest log information into the same picture, and intercepting the merged picture through the KVM so as to diagnose the system fault based on the merged picture.
2. The method of claim 1, wherein intercepting, by the KVM, a currently displayed system failure picture and obtaining a preset number of latest log information comprises:
intercepting a currently displayed system fault picture through the KVM, and storing the intercepted system fault picture into a first cache;
and acquiring a preset number of latest log information through the KVM, and storing the latest log information into a second cache.
3. The method of claim 2, wherein merging the system failure picture with the latest log information into a same picture, and intercepting the merged picture by the KVM comprises:
obtaining the system fault picture from the first cache, scaling the system fault picture in equal proportion by using an image scaling algorithm to obtain a scaled picture, and storing the scaled picture in a part of space of a third cache, wherein the size of the space of the third cache is the same as that of the space occupied by the picture;
acquiring the latest log information from the second cache, and storing the acquired latest log information in the rest space of the third cache;
and displaying the content in the third cache to the KVM, and intercepting the current display content in the third cache through the KVM.
4. The method of claim 3, wherein the image scaling algorithm is a bilinear interpolation image scaling algorithm.
5. The method of claim 1, wherein diagnosing the system fault based on the merged picture comprises:
determining, from the merged picture, the log information corresponding to the system fault picture among the preset number of latest log information, and diagnosing the system fault based on the corresponding log information and the system fault picture.
6. The method of claim 1, wherein obtaining a preset number of up-to-date logs comprises:
acquiring a preset number of latest logs by calling a D-Bus interface of the BMC of the server.
7. The method as recited in claim 1, further comprising:
in response to the system fault, generating, by the CPLD of the server, corresponding log information according to the bit state of each running module.
8. A server system failure diagnosis system, comprising:
the monitoring module is configured to monitor faults of the system of the server through the KVM;
the acquisition module is configured to, in response to detecting a system fault, intercept a currently displayed system fault picture through the KVM, and acquire a preset number of latest log information, wherein the log is used for recording the running state of the system in real time; and
the merging module is configured to merge the system fault picture and the latest log information into the same picture, intercept the merged picture through the KVM, and diagnose the system fault based on the merged picture.
9. A computer readable storage medium, characterized in that computer program instructions are stored, which, when executed by a processor, implement the method of any one of claims 1-7.
10. A computer device comprising a memory and a processor, wherein the memory has stored therein a computer program which, when executed by the processor, performs the method of any of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311054955.0A CN117033056A (en) 2023-08-21 2023-08-21 Method, system, storage medium and equipment for diagnosing faults of server system

Publications (1)

Publication Number Publication Date
CN117033056A (en) 2023-11-10

Family

ID=88602180

Country Status (1)

Country Link
CN (1) CN117033056A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination