CN113742166B - Method, device and system for recording logs of server system devices - Google Patents

Info

Publication number
CN113742166B
CN113742166B CN202110863417.0A CN202110863417A CN113742166B CN 113742166 B CN113742166 B CN 113742166B CN 202110863417 A CN202110863417 A CN 202110863417A CN 113742166 B CN113742166 B CN 113742166B
Authority
CN
China
Prior art keywords
information
server system
power
flash memory
turned
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110863417.0A
Other languages
Chinese (zh)
Other versions
CN113742166A (en)
Inventor
陆俊宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202110863417.0A priority Critical patent/CN113742166B/en
Publication of CN113742166A publication Critical patent/CN113742166A/en
Application granted granted Critical
Publication of CN113742166B publication Critical patent/CN113742166B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0616 Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)
  • Power Sources (AREA)

Abstract

The invention provides a server system device log recording method, which comprises the following steps: acquiring time information at power-down and state information and/or event information of different server system devices at power-down; storing the time information at power-down and the state information and/or event information of the different server system devices at power-down into a first user flash memory in a programmable logic device; and, when power is restored, in response to a read command from a baseboard management controller, sending the time information at power-down and the state information and/or event information of the different server system devices at power-down stored in the first user flash memory to the baseboard management controller.

Description

Method, device and system for recording logs of server system devices
Technical Field
The present invention relates to the field of server log recording, and in particular, to a method, an apparatus, and a system for recording a log of a server system device.
Background
Programmable logic devices such as the CPLD (Complex Programmable Logic Device) are now essential and widely used in server systems. In a server system, a CPLD is mainly used for the following: power and sequencing control, which controls the power sequencing of the core devices of the server system during power-up and power-down; system reset generation, which produces the reset signal (Reset) required by the server system during startup or system reset, so that the server system starts in a stable and predictable initial state; system event monitoring, which detects and latches abnormal events while the server system is running; and a watchdog (Watchdog) function, which monitors the working state of the core devices and automatically resets the system when a core device hangs, thereby improving the reliability of the server system.
Because stability is critical in a server system, collecting and promptly reporting the problems that occur while the system is running is an important factor in server system design. Compared with other devices commonly used in servers, such as the BMC (Baseboard Management Controller) and the MCU (Microcontroller Unit), a CPLD provides the large number of I/O (input/output) pins that a BMC or MCU lacks, and a large number of I/O pins is necessary for collecting the various states and events in the system. Depending on the system and the application, the number of I/O pins needed to collect states and events ranges from tens to hundreds.
In the prior art, server systems widely use a CPLD together with a BMC to detect and collect the state information of different devices while the system is running. As shown in fig. 1, in the usual arrangement the CPLD uses its many GPIOs (general-purpose I/O pins) to detect and collect the signals or events of the system devices and stores them temporarily in registers inside the CPLD. The BMC obtains the information at power-down from the CPLD by polling or through a system interrupt (INT interface) and stores it in an SPI (Serial Peripheral Interface) flash memory, where it becomes the recorded log information.
In practice, however, when the server system loses power without warning, it can keep working only on the residual charge between the PSU (Power Supply Unit) and the board. The CPLD needs only a short time to record a log, so the residual charge gives it enough time to record the states and events of the different system devices at power-down. The BMC needs a long time to record a log, which the residual charge cannot sustain. After the system loses power, the records in the CPLD's internal registers also disappear. As a result, the last states or events of the system devices at power-down cannot be reliably saved, and the time of the power loss is not recorded.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a method, an apparatus, and a system for recording the log of server system devices. It addresses the low reliability of server system log recording at power-down in the prior art and effectively improves the reliability of server system log recording.
The first aspect of the present invention provides a method for logging devices in a server system, including:
acquiring time information when power is lost and state information and/or event information of different server system devices when power is lost;
storing time information during power failure and state information and/or event information of different server system devices during power failure into a first user flash memory in the programmable logic device;
when power is restored, in response to a read command from the baseboard management controller, sending the time information at power-down and the state information and/or event information of the different server system devices at power-down stored in the first user flash memory to the baseboard management controller.
Optionally, storing the time information when power is turned off and the state information and/or event information of different server system devices when power is turned off in the first user flash memory in the programmable logic device specifically includes:
the user flash memory controller stores, in sequence, the time information of each successive power failure and the corresponding state information and/or event information of the different server system devices into successive record rows of the first user flash memory in the programmable logic device; each row records the time information of one power failure together with the state information and/or event information of the different server system devices at that power failure.
Further, the state information comprises working state information, alarm information and fault information, and the event information comprises fan rotation speed information and temperature information.
Optionally, the method further comprises:
synchronously storing time information during power failure and state information and/or event information of different server system devices during power failure into a second user flash memory in the programmable logic device;
and when power is restored, in response to a read command from the baseboard management controller, sending the time information at power-down and the state information and/or event information of the different server system devices at power-down, as synchronously stored in the second user flash memory, to the baseboard management controller.
Further, the method further comprises the following steps:
setting a sending identifier, which identifies whether the storage records of the first user flash memory and/or the second user flash memory have been sent by the programmable logic device.
Optionally, the method further comprises:
acquiring storage state information of the first user flash memory and/or the second user flash memory in real time;
and when the storage records in the first user flash memory or the second user flash memory are fully written, automatically clearing the storage records in the first user flash memory or the second user flash memory.
Optionally, the method further comprises:
after power is restored, the baseboard management controller reads the time information of the last power failure stored in the first user flash memory and the state information and/or event information of the different server system devices at the last power failure; if the information read from the first user flash memory is abnormal, it reads the time information of the last power failure stored in the second user flash memory and the state information and/or event information of the different server system devices at the last power failure.
Optionally, the server system devices include, but are not limited to, a server power supply, a server fan, a hard disk, a central processing unit, memory, a baseboard management controller.
The second aspect of the present invention provides a server system device logging apparatus, comprising:
the power-down control unit is used for obtaining time information when power is turned off and state information and/or event information of different server system devices when the power is turned off;
the first storage unit is used for storing time information when power is turned off and state information and/or event information of different server system devices when power is turned off into a first user flash memory in the programmable logic device;
and the first sending unit is used for responding to the reading command of the baseboard management controller when the power is re-turned on, and sending the time information of the power failure in the first user flash memory and the state information and/or event information of different server system devices when the power failure occurs to the baseboard management controller.
The third aspect of the present invention provides a server system device log recording system, comprising a programmable logic device and a baseboard management controller;
the programmable logic device is used for acquiring time information at power-down and state information and/or event information of different server system devices at power-down; storing the time information at power-down and the state information and/or event information of the different server system devices at power-down into a first user flash memory in the programmable logic device; and, when power is restored, in response to a read command from the baseboard management controller, sending to the baseboard management controller either the time information at power-down and the state information and/or event information of the different server system devices at power-down stored in the first user flash memory, or the same information stored in the second user flash memory;
the baseboard management controller is used for sending a reading command to the programmable logic device when the power is re-turned on, and reading time information when the power is turned off in the first user flash memory and state information and/or event information of different server system devices when the power is turned off; or time information at power down in the second user flash memory, and status information and/or event information of different server system devices at power down.
The technical scheme adopted by the invention comprises the following technical effects:
1. According to the technical scheme of the invention, the time information at power-down and the state information and/or event information of different server system devices at power-down are stored in the first user flash memory in the programmable logic device, and the data stored in the first user flash memory is not lost after power-down. The record is not limited to one device or one type of device of the server: it covers the state information and/or event information of the different server system devices at power-down together with the time of the power loss. The time information lets the user analyze problems more effectively (time information is very important for analysis; if only the event is known and not when it occurred, many problems are difficult to locate). This addresses the low reliability of server system log recording at power-down in the prior art and effectively improves the reliability and completeness of server system log recording.
2. In the technical scheme of the invention, the time information when power is turned off and the state information and/or event information of different server system devices when power is turned off are synchronously stored in a second user flash memory in the programmable logic device, and if the information stored in the read first user flash memory is abnormal, the time information when power is turned off last time stored in the second user flash memory is read, and the state information and/or event information of different server system devices when power is turned off last time; the reliability of the log record of the server system device is further improved.
3. According to the technical scheme of the invention, the sending identifier identifies whether the storage records of the first user flash memory and/or the second user flash memory have been sent by the programmable logic device, so the CPLD can easily determine whether a storage record has been sent (or read by the BMC), which makes log recording of the server system devices more convenient.
4. According to the technical scheme of the invention, once the storage records in the first user flash memory or the second user flash memory are full, they are cleared automatically. This keeps the log records of the server system devices continuous and avoids the shortened user flash memory lifetime that frequent erasure would cause.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
For a clearer description of embodiments of the invention or of the solutions of the prior art, reference will be made to the accompanying drawings, which are used in the description of the embodiments or of the prior art, and it will be obvious to those skilled in the art that other drawings can be obtained from these without inventive labour.
FIG. 1 is a block diagram of a prior art CPLD and BMC in cooperation with reading a log record of a server system device;
FIG. 2 is a schematic flow chart of a method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an internal architecture of a CPLD according to a method of an embodiment of the present invention;
FIG. 4 is a schematic diagram of the storage records of a first user flash memory (UFM0) when empty, partially written, and fully written, according to an embodiment of the present invention;
FIG. 5 is a schematic flow chart of another method according to an embodiment of the present invention;
FIG. 6 is a schematic flow chart of another method according to an embodiment of the present invention;
FIG. 7 is a schematic flow chart of another method according to an embodiment of the present invention;
FIG. 8 is a schematic flow chart of another method according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a second embodiment of the present invention;
fig. 10 is a schematic structural diagram of a third system according to an embodiment of the present invention.
Detailed Description
In order to clearly illustrate the technical features of the present solution, the present invention will be described in detail below with reference to the following detailed description and the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different structures of the invention. In order to simplify the present disclosure, components and arrangements of specific examples are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and processes are omitted so as to not unnecessarily obscure the present invention.
Example 1
As shown in fig. 2, the present invention provides a server system device logging method, including:
S1, acquiring time information at power-down and state information and/or event information of different server system devices at power-down;
S2, storing the time information at power-down and the state information and/or event information of the different server system devices at power-down into a first user flash memory in the programmable logic device;
S3, when power is restored, in response to a read command from the baseboard management controller, sending the time information at power-down and the state information and/or event information of the different server system devices at power-down stored in the first user flash memory to the baseboard management controller.
In step S1, the time information at power-down may take the form of year, month, day, hour, minute, and second, or may take other forms or be resolved to other time units (such as ms); the invention is not limited in this respect. The state information and/or event information of the different server system devices at power-down may be obtained from registers inside the programmable logic device (i.e., the CPLD) or in other ways (e.g., acquired directly); the invention is not limited in this respect.
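As an illustration of the time information in step S1, the following minimal sketch packs a year, month, day, hour, minute, and second timestamp into the 5-byte time field used by the record format of fig. 4. The bit widths, the base year 2000, and the names `pack_time` and `unpack_time` are assumptions made for illustration only; the disclosure does not fix an encoding.

```python
from datetime import datetime

BASE_YEAR = 2000  # assumption: the year is stored as an offset from 2000


def pack_time(t: datetime) -> bytes:
    """Pack a timestamp into 5 bytes (40 bits).

    Assumed layout, most significant bits first:
    year offset (8) | month (4) | day (5) | hour (5) | minute (6) | second (6) | spare (6)
    """
    year = t.year - BASE_YEAR
    if not 0 <= year <= 255:
        raise ValueError("year out of range for an 8-bit offset")
    value = year
    for field, width in ((t.month, 4), (t.day, 5), (t.hour, 5),
                         (t.minute, 6), (t.second, 6)):
        value = (value << width) | field
    value <<= 6  # six unused low-order bits
    return value.to_bytes(5, "big")


def unpack_time(raw: bytes) -> datetime:
    """Inverse of pack_time."""
    value = int.from_bytes(raw, "big") >> 6  # drop the spare bits
    second = value & 0x3F
    value >>= 6
    minute = value & 0x3F
    value >>= 6
    hour = value & 0x1F
    value >>= 5
    day = value & 0x1F
    value >>= 5
    month = value & 0x0F
    value >>= 4
    return datetime(BASE_YEAR + value, month, day, hour, minute, second)
```

A finer resolution such as milliseconds would need a different split of the 40 bits or a wider field.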
In step S2, as shown in fig. 3, storing time information during power failure and status information and/or event information of different server system devices during power failure into a first user flash memory in the programmable logic device specifically includes:
the user flash memory controller stores, in sequence, the time information of each successive power failure and the corresponding state information and/or event information of the different server system devices into successive record rows of the first user flash memory in the programmable logic device; each row records the time information of one power failure together with the state information and/or event information of the different server system devices at that power failure.
Specifically, when the power-down detection module detects that the system has lost power, the user flash memory controller obtains the time information at power-down and the data in the server system device state information and/or event information file recording module, and writes them into the next record row of the first user flash memory in the programmable logic device. Successive power failures therefore occupy successive rows, and each row holds the time information of one power failure and the corresponding state information and/or event information of the different server system devices.
As shown in fig. 4, each power-down record is 128 bytes. One byte (Byte0) is the Index, which identifies the sequence number of the server system device log record; 5 bytes (Byte1-Byte5) hold the time information at power-down; and the last byte (Byte127), BMC_READ_MARK, holds the sending identifier. The remaining 121 bytes (Byte6-Byte126) hold the log data, which is enough to record all necessary information of the different server system devices at power-down. Under this architecture, the first user flash memory can hold up to 128 power-down records. Before any power-down information (the time information at power-down and the corresponding state information and/or event information of the different server system devices) is written to the first user flash memory, every byte is 0xFF. Before writing, the programmable logic device checks the Index field of each row from top to bottom to find the first unwritten record row, and then writes the time information at power-down and the state information and/or event information of the different server system devices into that row. For example, if the row following the row with Index 0x01 has Index 0xFF, that row has not been written yet, so the programmable logic device writes the time information at power-down and the state information and/or event information of the different server system devices into Byte1-Byte126 of row 0x01+1.
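The record format and the search for an unwritten row can be sketched as follows. This is a minimal software model, not the CPLD logic itself: the flash is modeled as a `bytearray`, and the choice that the Index byte equals the zero-based row number, as well as the names `new_ufm`, `find_free_row`, and `append_record`, are assumptions for illustration.

```python
RECORD_SIZE = 128   # bytes per power-down record
MAX_RECORDS = 128   # records held by one user flash memory
ERASED = 0xFF       # value of every byte before it is written
TIME_LEN = 5        # Byte1..Byte5
DATA_LEN = 121      # Byte6..Byte126; Byte127 is BMC_READ_MARK


def new_ufm() -> bytearray:
    """Return an erased flash image (all bytes 0xFF)."""
    return bytearray([ERASED]) * (RECORD_SIZE * MAX_RECORDS)


def find_free_row(ufm: bytearray):
    """Scan the Index byte of each row from top to bottom; return the first erased row."""
    for row in range(MAX_RECORDS):
        if ufm[row * RECORD_SIZE] == ERASED:
            return row
    return None  # every row has been written


def append_record(ufm: bytearray, time5: bytes, data: bytes) -> int:
    """Write one power-down record into the first unwritten row and return its row number."""
    if len(time5) != TIME_LEN or len(data) > DATA_LEN:
        raise ValueError("time must be 5 bytes and data at most 121 bytes")
    row = find_free_row(ufm)
    if row is None:
        raise RuntimeError("user flash memory is full")
    base = row * RECORD_SIZE
    ufm[base] = row                           # Byte0: Index (assumed equal to the row number)
    ufm[base + 1:base + 6] = time5            # Byte1..Byte5: time information
    ufm[base + 6:base + 6 + len(data)] = data # Byte6..: state/event information
    # Byte127 (BMC_READ_MARK) is left at 0xFF, meaning "not yet sent".
    return row
```

Because the largest Index under this assumption is 0x7F, a written row can never be mistaken for an erased one.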
Specifically, the state information may include working state information, alarm information, and fault information, and the event information may include fan speed information and temperature information. The server system devices may be the power supply (PSU) of the server system; server system core devices such as the CPU (Central Processing Unit), the PCH (Platform Controller Hub, i.e., the integrated south bridge), memory, and the BMC; hard disks; system fans; system temperature sensors (which measure the temperature of core devices or critical areas, e.g., the CPU, memory, PCH, BMC, and PSU); and so on. For the system power supply, the state information includes the working state of the power regulator, power fault information, and the like, and the event information may be the power output voltage, the readings of the temperature sensor that monitors the server power supply, and the like. For a server system core device, the state information may be its working state, abnormality information (e.g., CPU or memory overheating), alarm information, and the like, and the corresponding event information may be the readings of the temperature sensor that monitors the core device. For a hard disk, the state information may be its working state and abnormality information, and the corresponding event information may be the readings of the temperature sensor that monitors the hard disk. For a system fan, the state information may be its working state and fault or abnormality information, and the event information may be its rotation speed.
The state information and event information of the same server system device are stored contiguously, and the blocks belonging to different server system devices follow one another in sequence. For example, Byte6-Byte10 store the state information and/or event information of the server power supply, Byte11-Byte16 store the state information and/or event information of the CPU, a core device of the server system, and so on.
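The per-device layout can be illustrated with a small field map. Only the two byte ranges named in the text (Byte6-Byte10 for the power supply and Byte11-Byte16 for the CPU) come from the description; the names `FIELDS` and `build_payload` and the choice to leave unused bytes at 0xFF are assumptions for this sketch.

```python
DATA_FIRST = 6      # first log-data byte of a record
DATA_LEN = 121      # Byte6..Byte126

# Inclusive byte ranges within the 128-byte record.
FIELDS = {
    "psu": (6, 10),   # server power supply (from the description)
    "cpu": (11, 16),  # CPU (from the description)
}


def build_payload(values: dict) -> bytes:
    """Place each device's raw bytes at its fixed position in the 121-byte log-data area."""
    payload = bytearray([0xFF]) * DATA_LEN  # unused bytes stay erased
    for name, raw in values.items():
        first, last = FIELDS[name]
        width = last - first + 1
        if len(raw) != width:
            raise ValueError(f"{name} expects {width} bytes, got {len(raw)}")
        start = first - DATA_FIRST
        payload[start:start + width] = raw
    return bytes(payload)
```

Fixed offsets let the baseboard management controller decode a record without any per-record header, at the cost of reserving space for devices that may be absent.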
The first user flash memory may be a UFM (User Flash Memory), and the corresponding user flash memory controller may be a UFM controller.
In step S3, when the power is turned back on, the time information of the last power failure in the first user flash memory, and the state information and/or event information of the different server system devices of the last power failure are sent to the baseboard management controller in response to the read command of the baseboard management controller.
Further, as shown in fig. 5, the present invention provides a method for logging devices of a server system, which further includes:
S4, synchronously storing the time information at power-down and the state information and/or event information of the different server system devices at power-down into a second user flash memory in the programmable logic device;
S5, when power is restored, in response to a read command from the baseboard management controller, sending the time information at power-down and the state information and/or event information of the different server system devices at power-down, as synchronously stored in the second user flash memory, to the baseboard management controller.
In step S4, the programmable logic device (i.e., the CPLD) writes the time information at power-down and the state information and/or event information of the different server system devices at power-down into the second user flash memory at the same time as it writes them into the first user flash memory, so the backup takes no extra time. Specifically, when the power-down detection module detects that the system has lost power, the user flash memory controller obtains the time information at power-down and the data in the server system device state information and/or event information file recording module, and stores them in the first user flash memory and the second user flash memory in the programmable logic device simultaneously.
In step S5, when power is restored, the time information of the last power failure and the state information and/or event information of the different server system devices at the last power failure, as stored in the second user flash memory, are sent to the baseboard management controller in response to a read command from the baseboard management controller.
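The mirrored write of step S4 can be modeled as below. In the CPLD the two writes happen in parallel in hardware; this sketch performs them one after the other and only shows that both memories end up with the same record. The names `write_row` and `mirrored_write`, and the use of the row number as the Index byte, are assumptions for illustration.

```python
RECORD_SIZE = 128
MAX_RECORDS = 128
ERASED = 0xFF


def write_row(ufm: bytearray, body: bytes):
    """Write Byte1..Byte127 of a record into the first erased row.

    Returns the row number, or None if the memory is full.
    """
    for row in range(MAX_RECORDS):
        base = row * RECORD_SIZE
        if ufm[base] == ERASED:
            ufm[base] = row                          # Byte0: Index
            ufm[base + 1:base + RECORD_SIZE] = body  # Byte1..Byte127
            return row
    return None


def mirrored_write(ufm0: bytearray, ufm1: bytearray, body: bytes):
    """Store the same power-down record in the first and second user flash memory."""
    if len(body) != RECORD_SIZE - 1:
        raise ValueError("body must be exactly 127 bytes")
    return write_row(ufm0, body), write_row(ufm1, body)
```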
Further, as shown in fig. 6, the present invention provides a method for logging devices of a server system, which further includes:
S6, setting a sending identifier, which identifies whether the storage records of the first user flash memory and/or the second user flash memory have been sent by the programmable logic device.
In step S6, as shown in fig. 3, the sending identifier (or read identifier) may be placed in Byte127. While a record has not been sent (or read), Byte127 is 0xFF; once the record has been sent (or read), Byte127 is 0x00. From this identifier it can be determined whether the current power-down information (the time information at power-down and the corresponding state information and/or event information of the different server system devices) has been sent by the CPLD (or read by the BMC), which improves the efficiency and reliability of log recording.
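The use of Byte127 as a sending identifier can be sketched as follows. The flash is again modeled as a `bytearray` of 128 rows of 128 bytes; the function names `mark_sent` and `unsent_rows` are hypothetical.

```python
RECORD_SIZE = 128
MAX_RECORDS = 128
ERASED = 0xFF
MARK_OFFSET = 127   # Byte127: BMC_READ_MARK
SENT = 0x00


def mark_sent(ufm: bytearray, row: int) -> None:
    """Set the sending identifier of a record after it has been sent to (or read by) the BMC."""
    ufm[row * RECORD_SIZE + MARK_OFFSET] = SENT


def unsent_rows(ufm: bytearray) -> list:
    """Return the rows that hold a record whose sending identifier is still 0xFF."""
    rows = []
    for row in range(MAX_RECORDS):
        base = row * RECORD_SIZE
        written = ufm[base] != ERASED
        not_sent = ufm[base + MARK_OFFSET] == ERASED
        if written and not_sent:
            rows.append(row)
    return rows
```

On many flash technologies, programming can only clear bits, so changing a byte from 0xFF to 0x00 does not require an erase; whether a particular user flash memory supports such a byte-level update is device dependent.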
Further, as shown in fig. 7, the server system device logging method provided by the present invention further includes:
S7, acquiring storage state information of the first user flash memory and/or the second user flash memory in real time; and when the first user flash memory or the second user flash memory is full of stored records, automatically clearing the stored records in that user flash memory.
In step S7, the CPLD checks whether the first user flash memory (UFM0) is full; if so, the CPLD automatically clears the first user flash memory, and the first user flash memory starts recording from the beginning at the next power-down. Similarly, the CPLD checks whether the second user flash memory (UFM1) is full; if so, the CPLD automatically clears the second user flash memory, and the second user flash memory starts recording from the beginning at the next power-down.
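The clear-when-full behavior of step S7 can be modeled as shown below. This is a sketch under stated assumptions: the `UserFlash` class, its `capacity_rows` parameter, and the method names are hypothetical, and a real UFM would be cleared by a flash erase operation rather than by emptying a list.

```python
from typing import List


class UserFlash:
    """Toy model of a user flash memory that holds a fixed number of row records."""

    def __init__(self, capacity_rows: int) -> None:
        self.capacity_rows = capacity_rows  # hypothetical capacity, in rows
        self.rows: List[bytes] = []

    def is_full(self) -> bool:
        return len(self.rows) >= self.capacity_rows

    def clear_if_full(self) -> bool:
        """Clear all stored records only when the memory is full; report whether it did."""
        if self.is_full():
            self.rows.clear()
            return True
        return False

    def append(self, record: bytes) -> None:
        """Store the next power-down record in the first free row."""
        if self.is_full():
            raise RuntimeError("flash is full; clear it before recording")
        self.rows.append(record)
```

Clearing only when the memory is full means one erase per `capacity_rows` power-down events, rather than one erase per record.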
Further, as shown in fig. 8, the server system device logging method provided by the present invention further includes:
S8, after the baseboard management controller is powered on again, reading the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down stored in the first user flash memory; and if the information read from the first user flash memory is abnormal, reading the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down stored in the second user flash memory.
In step S8, after the baseboard management controller is powered on again, it first reads the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down from the first user flash memory. If the information read from the first user flash memory is abnormal, it reads the same information from the second user flash memory instead.
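The read-with-fallback order of step S8 might look like the following sketch. The description does not define what makes stored information "abnormal", so the `looks_valid` check used here (a missing, empty, or all-0xFF record counts as abnormal) is purely an assumption, as are the function names and the "UFM0"/"UFM1" labels in the return value.

```python
from typing import Callable, Optional, Sequence, Tuple


def looks_valid(record: Optional[bytes]) -> bool:
    """Assumed validity check: a missing, empty, or all-0xFF record is abnormal."""
    if not record:
        return False
    return any(b != 0xFF for b in record)


def read_last_power_down(
    ufm0_rows: Sequence[bytes],
    ufm1_rows: Sequence[bytes],
    is_valid: Callable[[Optional[bytes]], bool] = looks_valid,
) -> Tuple[Optional[str], Optional[bytes]]:
    """Return (source, record) for the last power-down, preferring the first UFM."""
    primary = ufm0_rows[-1] if ufm0_rows else None
    if is_valid(primary):
        return "UFM0", primary
    # The first user flash memory is abnormal: fall back to the mirrored copy.
    backup = ufm1_rows[-1] if ufm1_rows else None
    if is_valid(backup):
        return "UFM1", backup
    return None, None
```

Passing the validity check as a parameter keeps the fallback order separate from whatever integrity test (checksum, marker byte, length check) an actual implementation uses.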
Server system devices include, but are not limited to, server power supplies, server fans, hard disks, central processing units, memory, and baseboard management controllers.
It should be noted that, in the technical solution of the present invention, steps S1 to S8 may be implemented in hardware or by programming in a software language, with the implementation logic corresponding to the steps; they may also be implemented in other ways, which is not limited herein.
According to the technical solution of the present invention, the time information at power-down and the state information and/or event information of the different server system devices at power-down are stored in the first user flash memory in the programmable logic device, and the data stored in the first user flash memory is not lost after power-down. Rather than recording only one device or one type of device of the server, the solution records the state information and/or event information of the different server system devices at power-down together with the time information at power-down. Providing this time information lets the user analyze problems more effectively (time information is very important for problem analysis; if only the occurrence of an event is known, without its time, many problems are difficult to trace). This effectively solves the prior-art problem of low reliability of server system log records at power-down, and effectively improves the reliability and comprehensiveness of server system log records.
In the technical solution of the present invention, the time information at power-down and the state information and/or event information of the different server system devices at power-down are synchronously stored in the second user flash memory in the programmable logic device. If the information read from the first user flash memory is abnormal, the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down are read from the second user flash memory instead, which further improves the reliability of server system device log records.
In the technical solution of the present invention, the sending identifier identifies whether the stored records of the first user flash memory and/or the second user flash memory have been sent by the programmable logic device, so that the CPLD can easily determine whether the stored records have been sent (or read by the BMC), which improves the convenience of server system device log recording.
In the technical solution of the present invention, the stored records in the first user flash memory or the second user flash memory are automatically cleared once that user flash memory is full, which ensures the continuity of server system device log recording and avoids the shortened user flash memory service life that frequent erasure would cause.
Example two
As shown in fig. 9, the technical solution of the present invention further provides a server system device logging apparatus, including:
an acquisition unit 101, configured to acquire the time information at power-down and the state information and/or event information of the different server system devices at power-down;
a first storage unit 102, configured to store the time information at power-down and the state information and/or event information of the different server system devices at power-down into the first user flash memory in the programmable logic device;
a first sending unit 103, configured to, when power is restored and in response to the read command of the baseboard management controller, send the time information at power-down and the state information and/or event information of the different server system devices at power-down stored in the first user flash memory to the baseboard management controller.
According to the technical solution of the present invention, the time information at power-down and the state information and/or event information of the different server system devices at power-down are stored in the first user flash memory in the programmable logic device, and the data stored in the first user flash memory is not lost after power-down. Rather than recording only one device or one type of device of the server, the solution records the state information and/or event information of the different server system devices at power-down together with the time information at power-down. Providing this time information lets the user analyze problems more effectively (time information is very important for problem analysis; if only the occurrence of an event is known, without its time, many problems are difficult to trace). This effectively solves the prior-art problem of low reliability of server system log records at power-down, and effectively improves the reliability and comprehensiveness of server system log records.
In the technical solution of the present invention, the time information at power-down and the state information and/or event information of the different server system devices at power-down are synchronously stored in the second user flash memory in the programmable logic device. If the information read from the first user flash memory is abnormal, the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down are read from the second user flash memory instead, which further improves the reliability of server system device log records.
In the technical solution of the present invention, the sending identifier identifies whether the stored records of the first user flash memory and/or the second user flash memory have been sent by the programmable logic device, so that the CPLD can easily determine whether the stored records have been sent (or read by the BMC), which improves the convenience of server system device log recording.
In the technical solution of the present invention, the stored records in the first user flash memory or the second user flash memory are automatically cleared once that user flash memory is full, which ensures the continuity of server system device log recording and avoids the shortened user flash memory service life that frequent erasure would cause.
Example three
As shown in fig. 10, the technical solution of the present invention further provides a server system device logging system, which includes a programmable logic device 201 and a baseboard management controller 202;
the programmable logic device 201 is configured to: acquire the time information at power-down and the state information and/or event information of the different server system devices at power-down; store the time information at power-down and the state information and/or event information of the different server system devices at power-down into a first user flash memory 2011 in the programmable logic device 201; and, when power is restored and in response to the read command of the baseboard management controller 202, send to the baseboard management controller 202 either the time information at power-down and the state information and/or event information of the different server system devices at power-down stored in the first user flash memory 2011, or the time information at power-down and the state information and/or event information of the different server system devices at power-down stored in a second user flash memory 2012;
the baseboard management controller 202 is configured to send a read command to the programmable logic device 201 when power is restored, and to read the time information at power-down and the state information and/or event information of the different server system devices at power-down stored in the first user flash memory 2011, or the time information at power-down and the state information and/or event information of the different server system devices at power-down stored in the second user flash memory 2012.
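The division of work between the programmable logic device 201 and the baseboard management controller 202 can be illustrated with the toy model below. The `Cpld`, `Bmc`, and `ReadTarget` names, the shape of the read command, and the use of an empty result as a stand-in for "abnormal" information are all assumptions made for the example; the description does not specify the bus or command format between the two devices.

```python
from enum import Enum
from typing import Dict, List


class ReadTarget(Enum):
    """Hypothetical read-command argument selecting which user flash memory to read."""
    UFM0 = 0
    UFM1 = 1


class Cpld:
    """Toy programmable logic device holding two mirrored user flash memories."""

    def __init__(self) -> None:
        self.ufm: Dict[ReadTarget, List[bytes]] = {
            ReadTarget.UFM0: [],
            ReadTarget.UFM1: [],
        }

    def record_power_down(self, record: bytes) -> None:
        # The same record goes to both user flash memories.
        for rows in self.ufm.values():
            rows.append(record)

    def handle_read(self, target: ReadTarget) -> List[bytes]:
        """Answer a read command by returning a copy of the stored records."""
        return list(self.ufm[target])


class Bmc:
    """Toy baseboard management controller that collects power-down logs."""

    def __init__(self, cpld: Cpld) -> None:
        self.cpld = cpld
        self.log: List[bytes] = []

    def on_power_on(self) -> List[bytes]:
        records = self.cpld.handle_read(ReadTarget.UFM0)
        if not records:  # simplistic stand-in for "abnormal" information
            records = self.cpld.handle_read(ReadTarget.UFM1)
        self.log.extend(records)
        return records
```

In this model the programmable logic device never pushes data on its own; it only answers read commands, which matches the "in response to the read command" wording used for the system.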
According to the technical solution of the present invention, the time information at power-down and the state information and/or event information of the different server system devices at power-down are stored in the first user flash memory in the programmable logic device, and the data stored in the first user flash memory is not lost after power-down. Rather than recording only one device or one type of device of the server, the solution records the state information and/or event information of the different server system devices at power-down together with the time information at power-down. Providing this time information lets the user analyze problems more effectively (time information is very important for problem analysis; if only the occurrence of an event is known, without its time, many problems are difficult to trace). This effectively solves the prior-art problem of low reliability of server system log records at power-down, and effectively improves the reliability and comprehensiveness of server system log records.
In the technical solution of the present invention, the time information at power-down and the state information and/or event information of the different server system devices at power-down are synchronously stored in the second user flash memory in the programmable logic device. If the information read from the first user flash memory is abnormal, the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down are read from the second user flash memory instead, which further improves the reliability of server system device log records.
In the technical solution of the present invention, the sending identifier identifies whether the stored records of the first user flash memory and/or the second user flash memory have been sent by the programmable logic device, so that the CPLD can easily determine whether the stored records have been sent (or read by the BMC), which improves the convenience of server system device log recording.
In the technical solution of the present invention, the stored records in the first user flash memory or the second user flash memory are automatically cleared once that user flash memory is full, which ensures the continuity of server system device log recording and avoids the shortened user flash memory service life that frequent erasure would cause.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.

Claims (8)

1. A server system device logging method, comprising:
acquiring time information when power is lost and state information and/or event information of different server system devices when power is lost;
storing time information during power failure and state information and/or event information of different server system devices during power failure into a first user flash memory in the programmable logic device; synchronously storing time information during power failure and state information and/or event information of different server system devices during power failure into a second user flash memory in the programmable logic device;
after the baseboard management controller is powered on again, in response to a read command of the baseboard management controller, reading the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down stored in the first user flash memory; if the information read from the first user flash memory is abnormal, reading the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down stored in the second user flash memory; and sending the stored time information at power-down and the state information and/or event information of the different server system devices at power-down to the baseboard management controller.
2. The server system device logging method according to claim 1, wherein storing the time information at power-down and the state information and/or event information of different server system devices at power-down into the first user flash memory in the programmable logic device specifically comprises:
the user flash memory controller sequentially storing the time information of different power-down events and the corresponding state information and/or event information of the different server system devices into a plurality of row records of the first user flash memory in the programmable logic device, each row recording the time information of one power-down event and the corresponding state information and/or event information of the different server system devices.
3. The server system device logging method according to claim 2, wherein the state information includes operating state information, alarm information, and fault information, and the event information includes fan rotation speed information and temperature information.
4. The server system device logging method according to claim 1, further comprising:
setting a sending identifier for identifying whether the stored records of the first user flash memory and/or the second user flash memory have been sent by the programmable logic device.
5. The server system device logging method according to claim 1, further comprising:
setting a sending identifier for identifying whether the stored records of the first user flash memory and/or the second user flash memory have been sent by the programmable logic device.
6. The server system device logging method according to any one of claims 1 to 5, wherein the server system devices include, but are not limited to, a server power supply, a server fan, a hard disk, a central processing unit, a memory, and a baseboard management controller.
7. A server system device logging apparatus, comprising:
a power-down control unit, configured to acquire time information at power-down and state information and/or event information of different server system devices at power-down;
a first storage unit, configured to store the time information at power-down and the state information and/or event information of the different server system devices at power-down into a first user flash memory in a programmable logic device, and to synchronously store the time information at power-down and the state information and/or event information of the different server system devices at power-down into a second user flash memory in the programmable logic device;
a first sending unit, configured to: after the baseboard management controller is powered on again, in response to a read command of the baseboard management controller, read the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down stored in the first user flash memory; if the information read from the first user flash memory is abnormal, read the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down stored in the second user flash memory; and send the stored time information at power-down and the state information and/or event information of the different server system devices at power-down to the baseboard management controller.
8. A server system device logging system, comprising a programmable logic device and a baseboard management controller;
the programmable logic device is configured to: acquire time information at power-down and state information and/or event information of different server system devices at power-down; store the time information at power-down and the state information and/or event information of the different server system devices at power-down into a first user flash memory in the programmable logic device; synchronously store the time information at power-down and the state information and/or event information of the different server system devices at power-down into a second user flash memory in the programmable logic device; after the baseboard management controller is powered on again, read the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down stored in the first user flash memory; if the information read from the first user flash memory is abnormal, read the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down stored in the second user flash memory; and send the stored time information at power-down and the state information and/or event information of the different server system devices at power-down to the baseboard management controller;
the baseboard management controller is configured to send a read command to the programmable logic device when power is restored, and to read the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down stored in the first user flash memory, or the time information of the last power-down and the state information and/or event information of the different server system devices at the last power-down stored in the second user flash memory.
CN202110863417.0A 2021-07-29 2021-07-29 Method, device and system for recording logs of server system devices Active CN113742166B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110863417.0A CN113742166B (en) 2021-07-29 2021-07-29 Method, device and system for recording logs of server system devices

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110863417.0A CN113742166B (en) 2021-07-29 2021-07-29 Method, device and system for recording logs of server system devices

Publications (2)

Publication Number Publication Date
CN113742166A CN113742166A (en) 2021-12-03
CN113742166B true CN113742166B (en) 2023-07-18

Family

ID=78729402

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110863417.0A Active CN113742166B (en) 2021-07-29 2021-07-29 Method, device and system for recording logs of server system devices

Country Status (1)

Country Link
CN (1) CN113742166B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113868101B (en) * 2021-12-06 2022-03-08 苏州浪潮智能科技有限公司 Server time sequence detection method, device and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103902435A (en) * 2012-12-26 2014-07-02 鸿富锦精密工业(深圳)有限公司 System and method for recording log events in server testing
CN105760252A (en) * 2014-12-19 2016-07-13 中兴通讯股份有限公司 Method and device for achieving transaction log image backup
CN106557438A (en) * 2015-09-30 2017-04-05 中兴通讯股份有限公司 A kind of method of power down protection, device and electronic equipment
CN112256499A (en) * 2020-08-28 2021-01-22 苏州浪潮智能科技有限公司 Power failure monitoring method and device, electronic equipment and computer readable storage medium
CN112667462A (en) * 2020-12-15 2021-04-16 苏州浪潮智能科技有限公司 System, method and medium for monitoring double flash memory operation of server

Also Published As

Publication number Publication date
CN113742166A (en) 2021-12-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant