CN107451035A - Error state data providing method for computer device - Google Patents

Error state data providing method for computer device

Info

Publication number
CN107451035A
CN107451035A CN201610378723.4A CN201610378723A CN107451035A CN 107451035 A CN107451035 A CN 107451035A CN 201610378723 A CN201610378723 A CN 201610378723A CN 107451035 A CN107451035 A CN 107451035A
Authority
CN
China
Prior art keywords
state data
error state
error
control system
management control
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610378723.4A
Other languages
Chinese (zh)
Other versions
CN107451035B (en)
Inventor
郭明义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mitac Computer Shunde Ltd
Shencloud Technology Co Ltd
Original Assignee
Mitac Computer Shunde Ltd
Shencloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-05-31
Filing date
2016-05-31
Publication date
2017-12-08
Application filed by Mitac Computer Shunde Ltd and Shencloud Technology Co Ltd
Priority to CN201610378723.4A
Publication of CN107451035A
Application granted
Publication of CN107451035B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

An error state data providing method is implemented by a baseboard management controller (BMC) system included in a computer device, the computer device further including a central processing unit (CPU) electrically connected to the BMC system. The method comprises the steps of: (A) reading and storing error state data stored by the CPU; (B) determining whether the error state data contains at least one of a plurality of specific errors; (C) upon determining that the error state data does not contain the at least one specific error, continuing to execute step (A); and (D) upon determining that the error state data contains the at least one specific error, after receiving a data request from a user end, transmitting the error state data previously stored in step (A) to the user end.

Description

Error state data providing method for computer device
Technical field
The invention relates to error state data of a computer device, and more particularly to an error state data providing method for a computer device.
Background
A computer device used as a server today generally includes a baseboard management controller (BMC) system, which obtains the error state data of the computer device to help an administrator manage the computer device.
When the BMC system receives an error notification from the central processing unit (CPU), such as a fatal error (CATERR) notification, the BMC system reads the error state data stored in an internal register of the CPU of the computer device. In practice, however, when an abnormal condition such as a fatal error (CATERR) occurs, the computer device restarts and thereby clears the error state data that the CPU held for that abnormal condition. Notably, a restart of the computer device does not affect the operation of the BMC system. Consequently, if the BMC system receives the error notification from the CPU only after the computer device has restarted, the error state data that the BMC system reads and stores from the internal register of the CPU, such as machine check architecture error status (MCA error status) data, does not correspond to the moment the abnormal condition occurred but to the state of the computer device after the restart. The administrator of the computer device may therefore be unable to correctly analyze the cause of the abnormality from the error state data provided by the BMC system.
Summary of the invention
Therefore, an object of the invention is to provide an error state data providing method.
To achieve the above object, the error state data providing method of the invention is implemented by a baseboard management controller (BMC) system included in a computer device, the computer device further including a central processing unit (CPU) electrically connected to the BMC system. The method comprises the following steps:
(A) reading and storing error state data stored by the CPU;
(B) determining whether the error state data contains at least one of a plurality of specific errors;
(C) upon determining that the error state data does not contain the at least one specific error, continuing to execute step (A); and
(D) upon determining that the error state data contains the at least one specific error, after receiving a data request from a user end, transmitting the error state data previously stored in step (A) to the user end.
Compared with the prior art, in the error state data providing method of the invention, the baseboard management controller (BMC) system reads and stores the error state data present when the at least one specific error occurs in the computer device and, when the error state data contains the at least one specific error, transmits that error state data to the user end after receiving the data request from the user end. This prevents the error state data at the time of the at least one specific error from being cleared before the BMC system has stored it, so that an administrator can analyze the cause of the error in the computer device from the error state data received at the user end.
Brief description of the drawings
Other features and effects of the invention will be clearly presented in the embodiments described with reference to the drawings, in which:
Fig. 1 is a block diagram illustrating that a baseboard management controller (BMC) system, included in a computer device that performs an embodiment of the error state data providing method of the invention, is electrically connected to a central processing unit (CPU) included in the computer device and is connected to a user end via a communication network.
Fig. 2 is a flow chart illustrating the embodiment of the error state data providing method of the invention.
Fig. 3 is a flow chart illustrating another embodiment of the error state data providing method of the invention.
Detailed description
Referring to Fig. 1, an embodiment of the error state data providing method of the invention is implemented by a baseboard management controller (BMC) system 11 included in a computer device 1. The BMC system 11 is connected to a user end 2 via a communication network 100. The computer device 1 further includes a central processing unit (CPU) 12 electrically connected to the BMC system 11. In this embodiment, the computer device 1 is, for example, a server; the BMC system 11 includes, for example, a non-volatile memory module 111, a communication module 112 for connecting to the communication network 100, and a processing module 113 electrically connected to the non-volatile memory module 111 and the communication module 112; and the CPU 12 is, for example, a processor produced by Intel Corporation.
Referring to Fig. 1 and Fig. 2, the embodiment of the error state data providing method of the invention comprises the following steps.
In step 31, the processing module 113 of the BMC system 11 reads, via a Platform Environment Control Interface (PECI), the error state data stored in an internal register (not shown) of the CPU 12 and stores it, the error state data being related to the computer device 1. In this embodiment, the error state data includes machine check architecture error status data. The processing module 113 of the BMC system 11 stores the error state data by updating the previous error state data, previously stored in the non-volatile memory module 111, to the error state data currently read.
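As a rough illustration of step 31, the following Python sketch models the read-and-store behaviour under stated assumptions. `read_registers` is a hypothetical callable standing in for the PECI read of the CPU 12 (no real PECI API is used), and `NonVolatileStore` is a hypothetical in-memory stand-in for the non-volatile memory module 111. The sketch only shows that each new reading replaces the previously stored record.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class NonVolatileStore:
    """Hypothetical stand-in for non-volatile memory module 111 (one record)."""
    record: Optional[dict] = None

    def update(self, new_record: dict) -> None:
        # The previous error state data is replaced by the current reading.
        self.record = dict(new_record)


def step_31(read_registers: Callable[[], dict], store: NonVolatileStore) -> dict:
    """Read the error state data (simulated PECI read) and store it."""
    data = read_registers()
    store.update(data)
    return data
```

Keeping a single record mirrors the described update of previous data to current data; a real controller might keep more history, which the text does not state.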
In step 32, the processing module 113 of the BMC system 11 determines whether the error state data it has stored (that is, the error state data stored in the non-volatile memory module 111) contains at least one of a plurality of specific errors. The specific errors are errors that match at least one of the following error types: a fatal error (CATERR), an uncorrectable peripheral component interconnect error (Uncorrectable PCI error), a fatal peripheral component interconnect error (Fatal PCI error), a system management interrupt timeout (SMI timeout), a parity error (PERR), and a system error (SERR). When it is determined that the error state data contains the at least one specific error, the flow proceeds to step 33. Otherwise, the flow proceeds to step 34.
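The check in step 32 can be sketched as a set membership test. This is only an illustration: the string labels and the `errors` field are assumptions made for the example, since the text does not describe how the error types are encoded in the error state data.

```python
# Hypothetical labels for the six error types named in step 32.
SPECIFIC_ERRORS = frozenset(
    {"CATERR", "UNCORRECTABLE_PCI", "FATAL_PCI", "SMI_TIMEOUT", "PERR", "SERR"}
)


def contains_specific_error(error_state: dict) -> bool:
    """Return True if the stored data lists at least one specific error."""
    reported = set(error_state.get("errors", ()))
    return bool(reported & SPECIFIC_ERRORS)


def next_step_after_32(error_state: dict) -> int:
    """Step 33 when a specific error is present, otherwise step 34."""
    return 33 if contains_specific_error(error_state) else 34
```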
In step 33, after the processing module 113 of the BMC system 11 receives, via the communication module 112, a data request from the user end 2, the processing module 113 transmits the error state data previously stored in step 31 to the user end 2 via the communication module 112.
In other embodiments of the invention, the error state data providing method further includes a step 30 (see Fig. 3) before step 31, in which the processing module 113 of the BMC system 11 determines whether the computer device 1 has restarted within a reference period before the current time. When it is determined that the computer device 1 has restarted within the reference period before the current time, the flow proceeds to step 33. Otherwise, the flow proceeds to step 31.
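The decision made in step 30 of the Fig. 3 embodiment can be sketched as a simple time comparison. The function names, the use of seconds, and the convention that `None` means no restart has been observed are all assumptions for illustration; the text does not say how the restart is detected or how long the reference period is.

```python
from typing import Optional


def restarted_within_reference_period(
    last_restart_s: Optional[float], now_s: float, reference_period_s: float
) -> bool:
    """True if the last observed restart falls inside the reference period."""
    if last_restart_s is None:  # assumption: no restart has been observed
        return False
    return 0.0 <= now_s - last_restart_s <= reference_period_s


def step_30(
    last_restart_s: Optional[float], now_s: float, reference_period_s: float
) -> int:
    """Return the next step number: 33 after a recent restart, else 31."""
    if restarted_within_reference_period(last_restart_s, now_s, reference_period_s):
        return 33  # keep the stored data and wait for a data request
    return 31      # no recent restart: read and store fresh data
```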
In step 34, after the processing module 113 of the BMC system 11 has counted a preset period, it continues with step 31.
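Taken together, steps 31 to 34 of Fig. 2 form a polling loop. The sketch below is one possible reading of that flow, not the patented implementation: the readings are supplied as a finite iterable so the example terminates, and `has_specific_error`, `wait_for_request`, `send`, and `sleep` are hypothetical callables injected by the caller.

```python
from typing import Callable, Iterable


def run_fig2_flow(
    readings: Iterable[dict],
    has_specific_error: Callable[[dict], bool],
    wait_for_request: Callable[[], None],
    send: Callable[[dict], None],
    sleep: Callable[[float], None],
    period_s: float = 0.05,
) -> None:
    """Walk through steps 31 to 34 once per supplied reading."""
    for reading in readings:            # step 31: read and store
        stored = reading
        if has_specific_error(stored):  # step 32: specific error present?
            wait_for_request()          # step 33: no polling until a request
            send(stored)                #          arrives, then transmit
        else:
            sleep(period_s)             # step 34: count the preset period
```

Injecting `sleep` and `wait_for_request` keeps the sketch free of real timers and network code, so the control flow can be checked with simple fakes.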
It is worth noting that, in practice, when an existing BMC system 11 detects that the computer device 1 is operating abnormally, it stores a System Event Log (SEL) related to the computer device 1 to help the administrator understand why the computer device 1 is operating abnormally. However, besides the System Event Log, the administrator must also refer to information such as machine check architecture error status data to fully understand the cause of the abnormal operation. When the error state data contains the at least one specific error, the computer device 1 cannot operate normally, so the System Event Log will contain some abnormality information. When the administrator learns from the System Event Log that the computer device 1 cannot operate normally, the administrator can use the user end 2 to send the BMC system 11 the data request for the error state data stored in the non-volatile memory module 111 inside the BMC system 11. In response to the data request, the BMC system 11 returns the error state data stored in its internal non-volatile memory module 111, and the administrator thereby obtains the error state data that the BMC system 11 stored when the at least one specific error occurred.
Only after the administrator has sent the data request from the user end 2 and obtained the error state data that the BMC system 11 stored when the at least one specific error occurred does the BMC system 11 continue with steps 31 to 32 (see Fig. 2) or steps 30 to 32 (see Fig. 3). In other words, until the BMC system 11 receives the data request for the error state data stored in its internal non-volatile memory module 111, the BMC system 11 does not periodically read any error state data from the register inside the CPU 12, nor does it store the error state data held in that register to the non-volatile memory module 111, so that the error state data stored in the non-volatile memory module 111 when the at least one specific error occurred is not overwritten or corrupted during storage.
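The protection described above behaves like a latch on the stored record. The following sketch is an assumption-laden illustration: `LatchedErrorStore` and its methods are hypothetical names, and a skipped poll is modelled as a rejected update, whereas the text says the controller does not read the register at all while waiting.

```python
from typing import Optional


class LatchedErrorStore:
    """Hypothetical model: hold a specific-error record until it is requested."""

    def __init__(self) -> None:
        self.record: Optional[dict] = None
        self.latched = False

    def poll_update(self, reading: dict, is_specific: bool) -> bool:
        """Store a new reading unless a specific-error record is pending."""
        if self.latched:
            return False           # polling is suspended; record is preserved
        self.record = dict(reading)
        self.latched = is_specific
        return True

    def serve_request(self) -> Optional[dict]:
        """Return the stored record to the user end and resume polling."""
        data = self.record
        self.latched = False
        return data
```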
When the error state data does not contain the at least one specific error, the BMC system 11 counts the preset period, for example 50 ms, and then continues with steps 31 to 32 (see Fig. 2) or steps 30 to 32 (see Fig. 3). Because a restart of the computer device 1 caused by the at least one specific error takes more than 50 ms, clearing the error state data containing the at least one specific error in response to that restart also takes more than 50 ms. By automatically reading the error state data stored in the internal register of the CPU 12 every 50 ms, the BMC system 11 therefore prevents the error state data at the time of the at least one specific error from being cleared before the BMC system 11 has stored it. Furthermore, the BMC system 11 continues with steps 31 to 32 (see Fig. 2) or steps 30 to 32 (see Fig. 3) only after the error state data it stored when the at least one specific error occurred has been transmitted to the user end 2; that is, before that error state data has been transmitted to the user end 2, the processing module 113 of the BMC system 11 does not execute steps 31 to 32 (see Fig. 2) or steps 30 to 32 (see Fig. 3). This prevents the error state data stored in the non-volatile memory module 111 when the at least one specific error occurred from being overwritten or corrupted by error state data that the BMC system 11 reads subsequently.
In other embodiments of the invention, in addition to obtaining the error state data by periodically reading the register inside the CPU 12, the BMC system 11 can also read the error state data stored in that register when it receives an error notification from the CPU 12, such as a fatal error (CATERR) notification.
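The timing argument above can be checked with a small calculation. The sketch assumes polls occur at exact multiples of the period, that a read takes no time, and that times are whole milliseconds; none of these details are stated in the text, and the function names are hypothetical.

```python
def first_poll_at_or_after(t_ms: int, period_ms: int) -> int:
    """Time of the first periodic poll that happens at or after t_ms."""
    return -(-t_ms // period_ms) * period_ms  # ceiling to a multiple of period


def is_captured(error_ms: int, clear_delay_ms: int, period_ms: int = 50) -> bool:
    """True if a poll reads the data before the restart clears it.

    The data appears at error_ms and is cleared at error_ms + clear_delay_ms.
    """
    poll_ms = first_poll_at_or_after(error_ms, period_ms)
    return poll_ms < error_ms + clear_delay_ms
```

Under these assumptions the wait for the next poll is at most 49 ms, so any clearing delay above 50 ms leaves at least one poll inside the window, while a shorter delay (for example 49 ms for an error at 1 ms) can be missed.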
In summary, in the error state data providing method of the invention, the BMC system 11 periodically reads and stores the error state data of the register inside the CPU 12. When the error state data contains the at least one specific error, the BMC system 11 temporarily stops periodically reading and storing any error state data from the register inside the CPU 12 and, after receiving the data request from the user end 2, transmits the error state data at the time of the at least one specific error to the user end 2. This ensures that the administrator can obtain the error state data at the time of the at least one specific error, so the object of the invention is indeed achieved.
The embodiments and drawings described above are merely preferred embodiments of the invention and do not limit the scope of the invention; equivalent changes and modifications made in accordance with the claims of the invention all fall within the scope covered by the patent.

Claims (7)

1. An error state data providing method, implemented by a baseboard management controller system included in a computer device, the computer device further including a central processing unit electrically connected to the baseboard management controller system, characterized in that the error state data providing method comprises the following steps:
(A) reading and storing error state data stored by the central processing unit;
(B) determining whether the error state data contains at least one of a plurality of specific errors;
(C) upon determining that the error state data does not contain the at least one specific error, continuing to execute step (A); and
(D) upon determining that the error state data contains the at least one specific error, after receiving a data request from a user end, transmitting the error state data previously stored in step (A) to the user end.
2. The error state data providing method according to claim 1, characterized in that, in step (C), upon determining that the error state data does not contain the at least one specific error, the baseboard management controller system counts a preset period and then repeats step (A) to step (B) once.
3. The error state data providing method according to claim 1, characterized in that the baseboard management controller system includes a non-volatile memory module for storing the error state data, wherein, in step (A), the baseboard management controller system stores the error state data by updating previous error state data, previously stored in the non-volatile memory module, to the error state data read in step (A).
4. The error state data providing method according to claim 1, characterized by further comprising, after step (D), a step (E) of repeating step (A) to step (B).
5. The error state data providing method according to claim 1, characterized in that:
in step (A), the error state data includes machine check architecture error status data; and
in step (B), the specific errors include at least one of the following error types: a fatal error, an uncorrectable peripheral component interconnect error, a fatal peripheral component interconnect error, a system management interrupt timeout, a parity error, and a system error.
6. The error state data providing method according to claim 1, characterized in that, in step (A), the baseboard management controller system reads the error state data of the central processing unit via a platform environment control interface.
7. The error state data providing method according to claim 1, characterized by further comprising, before step (A), the following steps:
(F) determining, by the baseboard management controller system, whether the computer device has restarted within a reference period before the current time;
(G) upon determining that the computer device has restarted within the reference period before the current time, after receiving another data request from the user end, transmitting the previous error state data previously stored in step (A) to the user end; and
(H) upon determining that the computer device has not restarted within the reference period before the current time, executing step (A).
CN201610378723.4A 2016-05-31 2016-05-31 Error state data providing method for computer device Active CN107451035B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610378723.4A CN107451035B (en) 2016-05-31 2016-05-31 Error state data providing method for computer device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610378723.4A CN107451035B (en) 2016-05-31 2016-05-31 Error state data providing method for computer device

Publications (2)

Publication Number Publication Date
CN107451035A true CN107451035A (en) 2017-12-08
CN107451035B CN107451035B (en) 2020-11-10

Family

ID=60485919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610378723.4A Active CN107451035B (en) 2016-05-31 2016-05-31 Error state data providing method for computer device

Country Status (1)

Country Link
CN (1) CN107451035B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201351133A (en) * 2012-06-13 2013-12-16 Hon Hai Prec Ind Co Ltd Method and system for reading system event
TW201423390A (en) * 2012-12-06 2014-06-16 Inventec Corp Computer system and operating method thereof
CN104424068A (en) * 2013-08-29 2015-03-18 鸿富锦精密工业(深圳)有限公司 System and method for pressure testing of firmware update
TWI512490B (en) * 2014-10-27 2015-12-11 Quanta Comp Inc System for retrieving console messages and method thereof and non-transitory computer-readable medium

Also Published As

Publication number Publication date
CN107451035B (en) 2020-11-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant