JP2008084080A

JP2008084080A - Failure information storage system, service processor, failure information storage method, and program

Info

Publication number: JP2008084080A
Application number: JP2006264436A
Authority: JP
Inventors: Osamu Shimizu; 修清水
Original assignee: NEC Computertechno Ltd
Current assignee: NEC Computertechno Ltd
Priority date: 2006-09-28
Filing date: 2006-09-28
Publication date: 2008-04-10

Abstract

PROBLEM TO BE SOLVED: To store failure information, when a failure occurs, in a nonvolatile memory on the suspect board or on a suspect component on that board, and thereby provide information useful for failure analysis at the time of repair.

SOLUTION: In a failure information storage system for a computer system that includes a plurality of boards and a service processor, each board includes a board failure log information storage unit for storing failure logging data and a nonvolatile memory for storing failure information. The service processor includes a failure type identification processing unit that identifies the failure type from the failure content, a failure suspect identification processing unit that identifies the suspect board from the failure type, and a failure information suspect storage processing unit that stores the failure logging data corresponding to the suspect board in the nonvolatile memory within the suspect board.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a failure information storage system, a service processor, a failure information storage method, and a program, and in particular to such a system, service processor, method, and program for a computer system composed of a plurality of hardware boards.

Japanese Patent Laid-Open No. 7-098668 discloses a conventional failure information storage method for a computer system.

In this conventional method, when a failure occurs during operation, the service processor reads, as failure information, the contents of the registers in the logic card that contains the failure location, and stores that failure information in a nonvolatile memory in the same logic card. When the part is later repaired, this provides the information needed to locate the failure within the part.

Japanese Patent Laid-Open No. 7-098668

However, the failure information storage method described in Japanese Patent Laid-Open No. 7-098668 has the following three problems.

The first problem arises when a failure is detected on the receiving logic card but its cause lies in the sending card rather than in the receiving card. The method acquires only the register contents of the receiving logic card that detected the failure, so the failure information in the sending logic card, where the cause lies, is lost, and the stored information is of no use for repair.

The second problem arises when the cause of a failure cannot be narrowed down to a single logic card, so that there are several suspect cards, that is, cards suspected of having failed. The method again acquires only the register contents of the receiving logic card that detected the failure, so the failure information in the other suspect logic cards is lost, and the stored information is of no use for repair.

The third problem arises when components are mounted on a logic card, for example when the logic card is made up of memory modules. The method acquires only the register contents of the logic memory card that detected the failure, so failure information indicating which memory module in the memory card caused the failure is lost, and fine-grained failure information for analyzing the failure location cannot be provided at repair time.

An object of the present invention is to provide a failure information storage system, a failure information storage device, a failure information storage method, and a program that, when a failure occurs, store failure information in the nonvolatile memory not of the card that detected the failure but of the card that caused it, or of every suspect card that may have caused it, and thereby provide information useful for identifying the failed part when the card is repaired.

A first failure information storage system of the present invention comprises a plurality of boards constituting a computer system and a service processor that handles failures occurring in the computer system.
Each of the plurality of boards includes a board failure log information storage unit that stores failure logging data, that is, the data needed for failure analysis, and a nonvolatile memory for storing failure information.
The service processor includes a failure content read processing unit that reads failure content from a board, a failure type identification processing unit that identifies the failure type of the failure content, a failure suspect identification processing unit that identifies, based on the failure type, a board suspected of having failed as a suspect board, and a failure information suspect storage processing unit that reads the failure logging data corresponding to the suspect board from the board failure log information storage unit and stores it in the nonvolatile memory in the suspect board.

A first service processor of the present invention includes a failure content read processing unit that reads failure content from the failed board among a plurality of boards constituting a computer system, a failure type identification processing unit that identifies the failure type of the failure content, a failure suspect identification processing unit that identifies, based on the failure type, a board suspected of having failed as a suspect board, and a failure information suspect storage processing unit that reads the failure logging data corresponding to the suspect board from the board failure log information storage unit and stores it in the nonvolatile memory in the suspect board.

A second service processor of the present invention is the first service processor in which the failure suspect identification processing unit includes a failure suspect table that associates a failure type identification number, which identifies the failure type, with the suspect board.

A first failure information storage method of the present invention includes a failure content read processing step of reading failure content from the failed board among a plurality of boards constituting a computer system, a failure type identification processing step of identifying the failure type of the failure content, a failure suspect identification processing step of identifying, based on the failure type, a board suspected of having failed as a suspect board, and a failure information suspect storage processing step of reading the failure logging data corresponding to the suspect board from the board failure log information storage unit and storing it in the nonvolatile memory in the suspect board.

A second failure information storage method of the present invention is the first failure information storage method in which the failure suspect identification processing step includes a failure suspect table that associates a failure type identification number, which identifies the failure type, with the suspect board.

A first program of the present invention causes a computer to perform a failure content read process of reading failure content from the failed board among a plurality of boards constituting a computer system, a failure type identification process of identifying the failure type of the failure content, a failure suspect identification process of identifying, based on the failure type, a board suspected of having failed as a suspect board, and a failure information suspect storage process of reading the failure logging data corresponding to the suspect board from the board failure log information storage unit and storing it in the nonvolatile memory in the suspect board.

A second program of the present invention is the first program in which the failure suspect identification process includes a failure suspect table that associates a failure type identification number, which identifies the failure type, with the suspect board.

A second failure information storage system of the present invention comprises a computer system having a plurality of boards, including boards on which components are mounted, and a service processor that handles failures occurring in the computer system.
Among the plurality of boards, each board on which a component is mounted includes a board failure log information storage unit that stores failure logging data, that is, the data needed for failure analysis, a first nonvolatile memory for storing failure information, and a second nonvolatile memory, located in the component, that stores failure information relating to the component.
The service processor includes a failure content read processing unit that reads failure content from a board, a failure type identification processing unit that identifies the failure type of the failure content, a failure suspect identification processing unit that identifies, based on the failure type, a component suspected of having failed as a suspect component, and a failure information suspect storage processing unit that reads the failure logging data corresponding to the suspect component from the board failure log information storage unit and stores it in the first nonvolatile memory in the suspect board and in the second nonvolatile memory in the suspect component.
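The two-level storage described for the second system can be pictured with a minimal Python sketch. It is an illustration only, not the patented implementation: the class names, the `dimm0`/`dimm1` component names, and the choice to key the board's logging data by component name are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    # Second nonvolatile memory, located inside the component (hypothetical model).
    nvm: list = field(default_factory=list)


@dataclass
class Board:
    name: str
    # Board failure log information storage unit; keyed by component name here,
    # which is an assumption of this sketch.
    failure_log: dict = field(default_factory=dict)
    # First nonvolatile memory, located on the board.
    nvm: list = field(default_factory=list)
    components: dict = field(default_factory=dict)


def store_for_suspect_component(board: Board, component_name: str) -> dict:
    """Copy the logging data for a component suspected of having failed into
    both the board's nonvolatile memory and the component's own."""
    record = {"component": component_name,
              "log": board.failure_log.get(component_name)}
    board.nvm.append(record)
    board.components[component_name].nvm.append(record)
    return record
```

After such a call, the record travels with the board and also with the individual component, so either part can be analyzed on its own once removed from the system.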

A third service processor of the present invention is used in a computer system having a plurality of boards, including boards on which components are mounted.
It includes a failure content read processing unit that reads failure content from the failed board, a failure type identification processing unit that identifies the failure type of the failure content, a failure suspect identification processing unit that identifies, based on the failure type, a component suspected of having failed as a suspect component, and a failure information suspect storage processing unit that reads the failure logging data corresponding to the suspect component from the board failure log information storage unit and stores it in the nonvolatile memory in the suspect board and in the nonvolatile memory in the suspect component.

A fourth service processor of the present invention is the third service processor in which the failure suspect identification processing unit includes a failure suspect table that associates a failure type identification number, which identifies the failure type, with the suspect board.

A third failure information storage method of the present invention is used in a computer system having a plurality of boards, including boards on which components are mounted.
It includes a failure content read processing step of reading failure content from the failed board, a failure type identification processing step of identifying the failure type of the failure content, a failure suspect identification processing step of identifying, based on the failure type, a component suspected of having failed as a suspect component, and a failure information suspect storage processing step of reading the failure logging data corresponding to the suspect component from the board failure log information storage unit and storing it in the nonvolatile memory in the suspect board and in the nonvolatile memory in the suspect component.

A fourth failure information storage method of the present invention is the third failure information storage method in which the failure suspect identification processing step includes a failure suspect table that associates a failure type identification number, which identifies the failure type, with the suspect board.

A third program of the present invention is used in a computer system having a plurality of boards, including boards on which components are mounted.
It causes a computer to perform a failure content read process of reading failure content from the failed board, a failure type identification process of identifying the failure type of the failure content, a failure suspect identification process of identifying, based on the failure type, a component suspected of having failed as a suspect component, and a failure information suspect storage process of reading the failure logging data corresponding to the suspect component from the board failure log information storage unit and storing it in the nonvolatile memory in the suspect board and in the nonvolatile memory in the suspect component.

A fourth program of the present invention is the third program in which the failure suspect identification process includes a failure suspect table that associates a failure type identification number, which identifies the failure type, with the suspect board.

The present invention has the effect that the failure information stored in every suspect card or suspect component that may have caused a failure can be provided as information useful for locating the failure when the card is repaired.

This is because, when a failure occurs, the error register on the card that detected the failure is read, the failure type is identified, all suspect cards that may have caused the failure are identified from the failure suspect table using that failure type, and the failure information corresponding to each suspect card is stored in the nonvolatile memory in that card, so that information useful for identifying the failed part can be presented when the card is repaired.

Next, the best mode for carrying out the present invention is described in detail with reference to the drawings.

First, a first embodiment of the present invention is described in detail.

FIG. 1 shows the overall configuration of the first embodiment of the failure information storage system of the present invention.

The failure information storage system of the present invention comprises a service processor 100, which performs diagnostic operations such as identifying the failure content reported by the boards constituting the computer system and controlling the storage of failure information on the boards, and a plurality of boards constituting the computer system, namely board 200, board 300, and board 500. These are connected to one another by a bus.

The service processor 100 operates under a diagnostic program and comprises the following units: a failure detection reception processing unit 102, which receives a failure detection notification indicating that a failure has been detected in a board; a failure content read processing unit 103, which, after the notification is received, reads the contents of the error register that indicates every failure in the board, such as a parity error; a failure type identification processing unit 104, which identifies the failure type from the contents of the error register; a failure suspect table 105, which associates a failure type identification number identifying the failure type with the suspect boards for that failure; a failure suspect identification processing unit 106, which looks up the failure suspect table 105 by failure type identification number to identify the suspect boards, that is, the boards suspected of having failed; a failure log suspect selection processing unit 108, which selects, from the board failure log information held in all boards (the information for analyzing the cause of a failure that each board stores internally as logging data), only the board failure log information of the suspect boards; a system failure log information storage unit 107, which stores this failure log information in the service processor 100; a failure information suspect storage processing unit 109, which stores this failure log information in the failure information storage unit, a nonvolatile memory in the suspect board; a diagnostic mode control processing unit 110, which controls the diagnostic mode bit that enables diagnostic operation; a diagnostic clock control processing unit 111, which controls stopping and starting of the diagnostic clock that drives diagnostic operation; and a diagnostic control processing unit 101, which consists of a processor, main memory, the diagnostic program, and so on, and controls the diagnostic operation as a whole.

Board 200 comprises the following units: a failure detection notification processing unit 202, which notifies the service processor 100 that a failure has been detected when a failure occurs in the system; an error register 203, which indicates the content of every failure in the board, such as a parity error, a memory ECC error, or an interface timeout error; a board failure log information storage unit 204, which stores as logging data the information needed to track down the cause of a failure in the board, for example arithmetic registers including parity bits, ECC-protected memory, configuration control registers, and control flip-flops; a failure information storage unit 205, which is a nonvolatile memory whose data survives removal of the board from the system and which receives and stores the suspect's failure information sent from the service processor 100; a diagnostic mode control processing unit 208, which controls the diagnostic mode bit that enables diagnostic operation; a diagnostic clock control processing unit 209, which controls stopping and starting of the diagnostic clock that drives diagnostic operation; and a diagnostic control processing unit 201, which controls the diagnostic operation.

Boards 300 and 400 have the same configuration as board 200.

FIG. 2 shows an example of the failure suspect table 105.

For each failure type identification number 501, the table lists the suspect boards that may have caused a failure of that type as suspect A 502, suspect B 503, and so on.
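As a rough illustration, a table of this kind can be modeled as a mapping from failure type identification number to a list of suspect boards. The sketch below is hypothetical: the dictionary layout and the `suspect_boards` helper are assumptions, and the only entries included are the two that the description discusses for FIG. 3 (0002) and FIG. 4 (0003).

```python
# Hypothetical in-memory form of the failure suspect table 105.
# Key: failure type identification number. Value: suspect A, suspect B, ...
FAILURE_SUSPECT_TABLE = {
    "0002": ["board300"],              # case discussed with FIG. 3
    "0003": ["board200", "board300"],  # case discussed with FIG. 4
}


def suspect_boards(failure_type_id: str) -> list:
    """Return the suspect boards for a failure type, or an empty list when the
    table has no entry (behavior for unknown IDs is an assumption)."""
    return list(FAILURE_SUSPECT_TABLE.get(failure_type_id, []))
```

Keeping the suspects as a list per entry is what lets one detected failure point at several boards at once, which is the situation the prior-art method could not handle.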

As one example from FIG. 2, the case in which failure type identification number 0002 is associated with board 300 as the suspect board is described with reference to FIG. 3.

FIG. 3 illustrates how a failure is handled when a circuit spans board 200 and board 300.

Board 300 carries the main body of an N-bit arithmetic processing unit and a parity bit correction processing unit.

Board 200, in turn, carries a duplicated parity error detection processing unit that takes the data and parity bit from board 300 as input and detects parity errors.

In this circuit example, suppose a parity error occurs at board 200 and both halves of the duplicated parity error detection processing unit detect it. The parity error detection processing units in board 200 can then be taken to be working correctly, and the failure is strongly suspected to have occurred in board 300, which contains the N-bit arithmetic processing unit. Although the parity error was detected at board 200, the suspect board is therefore assumed to be board 300.
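The reasoning for the FIG. 3 case can be sketched in a few lines of Python. This is an illustrative model only: the even-parity convention, the function names, and the handling of any case other than "both detectors report an error" are assumptions, since the description covers only that one case.

```python
def parity_bit(data: int, n_bits: int) -> int:
    """Even-parity bit over the low n_bits of data (parity convention assumed)."""
    return bin(data & ((1 << n_bits) - 1)).count("1") % 2


def checker_reports_error(data: int, received_parity: int, n_bits: int) -> bool:
    """One parity error detection unit: recompute the parity and compare."""
    return parity_bit(data, n_bits) != received_parity


def suspect_for_fig3(checker_a_error: bool, checker_b_error: bool):
    """Both duplicated detectors agree on an error: the detectors on board 200
    are taken to be healthy, so the sender, board 300, is the suspect."""
    if checker_a_error and checker_b_error:
        return ["board300"]
    # No error, or the two detectors disagree: not covered by the description.
    return None
```

For example, data `0b1011` has three set bits, so a received parity bit of 0 makes both detectors report an error and the sketch names board 300.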

As another example from FIG. 2, the case in which failure type identification number 0003 is associated with board 200 and board 300 as the suspect boards is described with reference to FIG. 4.

FIG. 4 illustrates how a failure is handled in another case in which a circuit spans board 200 and board 300.

Board 200, for its part, also carries the main body of an N-bit shift processing unit and a parity bit correction processing unit.

In this circuit example, when a parity error is detected at board 200, both board 300 and board 200 contain large circuits (the N-bit arithmetic processing unit and the N-bit shift processing unit, respectively), so both boards are suspected of having failed, and the suspect boards are assumed to be board 200 and board 300.

Next, the operation of the first embodiment of the present invention is described with reference to the processing flowchart of FIG. 7.

As advance preparation for handling a failure, board 200, during normal operation, stores logging data that will be useful for failure analysis in the board failure log information storage unit 204 (step S601). For a failure in an arithmetic circuit, for example, this is the operand registers used in the operation and their parity bits; for a memory failure, the memory data and ECC check bits; and for a failure during program processing, the program's instruction execution history.

The service processor 100, for its part, sets up in advance the failure suspect table 105, which associates a failure type identification number identifying the failure type with the suspect boards for that failure (step S605).

When a failure occurs in the system (step S603) and the failure detection notification processing unit 202 of board 200 detects it, the service processor 100 is notified (step S604).

Under the control of the failure detection reception processing unit 102, the service processor 100 receives the notification from board 200 that a failure has been detected (step S606).

The service processor 100 uses the diagnostic mode control processing unit 110 to turn ON the diagnostic mode bit that enables diagnostic operation, and uses the diagnostic clock control processing unit 111 to start the diagnostic clock that drives diagnostic operation (step S607).

Next, under the control of the failure content read processing unit 103, the service processor 100 reads the failure content from the error register 203, which indicates the content of every failure in board 200, such as a parity error, a memory ECC error, or an interface timeout error (step S608).
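To make the idea of an error register that flags several kinds of failure concrete, here is a small Python sketch that decodes such a register into named error kinds. The bit assignments and the `decode_error_register` helper are assumptions for illustration; the source names the three error kinds but does not give a register layout.

```python
from enum import IntFlag


class ErrorBits(IntFlag):
    # Bit positions are hypothetical; the source does not specify them.
    PARITY = 0x1
    MEMORY_ECC = 0x2
    INTERFACE_TIMEOUT = 0x4


def decode_error_register(value: int) -> list:
    """Return the names of the error kinds whose bits are set in the register."""
    return [bit.name for bit in ErrorBits if value & bit]
```

A register value with the parity and timeout bits set would decode to both names, which is the kind of failure content the later identification step would turn into a failure type identification number.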

Under the control of the failure type identification processing unit 104, the service processor 100 identifies the failure type from the error register contents it has read and obtains a failure type identification number (step S609).

Under the control of the failure suspect identification processing unit 106, the service processor 100 uses this failure type identification number as a key to look up the failure suspect table 105 and identify the suspect boards suspected of having failed (steps S610 and S611).

Next, under the control of the failure log suspect selection processing unit 108, the service processor 100 reads the failure logging data from the board failure log information storage unit in each suspect board (step S612).

Under the control of the failure information suspect storage processing unit 109, the service processor 100 then stores the selected failure logging data in the failure information storage unit, a nonvolatile memory, in the corresponding suspect board (step S614).

The service processor 100 also stores the selected failure logging data in the system failure log information storage unit 107 in the service processor 100 (step S613).
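The sequence from reading the error register to writing the suspect boards' nonvolatile memory (steps S608 to S614) can be summarized in a short Python sketch. It is a simplified model under stated assumptions: the `Board` and `ServiceProcessor` classes, their field names, and the injected `identify` function are hypothetical, and diagnostic mode and clock control (step S607) are left out.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Board:
    name: str
    error_register: int = 0
    failure_log: dict = field(default_factory=dict)     # storage unit 204 (logging data)
    failure_info_nvm: list = field(default_factory=list)  # storage unit 205 (nonvolatile)


@dataclass
class ServiceProcessor:
    boards: dict                      # board name -> Board
    suspect_table: dict               # failure type ID -> list of suspect board names
    identify: Callable[[int], str]    # error register contents -> failure type ID
    system_failure_log: list = field(default_factory=list)  # storage unit 107

    def handle_failure(self, detecting_board: str) -> list:
        contents = self.boards[detecting_board].error_register        # S608
        failure_type_id = self.identify(contents)                     # S609
        suspects = self.suspect_table.get(failure_type_id, [])        # S610, S611
        for name in suspects:
            board = self.boards[name]
            log = dict(board.failure_log)                             # S612
            self.system_failure_log.append((name, failure_type_id, log))  # S613
            board.failure_info_nvm.append((failure_type_id, log))     # S614
        return suspects
```

Note that the write in the last step targets each suspect board, not the board that raised the notification; in the sketch the detecting board's nonvolatile memory is touched only if the table lists it as a suspect.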

As an example, one operation is described with reference to the failure suspect table in FIG. 2 and the failure case in FIG. 3.

In FIG. 3, the board 300 carries the main body of an N-bit arithmetic processing unit and a parity bit correction processing unit, while the board 200 carries duplicated parity error detection processing units that take the data and parity bits from the board 300 as input and detect parity errors.

In this circuit example, when a parity error is detected on the board 200, the parity error processing unit in that board operates normally, so the failure is most likely to have occurred in the board 300, which contains the N-bit arithmetic processing unit. The suspect board is therefore the board 300.

In this case, the failure is detected on the board 200 and the identification number of the failure information is identified as 0002. Because the suspect target is the board 300, the failure logging data of the board 300 is selected and can be stored in the failure information storage unit of the board 300.

Another operation example is described with reference to the failure suspect table in FIG. 2 and the failure case in FIG. 4.

The board 300 in FIG. 4 carries the main body of an N-bit arithmetic processing unit and a parity bit correction processing circuit, while the board 200 carries the main body of an N-bit shift processing unit and a parity bit correction processing circuit.

In this circuit example, when a parity error is detected on the board 200, the failure may have occurred on either the board 300 or the board 200, so both the board 200 and the board 300 are assumed to be suspect boards.

In this case, the failure information is stored in the failure information storage unit 205 on the board 200 and in the failure information storage unit 305 on the board 300.
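The failure suspect table that drives both examples can be sketched as rows with a failure type identification number (501), a suspect target A (502), and an optional suspect target B (503). The rows for 0002 and 0004 follow the cases described in the text; the identification number "0003" for the two-board case of FIG. 4 is a hypothetical value, since the text does not state it, and the class and function names are likewise assumptions.

```python
from typing import NamedTuple, Optional


class SuspectRow(NamedTuple):
    failure_type_id: str               # 501: failure type identification number
    suspect_a: str                     # 502: suspect target A
    suspect_b: Optional[str] = None    # 503: suspect target B (may be absent)


TABLE = [
    SuspectRow("0002", "board 300"),                        # FIG. 3 case
    SuspectRow("0003", "board 200", "board 300"),           # FIG. 4 case; ID is hypothetical
    SuspectRow("0004", "board 400", "memory module 440"),   # second embodiment
]


def suspects_for(failure_type_id, table=TABLE):
    """Return every suspect target listed for the given failure type ID."""
    for row in table:
        if row.failure_type_id == failure_type_id:
            return [t for t in (row.suspect_a, row.suspect_b) if t is not None]
    return []
```

Keeping target B optional lets one table serve both the single-suspect case and the case where the failure spans two boards.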

In this way, when the cause of a failure cannot be narrowed down to a single card and there are multiple suspect cards, that is, cards on which the failure is suspected to have occurred, the failure information of the other suspect boards, and not only that of the receiving board that detected the failure, is stored in nonvolatile memory, which provides information that is useful at repair time.

As described above, the first effect of the first embodiment of the present invention is that, even when the board holding the register that contains the failure content is not the cause of the failure, the failure information can be stored on the suspect board, which provides information useful for analyzing the failure location at repair time.

The second effect of the first embodiment of the present invention is that, even when a failure spans multiple boards, the failure information can be stored on every suspect board, which provides information useful for analyzing the failure location at repair time.

Next, a second embodiment of the present invention is described in detail.

FIG. 5 shows an overall view of the second embodiment of the failure information storage system of the present invention.

The first embodiment handles only boards as failure targets. The failure information storage system of the second embodiment differs in that it also handles components mounted on a board as failure targets.

The board 400 differs from the board 200 in that, in addition to the in-board failure information storage unit 405 that both have in common, failure information storage units 411, 421, 431, and 441 are added, one in each of the four components on the board (hereinafter, memory modules).

The service processor 100 differs in that suspect components are added to the failure suspect table 105 as suspect targets alongside suspect boards, and in that the failure information suspect storage processing unit 109 has a function of storing failure logging data in the failure information storage unit of a suspect component.

The other configurations and functions are the same as in the first embodiment.

FIG. 5 shows the case in which the board is a memory board and the components are memory modules.

Memory modules 410, 420, 430, and 440 are mounted on the memory board in FIG. 5. The failure information storage unit 411 is mounted on the memory module 410, the failure information storage unit 421 on the memory module 420, the failure information storage unit 431 on the memory module 430, and the failure information storage unit 441 on the memory module 440.

Here, as one example from FIG. 2, the case in which the failure type identification number 0004 is associated with the board 400 and the memory module 440 as suspect targets is described with reference to FIG. 6.

FIG. 6 shows the failure correspondence for the case in which the board 400, a memory board, carries four memory modules 410, 420, 430, and 440 as its components.

This case assumes that, of the four memory modules mounted on the board 400, a memory failure has occurred only in the memory module 440.

Next, the operation of the second embodiment of the present invention is described with reference to FIGS. 2 and 5.

In the example of FIG. 2, the identification number of the failure information is 0004, and the suspect targets are assumed to be the board 400, which is a memory board, and the memory module 440 among the four memory modules.

In this case, under the control of the failure information suspect storage processing unit 109, the failure log in the board failure log information storage unit 404 of the board 400 is stored in the failure information storage unit 405 in the board 400 and in the failure information storage unit 441 on the memory module 440.
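The second embodiment's store step can be sketched as follows. The class names, the `store_to_suspects` function, and the use of Python lists in place of nonvolatile memory are assumptions for illustration; only the reference numerals in the comments come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryModule:
    name: str
    nvm: list = field(default_factory=list)      # e.g. failure information storage unit 441


@dataclass
class MemoryBoard:
    name: str
    failure_log: str                              # board failure log information storage unit 404
    nvm: list = field(default_factory=list)       # failure information storage unit 405
    modules: dict = field(default_factory=dict)   # mounted components, keyed by name


def store_to_suspects(board, suspect_modules):
    """Copy the board's failure log to the board NVM and to each suspect module's NVM."""
    log = board.failure_log
    board.nvm.append(log)
    for name in suspect_modules:
        board.modules[name].nvm.append(log)


# Usage: only memory module 440 is suspect, so modules 410-430 stay untouched.
board400 = MemoryBoard(
    "board 400",
    failure_log="ECC error log",
    modules={n: MemoryModule(n) for n in ("410", "420", "430", "440")},
)
store_to_suspects(board400, ["440"])
```

Because the log travels with the module itself, a module pulled from the board for repair still carries the record of the failure.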

In this way, in the second embodiment of the present invention, the failure information is stored in the suspect board or suspect component that is suspected of causing the failure, which provides information useful for locating the failure when the memory card is repaired.

The component is not limited to a memory module. It may also be an additional processor, a grandchild board with optional functions, or the like.

FIG. 1 is an overall configuration diagram of the failure information storage system according to the first embodiment of the present invention.
FIG. 2 is an example of the failure suspect table 105 in the first embodiment of the present invention.
FIG. 3 is an example of a failure spanning two boards in the first embodiment of the present invention.
FIG. 4 is another example of a failure spanning two boards in the first embodiment of the present invention.
FIG. 5 is an overall configuration diagram of the failure information storage system according to the second embodiment of the present invention.
FIG. 6 is an example in which the board in the second embodiment of the present invention is configured with memory modules.
FIG. 7 is a flowchart of the failure information storage system according to the first embodiment of the present invention.

Explanation of symbols

100 Service processor
101 Diagnosis control processing unit
102 Failure detection reception processing unit
103 Failure content read processing unit
104 Failure type identification processing unit
105 Failure suspect table
106 Failure suspect identification processing unit
107 System failure log information storage unit
108 Failure log suspect selection processing unit
109 Failure information suspect storage processing unit
110 Diagnosis mode control processing unit
111 Diagnostic clock control processing unit
200 Board
201 Diagnosis control processing unit
202 Failure detection notification processing unit
203 Error register
204 Board failure log information storage unit
205 Failure information storage unit
208 Diagnosis mode control processing unit
209 Diagnostic clock control processing unit
305 Failure information storage unit
400 Board
410 Memory module
411 Failure information storage unit
420 Memory module
421 Failure information storage unit
430 Memory module
431 Failure information storage unit
440 Memory module
441 Failure information storage unit
501 Failure type identification number
502 Suspect target A
503 Suspect target B

Claims

A failure information storage system comprising a plurality of boards constituting a computer system and a service processor that handles failures occurring in the computer system, wherein
each of the plurality of boards comprises a board failure log information storage unit that stores failure logging data, which is data necessary for failure analysis, and a nonvolatile memory for storing failure information, and
the service processor comprises a failure content read processing unit that reads failure content from the board, a failure type identification processing unit that identifies the failure type of the failure content, a failure suspect identification processing unit that identifies, based on the failure type, a board on which a failure is suspected to have occurred as a suspect board, and a failure information suspect storage processing unit that reads the failure logging data corresponding to the suspect board from the board failure log information storage unit and stores it in the nonvolatile memory in the suspect board.

A service processor comprising: a failure content read processing unit that reads failure content from a board in which a failure has occurred among a plurality of boards constituting a computer system; a failure type identification processing unit that identifies the failure type of the failure content; a failure suspect identification processing unit that identifies, based on the failure type, a board on which a failure is suspected to have occurred as a suspect board; and a failure information suspect storage processing unit that reads the failure logging data corresponding to the suspect board from the board failure log information storage unit and stores it in the nonvolatile memory in the suspect board.

The service processor according to claim 2, wherein the failure suspect identification processing unit includes a failure suspect table in which a failure type identification number identifying the failure type is associated with the suspect board.

A failure information storage method comprising: a failure content read processing step of reading failure content from a board in which a failure has occurred among a plurality of boards constituting a computer system; a failure type identification processing step of identifying the failure type of the failure content; a failure suspect identification processing step of identifying, based on the failure type, a board on which a failure is suspected to have occurred as a suspect board; and a failure information suspect storage processing step of reading the failure logging data corresponding to the suspect board from the board failure log information storage unit and storing it in the nonvolatile memory in the suspect board.

The failure information storage method according to claim 4, wherein the failure suspect identification processing step includes a failure suspect table in which a failure type identification number identifying the failure type is associated with the suspect board.

A program that causes a computer to perform: a failure content read process of reading failure content from a board in which a failure has occurred among a plurality of boards constituting a computer system; a failure type identification process of identifying the failure type of the failure content; a failure suspect identification process of identifying, based on the failure type, a board on which a failure is suspected to have occurred as a suspect board; and a failure information suspect storage process of reading the failure logging data corresponding to the suspect board from the board failure log information storage unit and storing it in the nonvolatile memory in the suspect board.

The program according to claim 6, wherein the failure suspect identification process includes a failure suspect table in which a failure type identification number identifying the failure type is associated with the suspect board.

A failure information storage system comprising a computer system having a plurality of boards, including a board on which a component is mounted, and a service processor that handles failures occurring in the computer system, wherein
each board on which the component is mounted, among the plurality of boards, comprises a board failure log information storage unit that stores failure logging data, which is data necessary for failure analysis, a first nonvolatile memory for storing failure information, and a second nonvolatile memory, located in the component, that stores failure information relating to the component, and
the service processor comprises a failure content read processing unit that reads failure content from the board, a failure type identification processing unit that identifies the failure type of the failure content, a failure suspect identification processing unit that identifies, based on the failure type, a component on which a failure is suspected to have occurred as a suspect component, and a failure information suspect storage processing unit that reads the failure logging data corresponding to the suspect component from the board failure log information storage unit and stores it in the first nonvolatile memory in the suspect board and in the second nonvolatile memory in the suspect component.

A service processor in a computer system having a plurality of boards, including a board on which a component is mounted,
the service processor comprising: a failure content read processing unit that reads failure content from the board in which a failure has occurred; a failure type identification processing unit that identifies the failure type of the failure content; a failure suspect identification processing unit that identifies, based on the failure type, a component on which a failure is suspected to have occurred as a suspect component; and a failure information suspect storage processing unit that reads the failure logging data corresponding to the suspect component from the board failure log information storage unit and stores it in the nonvolatile memory in the suspect board and in the nonvolatile memory in the suspect component.

The service processor according to claim 9, wherein the failure suspect identification processing unit includes a failure suspect table in which a failure type identification number identifying the failure type is associated with the suspect board.

A failure information storage method in a computer system having a plurality of boards, including a board on which a component is mounted,
the method comprising: a failure content read processing step of reading failure content from the board in which a failure has occurred; a failure type identification processing step of identifying the failure type of the failure content; a failure suspect identification processing step of identifying, based on the failure type, a component on which a failure is suspected to have occurred as a suspect component; and a failure information suspect storage processing step of reading the failure logging data corresponding to the suspect component from the board failure log information storage unit and storing it in the nonvolatile memory in the suspect board and in the nonvolatile memory in the suspect component.

The failure information storage method according to claim 11, wherein the failure suspect identification processing step includes a failure suspect table in which a failure type identification number identifying the failure type is associated with the suspect board.

A program for a computer system having a plurality of boards, including a board on which a component is mounted,
the program causing a computer to perform: a failure content read process of reading failure content from the board in which a failure has occurred; a failure type identification process of identifying the failure type of the failure content; a failure suspect identification process of identifying, based on the failure type, a component on which a failure is suspected to have occurred as a suspect component; and a failure information suspect storage process of reading the failure logging data corresponding to the suspect component from the board failure log information storage unit and storing it in the nonvolatile memory in the suspect board and in the nonvolatile memory in the suspect component.

The program according to claim 13, wherein the failure suspect identification process includes a failure suspect table in which a failure type identification number identifying the failure type is associated with the suspect board.