JPH05158749A

JPH05158749A - Fault monitor control system

Info

Publication number: JPH05158749A
Application number: JP3349575A
Authority: JP
Inventors: Tsunemichi Shiozawa; 恒道塩澤; Shuji Miki; 修次三木; Michihiro Aoki; 道宏青木; Takanari Hoshiai; 隆成星合; Eiji Ishikawa; 英治石川
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1991-12-06
Filing date: 1991-12-06
Publication date: 1993-06-25

Abstract

PURPOSE: To keep the overall reliability of a fault monitoring control system from degrading as the number of processing units increases, by monitoring faults in all processing units of a multiprocessor with a single fault monitoring section. CONSTITUTION: The processing units 1-3 each set reset instruction information in their predetermined reset registers 61-63. When reset instruction information has been set in all of the registers 61-63, a monitoring timer 60 is reset and the reset instruction information set in the registers 61-63 is also reset. The timer 60 repeatedly adds 1 to its current value at a fixed cycle, and notifies an interrupt control unit 64 of an overflow when its value reaches or exceeds a predetermined value. The interrupt control unit 64 notifies the processing units 1-3 that a fault has occurred. The processors 10-30 of the units 1-3 store their states in their individual memories and then execute fault processing.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a fault monitoring control system that uses a monitoring timer to detect the occurrence of a fault in a multiprocessor system composed of a plurality of processing units and a fault monitoring unit.

[0002]

[Prior Art] FIG. 4 is a diagram for explaining a conventional fault monitoring control system in a multiprocessor composed of a plurality of processors. Reference numerals 1, 2 and 3 denote processing units; 4 denotes a system bus; 5 denotes the main memory of a memory unit; 10, 20 and 30 denote processors; 11, 21 and 31 denote individual memories that store information accessed individually by each processor; 12, 22 and 32 denote fault monitoring sections; 120, 220 and 320 denote monitoring timers whose values are incremented by 1 at a fixed cycle; and 121, 221 and 321 denote interrupt control units.

[0003] Reference numerals 13, 14, 15, 23, 24, 25, 33, 34 and 35 denote signal lines.

[0004] The conventional fault monitoring control system using monitoring timers is described below.

[0005] The processing executed by the processor 10 includes, built in beforehand, a step that resets the monitoring timer 120 (sets its current value to 0) via the signal line 13 before the monitoring timer 120 overflows (that is, before the timer value reaches or exceeds a predetermined value).

[0006] If no fault occurs in the main memory 5, the system bus 4 or the processor 10, the monitoring timer 120 is reset before it overflows, so no fault is detected.

[0007] Here, the case is described in which a fault occurs in the main memory 5 and it no longer responds to accesses from the processing unit 1.

[0008] When the processor 10 accesses the main memory 5 via the system bus 4, the processor 10 enters a wait state until it receives a response from the main memory 5 via the system bus 4, and execution of processing by the processor 10 is suspended.

[0009] If a fault occurs in the main memory 5 and no response is sent to the processor 10, the processor 10 does not reset the monitoring timer 120 via the signal line 13, so the monitoring timer 120 keeps repeating the operation of adding 1 to its current value at a fixed cycle.

[0010] When the value of the monitoring timer 120 overflows, the monitoring timer 120 notifies the interrupt control unit 121 of the overflow via the signal line 14.

[0011] The interrupt control unit 121 instructs the processor 10, via the signal line 15, to forcibly leave its wait state or to interrupt the processing in progress, and to execute predetermined fault processing.

[0012] The processor 10 stores its state (information on the processing it was executing when the interrupt occurred, information internal to the processor, and so on) in the individual memory 11, and then executes the fault processing.

[0013] The above describes the case in which a fault occurs in the main memory 5. Faults are monitored in the same way in other cases, for example when a fault in the system bus 4 causes an access request from the processor 10 to be lost so that no response can be received, or when an abnormality inside the processor 10 prevents the monitoring timer 120 from being reset.
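The conventional per-unit behaviour described in paragraphs [0005] to [0013] can be sketched as a small simulation. This is only an illustrative model, not circuitry from the patent: the class name `WatchdogTimer`, the function `simulate`, and the parameters `reset_every`, `overflow_threshold` and `stalls_after` are assumptions introduced here.

```python
class WatchdogTimer:
    """Model of one monitoring timer (120) in the conventional scheme of FIG. 4."""

    def __init__(self, overflow_threshold):
        # Hypothetical limit; the patent only says "a predetermined value".
        self.overflow_threshold = overflow_threshold
        self.value = 0

    def reset(self):
        """What a healthy processor does via signal line 13."""
        self.value = 0

    def tick(self):
        """Add 1 for one fixed cycle; return True once the timer has overflowed."""
        self.value += 1
        return self.value >= self.overflow_threshold


def simulate(ticks, stalls_after, reset_every=3, overflow_threshold=5):
    """Return the tick at which the overflow is signalled, or None if it never is.

    The processor resets the timer every `reset_every` ticks until it stalls
    (for example, while waiting on a main memory that never responds).
    """
    timer = WatchdogTimer(overflow_threshold)
    for t in range(1, ticks + 1):
        processor_running = t <= stalls_after
        if processor_running and t % reset_every == 0:
            timer.reset()      # reset arrives in time, no fault detected
        elif timer.tick():
            return t           # overflow: interrupt control unit 121 is notified
    return None
```

With these assumed parameters, `simulate(20, stalls_after=20)` never reports a fault, while a processor that stalls after tick 4 is reported once the timer has counted up to the threshold without being reset.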

[0014]

[Problems to be Solved by the Invention] When faults are monitored in a multiprocessor, the conventional method is to provide a fault monitoring section for each processing unit, as shown in FIG. 4. This method has the drawback that the entire hardware amount of the fault monitoring sections grows with the number of processors.

[0015] In addition, the fault monitoring section itself is not subject to fault monitoring. As the amount of fault monitoring hardware in the whole system grows, the probability of a fault that the fault monitoring sections cannot detect therefore increases, and the reliability of the whole system decreases.

[0016] An object of the present invention is to eliminate these conventional drawbacks by providing a fault monitoring control system in which no fault monitoring section is provided for each processing unit and a single fault monitoring section monitors faults in all processing units. This makes it possible to construct a multiprocessor system in which the fault monitoring hardware does not grow in proportion to the number of processing units and the reliability of the system does not decrease as processing units are added.

[0017]

[Means for Solving the Problems] To achieve the above object, the present invention provides a fault monitoring control system that uses a monitoring timer to detect the occurrence of a fault in a multiprocessor system composed of a plurality of processing units and a fault monitoring unit, characterized as follows. The system is composed of N processing units (N being a positive integer) and a fault monitoring unit. The fault monitoring unit comprises a monitoring timer, an interrupt control unit, and N reset registers corresponding one-to-one to the processing units. Each processing unit sets reset instruction information in its predetermined reset register. When reset instruction information has been set in the N reset registers, the monitoring timer is reset and the reset instruction information set in the N reset registers is also reset. When the monitoring timer reaches or exceeds a predetermined value, it notifies the interrupt control unit of an overflow, and the interrupt control unit notifies the processing units that a fault has occurred.

[0018]

[Operation] When faults are monitored in a multiprocessor, the present invention allows the fault monitoring section and the monitoring timer to be shared by a plurality of processors with only a small amount of additional hardware. A fault monitoring section no longer needs to be provided for each processing unit, so the hardware amount of the fault monitoring section can be greatly reduced.

[0019] Reducing the hardware amount of the fault monitoring section also resolves the drawback that the reliability of the whole system decreases as the number of processors increases.

[0020] Embodiments are described below with reference to the drawings.

[0021]

[Embodiment] FIG. 1 shows an embodiment for explaining the fault monitoring control system according to the present invention. Reference numerals 1, 2 and 3 denote processing units; 4 denotes a system bus; 5 denotes the main memory of a memory unit; 6 denotes a fault monitoring section; 10, 20 and 30 denote processors; 11, 21 and 31 denote individual memories; 60 denotes a monitoring timer whose value is incremented by 1 at a fixed cycle; 61, 62 and 63 denote reset registers; 64 denotes an interrupt control unit; 65, 66, 67 and 68 denote AND gates; and 69, 70 and 71 denote inverters.

[0022] Reference numerals 601, 602, 603, 604, 605, 606, 607, 608 and 609 denote signal lines.

[0023] Before the monitoring timer 60 overflows (that is, before the timer value reaches or exceeds a predetermined value), the processor 10 sets a reset instruction for the monitoring timer 60 (the value "1") in the reset register 61 via the signal line 601.

[0024] The processors 20 and 30 operate in the same way.

[0025] When the value "1" has been set in all of the reset registers 61, 62 and 63, the signal line 604, which is the output of the AND gate 65, becomes "1".

[0026] When the signal line 604 becomes "1", the monitoring timer 60 is reset (its current value is set to "0"), and the values of the reset registers 61, 62 and 63 are also reset (set to "0").
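The shared reset logic of paragraphs [0023] to [0026] can be modelled in a few lines. The following is a minimal sketch under stated assumptions: the class name `SharedFaultMonitor`, its method names and the `overflow_threshold` parameter are hypothetical, and the hardware's combinational behaviour is approximated by evaluating the AND condition each time a register is written.

```python
class SharedFaultMonitor:
    """Software model of fault monitoring section 6 (timer 60, registers 61-63)."""

    def __init__(self, num_units, overflow_threshold):
        self.reset_registers = [0] * num_units   # one 1-bit register per unit
        self.timer = 0                           # monitoring timer 60
        self.overflow_threshold = overflow_threshold  # assumed "predetermined value"

    def set_reset_instruction(self, unit):
        """A processing unit writes "1" into its own reset register."""
        self.reset_registers[unit] = 1
        # AND gate 65: signal line 604 goes to 1 only when every register is 1.
        if all(self.reset_registers):
            self.timer = 0
            self.reset_registers = [0] * len(self.reset_registers)

    def tick(self):
        """Add 1 for one fixed cycle; return True once the timer has overflowed."""
        self.timer += 1
        return self.timer >= self.overflow_threshold
```

In this model the timer is cleared only after every unit has reported in, so a single silent unit is enough to let the timer run on to overflow.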

[0027] If no fault occurs in the main memory 5, the system bus 4 or any of the processors 10, 20 and 30, and the reset instructions are set in the reset registers 61, 62 and 63, the monitoring timer 60 is reset before it overflows, so no fault is detected.

[0028] FIG. 2 shows, for the case of three processing units, the sequence of operations in which reset instructions are set in the reset registers and the monitoring timer 60 is reset.

[0029] Here, the case is described in which a fault occurs in the main memory 5 and it no longer responds to accesses from the processing unit 1.

[0030] When the processor 10 accesses the main memory 5 via the system bus 4, the processor 10 enters a wait state until it receives a response from the main memory 5 via the system bus 4, and execution of processing by the processor 10 is suspended.

[0031] If no fault has occurred in the processing being executed by the processors 20 and 30, the value "1" is set in the reset registers 62 and 63.

[0032] Because no reset instruction is set in the reset register 61, the monitoring timer 60 keeps repeating the operation of adding 1 to its current value at a fixed cycle.

[0033] When the value of the monitoring timer 60 overflows, the monitoring timer 60 notifies the interrupt control unit 64 of the overflow via the signal line 605.

[0034] When the interrupt control unit 64 sets the signal line 606 to "1", only the signal line 607, which is the output of the AND gate 66, becomes "1". This instructs the processor 10 to forcibly leave its wait state or to interrupt the processing in progress, and to execute predetermined fault processing.

[0035] The processor 10 stores its state (information on the processing it was executing when the interrupt occurred, information internal to the processor, and so on) in the individual memory 11, and then executes the fault processing.

[0036] FIG. 3 shows another embodiment for explaining the fault monitoring control system according to the present invention. The same reference numerals as in FIG. 1 denote the same parts. This embodiment shows the case in which an interrupt is generated for all processors when the monitoring timer overflows.

[0037] In this case, the processors 10, 20 and 30 are instructed, via the signal line 606, to forcibly leave their wait states or to interrupt the processing in progress, and to execute predetermined fault processing.

[0038] The processors 10, 20 and 30 store their states (information on the processing each processor was executing when the interrupt occurred, information internal to the processor, and so on) in the individual memories 11, 21 and 31, and then execute the fault processing.

[0039] In the fault processing, the processors check whether each of them is operating normally by communicating with one another and by referring to the reset registers.

[0040] This embodiment has two advantages. First, the only fault monitoring hardware added when the number of processors increases is a 1-bit register. Second, even if a hardware fault occurs in a processor and that processor can no longer accept interrupts, the other processors can carry out the fault processing.

[0041] As described above, in the present invention the hardware that the fault monitoring section must gain when the number of processors increases by one is a 1-bit register (plus one AND gate and one inverter when the interrupt is generated only for the processor that caused the fault).

[0042] The monitoring timer and the interrupt control unit account for a large proportion of the hardware needed to build a fault monitoring section.

[0043] With the present invention, the fault monitoring section and the monitoring timer can be shared by a plurality of processors with only a small amount of additional hardware, so the amount of hardware needed for fault monitoring in a multiprocessor can be greatly reduced.
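The hardware argument of paragraphs [0041] to [0043] can be written as a rough comparison. All symbols here are assumptions introduced for illustration and do not appear in the patent: $H_T$ and $H_I$ are the hardware amounts of one monitoring timer and one interrupt control unit, $h_R$ is one 1-bit reset register, and $h_G$ is one AND gate plus one inverter (taken as zero for the FIG. 3 embodiment). The growth of the N-input AND gate 65 is ignored.

```latex
\begin{align*}
H_{\mathrm{conv}}(N) &= N\,(H_T + H_I) \\
H_{\mathrm{prop}}(N) &\approx H_T + H_I + N\,(h_R + h_G) \\
H_{\mathrm{conv}}(N) - H_{\mathrm{prop}}(N) &\approx (N-1)\,(H_T + H_I) - N\,(h_R + h_G)
\end{align*}
```

Since the timer and interrupt control unit dominate the hardware of a fault monitoring section, $H_T + H_I$ is much larger than $h_R + h_G$, and under these assumptions the saving grows roughly in proportion to $N - 1$.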

[0044]

[Effects of the Invention] As described above, when faults are monitored in a multiprocessor, the present invention makes it unnecessary to provide a fault monitoring section for each processing unit, so the hardware amount of the fault monitoring section can be greatly reduced.

[0045] Reducing the hardware amount of the fault monitoring section also resolves the drawback that the reliability of the whole system decreases as the number of processors increases.

[Brief Description of the Drawings]

[FIG. 1] An embodiment of the present invention in which an interrupt raised by the fault monitoring timer is notified to a specific processor.

[FIG. 2] An example of the operation of resetting the monitoring timer.

[FIG. 3] Another embodiment of the present invention in which an interrupt raised by the fault monitoring timer is notified to all processors.

[FIG. 4] A diagram for explaining the prior art.

[Explanation of Symbols]

1, 2, 3: processing units
4: system bus
5: main memory
6: fault monitoring section
10, 20, 30: processors
11, 21, 31: individual memories
12, 22, 32: fault monitoring sections
120, 220, 320: monitoring timers
121, 221, 321: interrupt control units
60: monitoring timer
61, 62, 63: reset registers
64: interrupt control unit
65, 66, 67, 68: AND gates
69, 70, 71: inverters
13, 14, 15, 23, 24, 25, 33, 34, 35, 601, 602, 603, 604, 605, 606, 607, 608, 609: signal lines

Continuation of front page
(72) Inventor: Takanari Hoshiai, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo
(72) Inventor: Eiji Ishikawa, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo

Claims


[Claim 1] A fault monitoring control system for detecting, by means of a monitoring timer, the occurrence of a fault in a multiprocessor system composed of a plurality of processing units and a fault monitoring unit, characterized in that:
the system is composed of N processing units (N being a positive integer) and the fault monitoring unit;
the fault monitoring unit comprises a monitoring timer, an interrupt control unit, and N reset registers corresponding one-to-one to the processing units;
each processing unit sets reset instruction information in its predetermined reset register;
when reset instruction information has been set in the N reset registers, the monitoring timer is reset and the reset instruction information set in the N reset registers is reset;
the monitoring timer notifies the interrupt control unit of an overflow when its value reaches or exceeds a predetermined value; and
the interrupt control unit notifies the processing units that a fault has occurred.