JPH01200441A

JPH01200441A - Mutual monitoring method among plural processors

Info

Publication number: JPH01200441A
Application number: JP63023724A
Authority: JP
Inventors: Masaki Obara; 正樹小原
Original assignee: Fuji Electric Co Ltd
Current assignee: Fuji Electric Co Ltd
Priority date: 1988-02-05
Filing date: 1988-02-05
Publication date: 1989-08-11

Abstract

PURPOSE: To detect a processor abnormality quickly and without false detection by providing counters in a shared memory, changing their contents by clearing or incrementing, and evaluating those contents.

CONSTITUTION: A shared memory 1 and processors 3A-3N are connected to a common bus 4, and a bus arbitration device 2 is connected to the processors 3A-3N via a control signal line 5. The device 2 controls the processors 3A-3N so that they use the memory 1 in turn. The memory 1 contains one counter for each processor; each processor clears its own counter and increments the counters of the other processors. An abnormality is determined when a count value exceeds a fixed level. Thus, even when the number of processors increases, a processor abnormality can be detected quickly and without false detection simply by adding the corresponding counters and performing a check.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a processor monitoring method for detecting abnormalities, such as a runaway processor, in a multiprocessor system in which processors exchange data via a shared memory.

[Conventional technology]

Conventionally known methods of this type detect abnormalities in other processors by either of the following:

(1) The processors exchange specific data with one another via a shared memory.

(2) Each processor is provided with a monitoring timer (WDT: Watch Dog Timer), which generates an interrupt to the other processors when it times out.

[Problem to be solved by the invention]

However, these methods have the following problems.

(a) In the method in which processors exchange specific data via a shared memory, the partner processor is judged normal if the content of the shared memory has changed to the predetermined specific data, and abnormal if the content has not changed. If an abnormality were declared on a single check, a normal processor could be mistaken for an abnormal one. This false detection occurs when the partner processor is operating normally but the data is read and checked before the partner has changed the shared-memory data to the specific data. To avoid it, the check is repeated several times, and an abnormality is declared only when the data remains unchanged on consecutive checks. As a result, counting the rechecks requires a counter in each processor's private memory area and a program that evaluates the counter value. Moreover, the number of counters and the amount of software processing grow as the number of processors increases.

(b) In the method in which each processor is provided with a monitoring timer (WDT) that generates an interrupt to the other processors on timeout, the WDT is reset at the start of each program. So that the WDT does not time out while the processor is executing a single program, the timeout is normally set to 50 ms to [illegible] ms. As a result, too much time passes between the occurrence of an abnormality (for example, a runaway processor) and its detection, and the protective action against the abnormality is delayed. Furthermore, because the abnormality is detected by the WDT itself, a malfunction or failure of the WDT leads to false detection or to failure to detect.
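
The bookkeeping that drawback (a) describes can be illustrated with a minimal sketch. It is not taken from the patent: the class name `PeerChecker`, the value `EXPECTED`, and the limit `MAX_RETRIES` are hypothetical, chosen only to show that each monitoring processor needs one private retry counter per monitored peer.

```python
# Sketch of prior-art method (a). All names and values are hypothetical.

EXPECTED = 0xA5      # assumed "specific data" the peer is supposed to write
MAX_RETRIES = 3      # assumed number of consecutive misses before an alarm


class PeerChecker:
    """Retry state one processor keeps, in private memory, for one peer."""

    def __init__(self):
        self.misses = 0  # consecutive checks in which the data had not changed

    def check(self, shared_value):
        """Return True if the peer should be declared abnormal."""
        if shared_value == EXPECTED:
            self.misses = 0          # peer updated the data, so it is alive
            return False
        self.misses += 1             # data unchanged: count a miss, recheck later
        return self.misses >= MAX_RETRIES


checker = PeerChecker()
results = [checker.check(v) for v in (0x00, EXPECTED, 0x00, 0x00, 0x00)]
print(results)
```

With N processors, every processor would hold N-1 such objects in its own memory, which is the growth in counters and software processing that the text criticizes.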

Accordingly, an object of the present invention is to provide a processor monitoring method that can detect an abnormality in each processor quickly and without false detection.

[Means to solve the problem]

A counter area for each processor is allocated in a common memory used to exchange data among a plurality of processors. Each time it executes a predetermined process, or at fixed intervals, each processor clears its own counter area and increments the data in the counter areas of the other processors. Whether another processor is abnormal is determined from the magnitude of the data in that processor's counter area.
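
One monitoring pass as described above might look like the following minimal sketch. The function name `monitor_pass`, the list standing in for the shared counter areas, and the value of `THRESHOLD` are assumptions for illustration; the patent only speaks of "a certain value".

```python
# Hypothetical sketch of one monitoring pass; names and THRESHOLD are assumptions.
THRESHOLD = 5  # assumed limit; the source only says "a certain value"


def monitor_pass(counters, me):
    """Run one pass for processor `me` over the shared counter list.

    Clears the caller's own counter, increments every other counter, and
    returns the indices of processors whose counter reached THRESHOLD.
    """
    counters[me] = 0
    suspects = []
    for other in range(len(counters)):
        if other == me:
            continue
        counters[other] += 1
        if counters[other] >= THRESHOLD:
            suspects.append(other)
    return suspects


shared_counters = [0, 0, 0]        # one counter per processor, in "shared memory"
first = monitor_pass(shared_counters, me=0)
print(shared_counters, first)
```

The same counter serves both as the liveness signal and as the recheck count, so no per-peer state is needed outside the shared memory.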

[Operation]

The invention rests on two observations. First, by providing counters in the shared memory, changing them by clearing or incrementing, and evaluating their contents, it is possible to determine whether each processor is operating normally. Second, these same counters can take the place of the counters otherwise needed for repeated rechecking to prevent false detection. Accordingly, a counter corresponding to each processor is provided in the shared memory; each processor clears its own counter and increments the counters for the other processors, and a processor is judged abnormal when its counter value is at or above a certain value. In this way the processors detect one another's abnormalities.

[Embodiment]

FIG. 1 is a flowchart showing an embodiment of the present invention, FIG. 2 is a block diagram showing a system to which the invention is applied, and FIG. 3 is a schematic diagram showing the configuration of the shared memory.

First, FIG. 2 will be described.

A shared memory 1 and processors 3 (3A to 3N) are connected to a common bus 4, and a bus arbitration device 2 is connected to the processors 3 (3A to 3N) via a control signal line 5. The bus arbitration device 2 controls the processors 3 (3A to 3N) so that they use the shared memory 1 in an orderly manner.
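
Why orderly access matters for the counters can be sketched in software. In the snippet below a `threading.Lock` stands in for the bus arbitration device; this is an analogy assumed for illustration, not the hardware mechanism of the patent, and the names `bus_grant`, `shared_memory`, and `processor` are hypothetical.

```python
import threading

bus_grant = threading.Lock()   # stand-in for the bus arbitration device (assumption)
shared_memory = [0]            # a single counter word in "shared memory"


def processor(increments):
    """Increment the shared counter, holding the bus for each read-modify-write."""
    for _ in range(increments):
        with bus_grant:                    # wait for the bus, then own it exclusively
            value = shared_memory[0]       # read
            shared_memory[0] = value + 1   # modify and write back


threads = [threading.Thread(target=processor, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared_memory[0])
```

Because each read-modify-write completes before another processor is granted access, no increment is lost and the final count equals the total number of increments.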

Each processor 3 (3A to 3N) executes the program of FIG. 1 either continuously or at fixed intervals. That is, the processor first clears to zero its own counter in the shared memory 1 shown in FIG. 3. It then increments by 1 every counter for the other processors (for processor 3A, for example, the counters for processor 3B through processor 3N) and checks whether each counter value is at or above a certain value, thereby determining whether another processor is abnormal (see the corresponding steps in FIG. 1).

Because every processor operates in this way, each counter keeps being cleared as long as all processors are normal. If even one processor is abnormal, however, the counter for that processor is no longer cleared. The other processors keep incrementing it and checking its value, so as soon as the value exceeds the certain value, the processor that performed the increment immediately detects the abnormal processor.
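
The behaviour described above can be tried out with a small simulation. It is a sketch under stated assumptions: processors run the routine in a fixed round-robin order, a failed processor simply never runs it, and `THRESHOLD`, `simulate`, and the processor indices are hypothetical names and values not given in the patent.

```python
THRESHOLD = 6   # assumed; must exceed what a healthy counter reaches between clears


def simulate(n_processors, failed, rounds):
    """Round-robin simulation; returns (round, detector, suspect) of first detection."""
    counters = [0] * n_processors
    for rnd in range(1, rounds + 1):
        for me in range(n_processors):
            if me == failed:
                continue                 # a runaway processor never runs the routine
            counters[me] = 0             # clear own counter
            for other in range(n_processors):
                if other == me:
                    continue
                counters[other] += 1     # increment the other counters
                if counters[other] >= THRESHOLD:   # check against the limit
                    return rnd, me, other
    return None


print(simulate(n_processors=4, failed=2, rounds=10))
print(simulate(n_processors=4, failed=None, rounds=10))
```

Under these assumptions the first call reports that processor 3 detects processor 2 in round 2, while the second call, with no failed processor, never reports a detection because every counter is cleared before it can reach the limit.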

[Effect of the invention]

A counter area corresponding to each processor is provided in the shared memory; each processor clears its own counter and increments the counters for the other processors, and a processor is judged abnormal when its counter value is at or above a certain value. Consequently, no special fault-detection circuit and no memory other than the shared memory is required. Moreover, even when the number of processors increases, processor abnormalities can be detected quickly and without false detection simply by adding counters for the additional processors and performing a simple check.
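
How detection time and the choice of limit scale with the number of processors can be estimated with a short calculation. The patent gives no formula; the sketch below assumes an idealized model in which every healthy processor runs the routine exactly once per round, and the function names are hypothetical.

```python
import math


def rounds_to_detect(threshold, n_processors):
    """Rounds until a dead processor's counter reaches `threshold`.

    Assumes all of the other n-1 processors are healthy and each runs the
    routine exactly once per round.
    """
    healthy = n_processors - 1
    return math.ceil(threshold / healthy)


def min_safe_threshold(n_processors):
    """Smallest threshold a healthy counter never reaches in the same model.

    A healthy counter is incremented at most n-1 times between its own clears.
    """
    return n_processors


print(rounds_to_detect(6, 4), min_safe_threshold(4))
```

In this model, adding processors shortens detection time for a given limit but also raises the smallest limit that avoids false detection; real systems with uneven timing would need additional margin.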

[Brief description of the drawings]

FIG. 1 is a flowchart showing an embodiment of the present invention, FIG. 2 is a block diagram showing a system to which the invention is applied, and FIG. 3 is a schematic diagram showing the configuration of the shared memory.

Description of reference numerals:
1: shared memory
2: bus arbitration device
3 (3A to 3N): processors
4: common bus
5: control signal line

Agents: Patent attorney Akio Namiki; Patent attorney Kiyoshi Matsuzaki

Claims

[Claims]

A mutual monitoring method for a plurality of processors, characterized in that a counter area for each processor is allocated in a common memory used to exchange data among the plurality of processors; each processor, each time it executes a predetermined process or at fixed intervals, clears its own counter area and increments the data in the counter areas of the other processors; and an abnormality in another processor is determined from the magnitude of the data in the counter areas.